Open Data Philosophy
What open data means in the context of AI.
Open data in the context of artificial intelligence means more than publishing a dataset once a year. It means that every table, every field, every piece of information that a platform collects about its users is documented, accessible, and understandable — in real time, by the people it belongs to. Future AI was built around this principle from its first line of code. The schema documented on this page is not a marketing exercise. It is the actual database schema in production, updated whenever a structural change is made.
When people talk about AI transparency, they typically mean explainability — the ability to understand why a model produces a particular output. That matters. But there is a more fundamental layer of transparency that is almost universally ignored by AI platforms: data transparency. Before you can understand what an AI knows, you need to understand what data was used to build it, what data it continues to collect, and what happens to that data at every stage of the pipeline. This page answers all three questions in plain language.
The Future AI data pipeline works as follows. When a user sends a message in the workspace, that message is stored in a private messages table that only the user can read. The AI generates a response, which is stored alongside the original message. The server counts the words in the exchange. If the cumulative word count crosses the 10,000-word threshold, one Gold Coin is credited to the user's account and logged in the coin_transactions table. The question-and-answer pair is then anonymised — all user identifiers are removed — and stored in the knowledge_base table, which is readable by anyone.
This architecture was designed around the data-minimisation principle of the General Data Protection Regulation (Article 5(1)(c)): collect only what is necessary to deliver the service. For Future AI that means a profile with a display name and email address, a record of conversations and messages, a coin balance and transaction history, and a list of unlocked features. No location data. No behavioural fingerprinting. No advertising profiles. No data enrichment from third-party brokers.
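The full set of collected data fits in a schema this small. A minimal sketch, assuming SQLite; the five categories come from the list above, but every table and column name here is illustrative rather than the production schema:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- One table per category of data the page says is collected.
    CREATE TABLE profiles          (id INTEGER PRIMARY KEY,
                                    display_name TEXT, email TEXT);
    CREATE TABLE conversations     (id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE messages          (conversation_id INTEGER,
                                    question TEXT, answer TEXT);
    CREATE TABLE coin_transactions (user_id INTEGER, coins INTEGER,
                                    created_at TEXT);
    CREATE TABLE unlocked_features (user_id INTEGER, feature TEXT);
    -- Deliberately absent: location, device fingerprints, ad profiles,
    -- third-party enrichment data.
""")
```

The point of writing it out is that data minimisation is auditable: anyone can check that no table exists to hold the categories of data the platform promises not to collect.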
The open-data commitment extends to the knowledge base itself. Most AI companies treat their training data as a proprietary asset — a competitive moat that justifies keeping it secret. Future AI takes the opposite view. The knowledge base that the AI uses to answer questions was built from the intellectual contributions of real users. Those contributions should be readable by the community that made them. Making the knowledge base publicly accessible means that errors can be identified and corrected by anyone, not just by the engineers who built the system. It is a more robust and more trustworthy approach to AI knowledge management.
Every entry in the knowledge base is tagged with a topic, ranked by community confirmations, and searchable by keyword. The full knowledge base is available at /knowledge-base and is updated continuously. Researchers, journalists, and curious users are welcome to read it, cite it, and build on it. No API key required. No rate limits on public reads. That is what open data looks like in practice.
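Because the knowledge base is openly readable, anyone can replicate its topic filtering, keyword search, and confirmation ranking locally. A minimal sketch, assuming entries are plain records with topic, confirmations, question, and answer fields (the field names mirror the description above; the record shape is an assumption, not the actual export format):

```python
def search(entries, keyword=None, topic=None):
    """Filter knowledge-base entries by topic and keyword,
    then rank by community confirmations, highest first."""
    hits = [
        e for e in entries
        if (topic is None or e["topic"] == topic)
        and (keyword is None
             or keyword.lower() in (e["question"] + " " + e["answer"]).lower())
    ]
    return sorted(hits, key=lambda e: e["confirmations"], reverse=True)

# Hypothetical entries for illustration only.
entries = [
    {"topic": "privacy", "confirmations": 12,
     "question": "What is data minimisation?",
     "answer": "Collecting only what is necessary."},
    {"topic": "privacy", "confirmations": 3,
     "question": "What is anonymisation?",
     "answer": "Removing user identifiers."},
]

print(search(entries, keyword="minimisation")[0]["confirmations"])  # prints 12
```

Since public reads are unauthenticated and unmetered, the same logic works whether the entries come from a downloaded snapshot or are fetched live from /knowledge-base.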