When the Filter Fails: Why AI Tenant Isolation Cannot Be an Afterthought
I build AI systems for regulated businesses where making things up is not an option. My last two posts covered temperature (controlling how random AI outputs are) and grounding (making sure the system only answers from real data); both got a great response. This one keeps me up at night more than either of those.
If you are building AI on top of client data, the most natural architecture is to put everything in one place and use filters to keep clients apart. Client A queries their data, the filter ensures they only see their records, and the whole thing is simple, efficient, and cost-effective — right up until one bug, one misconfigured filter, or one edge case nobody tested means Client A is looking at Client B's confidential information. In a legal context that is an attorney-client privilege violation; in financial services it triggers reporting obligations, fines, and loss of client trust; in healthcare it is a HIPAA incident with mandatory disclosure and potential penalties. In any multi-client environment, it is a breach of the fundamental promise you made when clients gave you their data.
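To make the failure mode concrete, here is a minimal sketch of filter-based (logical) isolation, using SQLite and a hypothetical `documents` table with a `tenant_id` column; the client names and schema are illustrative, not from any real system. The correct query and the buggy one differ by a single `WHERE` clause:

```python
import sqlite3

# Hypothetical shared-database schema: every client's rows live in one
# table, separated only by a tenant_id column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (tenant_id TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO documents VALUES (?, ?)",
    [("client_a", "A merger agreement"), ("client_b", "B patient records")],
)

def fetch_documents(tenant_id: str) -> list[str]:
    # The filter IS the isolation: every query path must remember it.
    rows = conn.execute(
        "SELECT title FROM documents WHERE tenant_id = ?", (tenant_id,)
    )
    return [title for (title,) in rows]

def fetch_documents_buggy() -> list[str]:
    # One refactor that drops the WHERE clause, and every client's
    # data is in the result set.
    rows = conn.execute("SELECT title FROM documents")
    return [title for (title,) in rows]

print(fetch_documents("client_a"))  # only client A's documents
print(fetch_documents_buggy())      # both clients' documents
```

Both functions compile, both pass a quick smoke test with one tenant's data, and only one of them is safe; nothing in the type system or the database distinguishes them.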
Microsoft recently confirmed that Copilot had been reading emails marked as confidential for weeks, bypassing the data protection policies that were supposed to prevent exactly that — one code bug, weeks of exposure. The pattern is always the same: filters work in testing, they work in the demo, they work for the first six months, and then someone adds a new feature, refactors a query, or introduces a new data source, and the filter logic breaks in a way that nobody notices until it is too late. The issue is not that filters are poorly built; it is that they are a single layer of defence against a catastrophic outcome, and if that layer fails there is nothing behind it because the data is all in the same place.
The systems I build do not rely on filters for client isolation. Each client's data lives in its own separate database, and there is no filter to misconfigure because the data simply is not co-located. With logical separation — all client data in one database, kept apart by query-level filters — a single failure exposes your entire client base. With physical separation — each client in their own storage — something going wrong in one client's environment cannot affect any other client; the blast radius is one. Physical separation costs more to run: more databases, more infrastructure, more operational overhead. But the risk profile is fundamentally different. You are not relying on every query, every API call, and every edge case correctly applying a filter; you are relying on the fact that the data is not there to leak in the first place.
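The same query logic looks different under physical separation. In this sketch, separate in-memory SQLite databases stand in for per-client database instances; the routing function and tenant names are hypothetical. The point is that even a query with no filter at all can only ever reach one client's data:

```python
import sqlite3

# Hypothetical physical separation: one database per client, so there is
# no cross-tenant filter to forget. Separate in-memory databases stand in
# for separate database instances here.
def tenant_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (title TEXT)")
    return conn

dbs = {tenant: tenant_db() for tenant in ("client_a", "client_b")}
dbs["client_a"].execute("INSERT INTO documents VALUES ('A merger agreement')")
dbs["client_b"].execute("INSERT INTO documents VALUES ('B patient records')")

def fetch_documents(tenant_id: str) -> list[str]:
    # No WHERE clause, and none needed: the connection itself can only
    # see one client's store. A bug here has a blast radius of one.
    rows = dbs[tenant_id].execute("SELECT title FROM documents")
    return [title for (title,) in rows]

print(fetch_documents("client_a"))  # only client A's data, filter or not
```

Isolation is enforced at connection time, before any query runs, which is exactly the property that filter-based designs cannot offer.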
AI makes this problem worse. Traditional applications query structured data in predictable ways — SQL queries with WHERE clauses, API calls with tenant IDs — and these are well-understood patterns with decades of tooling around them. AI systems use vector databases, semantic search, retrieval-augmented generation, and context windows; the query patterns are less predictable and the data flows are more complex. A prompt that works perfectly for one client might, through semantic similarity, pull in documents from another client's vector store if the data is co-located. This is an architectural consequence of how retrieval-augmented generation works: if two clients' documents are embedded in the same vector space, the similarity search does not respect business boundaries unless you explicitly enforce them, and enforcement means filters, which brings us back to the original problem.
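The retrieval problem can be sketched with a toy co-located vector index; the two-dimensional embeddings and similarity scores below are illustrative stand-ins for a real embedding model, and the tenant names are hypothetical. Cosine similarity ranks documents by meaning alone, so a filter is the only thing standing between clients:

```python
import math

# Toy co-located vector store: all clients' document embeddings share one
# index. Vectors are illustrative, not output of a real embedding model.
store = [
    ("client_a", "A contract clause", (0.90, 0.10)),
    ("client_b", "B contract clause", (0.88, 0.12)),  # semantically close to A's
]

def cosine(u, v):
    # Standard cosine similarity between two 2-d vectors.
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def top_k(query_vec, k=2, tenant_filter=None):
    # Similarity search with an *optional* tenant filter: isolation only
    # happens if every caller remembers to pass it.
    candidates = [
        (cosine(query_vec, vec), tenant, title)
        for tenant, title, vec in store
        if tenant_filter is None or tenant == tenant_filter
    ]
    return sorted(candidates, reverse=True)[:k]

query = (0.90, 0.10)  # client A asks about contract clauses
# Without the filter, similarity alone ranks B's document right next to
# A's: the vector space has no notion of client boundaries.
print(top_k(query))
print(top_k(query, tenant_filter="client_a"))  # only A's documents
```

Note that `tenant_filter` defaults to off: one retrieval path that omits the argument, and semantically similar documents from another client flow straight into the context window.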
If you are evaluating AI tools that handle sensitive client data, there is one question that cuts through the marketing: does each client get their own separate data environment, or is everyone's data in one place with filters keeping it apart? If the answer is filters, ask what happens when one fails. Listen carefully to the response; if they talk about how robust their filters are, how well-tested, how many layers of protection they have, that tells you the data is co-located and they are relying on software logic to keep it apart. That might be acceptable for low-sensitivity data, but for legal documents, financial records, or healthcare information, it is not.
I am not suggesting that every application needs physical tenant isolation — for consumer products, collaboration tools, or low-sensitivity data, logical separation with well-tested filters is a reasonable trade-off. Where a cross-tenant leak is a regulatory incident, a privilege violation, or a front-page story, though, the economics change; the additional cost of physical isolation is insurance against a category of failure that no amount of filter testing can fully eliminate. I sleep a lot better knowing that a bug in my code cannot expose one client's data to another, and the reason is straightforward: the data is not there to leak.