Co-authored with Claude (Opus 4.6). I brought the thesis. Claude brought the structure. We argued about the middle until it was right.


Every major AI provider will tell you the same thing: “We don’t train on your data.”

Anthropic says it. OpenAI says it. Microsoft says it. At the enterprise API tier, with the right pricing plan, you get a contractual commitment that your prompts and responses won’t be fed back into model training. And I believe them — they almost certainly honor those commitments.

But honoring a commitment is not the same as being unable to violate it. It’s policy, not architecture. It’s a promise made by a company that can technically access everything you send it, choosing not to. And the distance between “chooses not to” and “cannot” is the most important gap in enterprise AI right now.

One is a promise. The other is a wall. Almost nobody has the wall.

The exposure most people aren’t thinking about

Meanwhile, the vast majority of actual AI usage in enterprises isn’t happening through governed API tiers at all. It’s employees pasting strategy documents into consumer-grade chat interfaces because they’re useful and nobody told them not to. The productivity gains are real. So is the exposure.

But even the organizations that have locked down their AI usage to enterprise tiers are missing something: the queries themselves are intelligence.

If a pharmaceutical company runs drug interaction simulations through a cloud LLM, the model provider doesn’t just see the data — they see what the company is investigating, what hypotheses they’re testing, what competitive angles they’re exploring. The metadata of inquiry reveals strategic intent as clearly as the documents themselves.

Agent traces make it worse. The full chain of reasoning, tool calls, and intermediate results that agentic AI systems generate — those expose entire workflow patterns and decision-making processes. You’re not just leaking a document. You’re leaking how you think.

And there’s a legal accelerant: emerging case law is calling into question whether attorney-client privilege survives when communications are processed through public cloud AI endpoints. If that line of reasoning holds — and it’s trending in that direction — it transforms data leakage from a competitive risk into a legal liability. A law firm using ChatGPT to analyze privileged client documents may be creating discoverable communications. The privilege may be waived by the act of sending the data through a third party’s infrastructure.

Extend that across HIPAA, FERPA, financial regulations, and classified government work. Every sector that handles protected data is exposed — not because the providers are malicious, but because the architecture allows access even when the policy says otherwise.

The on-prem swing

Chamath Palihapitiya laid this out on the All-In Podcast recently: enterprises will be forced to bring AI infrastructure back on-premises. The cost is real, but if AI delivers the productivity savings everyone expects, the math balances. You stop paying cloud providers to host your AI workloads, and you stop routing proprietary data through someone else’s infrastructure.

The concept is straightforward. High-powered inference servers running on-prem, built to handle your organization’s AI workload, with access distributed to users via policy. Attached to the model provider through a one-way pipeline — the latest model weights and capabilities flow down from the provider to your local infrastructure. Your prompts, your data, your agent traces never flow up. The provider licenses the model and the update feed. The org gets frontier-class capability. The data boundary is physical — nothing leaves the building.
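The one-way pipeline doesn’t have to rest on trust; it can rest on ordinary signed-artifact machinery. Here’s a minimal sketch of the receiving side, assuming the provider publishes weight files alongside a signed digest manifest. All names are hypothetical, and an HMAC stands in for a real public-key signature scheme:

```python
import hashlib
import hmac

# Hypothetical receiving end of a one-way update feed: the provider
# ships (weights, manifest digest, signature); the enterprise verifies
# locally. Nothing in this path sends prompts or telemetry upstream.

def verify_update(weights_blob: bytes, manifest_digest: str,
                  manifest_sig: bytes, provider_key: bytes) -> bool:
    """Accept a model update only if it matches the signed manifest.

    Verification is purely local: bytes in, bool out. There is no
    channel here for data to flow back to the provider.
    """
    # 1. Check the manifest signature (HMAC as a stand-in for a real
    #    signature such as Ed25519 over the manifest).
    expected_sig = hmac.new(provider_key, manifest_digest.encode(),
                            hashlib.sha256).digest()
    if not hmac.compare_digest(expected_sig, manifest_sig):
        return False
    # 2. Check the weights against the manifest digest.
    return hashlib.sha256(weights_blob).hexdigest() == manifest_digest


# Usage: pull (weights, digest, sig) over a download-only channel,
# verify, then load. Toy key material for illustration only.
key = b"provider-signing-key"
weights = b"...model weights..."
digest = hashlib.sha256(weights).hexdigest()
sig = hmac.new(key, digest.encode(), hashlib.sha256).digest()

assert verify_update(weights, digest, sig, key)          # genuine update
assert not verify_update(b"tampered", digest, sig, key)  # rejected
```

The design point is that every function in the update path takes data in and returns a verdict — the wall is that there is simply no code path that transmits anything.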

That’s a wall, not a promise. And Dell, HPE, and Nvidia’s enterprise GPU business are positioned for exactly this shift.

But here’s where the pure on-prem thesis misses something: the model providers can’t afford to let it happen unchecked.

The provider’s dilemma

Anthropic, OpenAI, Google, and Microsoft have spent billions building datacenter capacity on the assumption that enterprise AI consumption would flow through their metered endpoints forever. A mass on-prem migration breaks that business model. They’re not going to sit still.

The most likely adaptation is something I’d call isolated tenancy — and it could change the entire market.

Picture this: a provider allocates datacenter capacity to a fundamentally different architecture. Isolated, single-tenant environments where each enterprise customer gets a walled-off partition of compute. The enterprise gets access to the latest models, ongoing updates, the full power of the provider’s infrastructure — but with an architectural guarantee that the provider itself cannot access the data within each tenant’s partition. No prompt logging. No response metadata flowing back to training pipelines. No agent traces leaving the tenant boundary.

The cloud equivalent of a safe deposit box. The bank provides the vault and the security. But even the bank can’t open your box.

This would let providers preserve subscription revenue while giving enterprises the data isolation they need. It would make frontier AI accessible to school districts, regional hospitals, and smaller law firms that will never have the capital or expertise for full on-prem deployment. On paper, it’s the perfect middle ground.

To be fair, the providers have started moving in this direction. Microsoft’s Azure Confidential Computing, Azure OpenAI’s data residency options, and similar offerings from Google and AWS represent early steps toward isolated tenancy. But the current implementations stop well short of what’s described above — they offer stronger contractual guarantees and some hardware-level isolation, but they don’t solve the fundamental problem of provider access during inference. The gap between what exists today and what regulated enterprises actually need is where the real opportunity lives.

Except there’s one problem with closing that gap.

The hardest unsolved problem in enterprise AI

Making tenant isolation provable — not just contractual — may be the single biggest hurdle in this entire thesis.

The wish list looks clean on paper:

  • Hardware-level isolation — dedicated compute, not shared GPU pools with logical separation
  • Customer-held encryption keys — the provider never possesses keys to decrypt prompt or response data
  • Verifiable audit trails — third-party audit capability, not self-reported compliance
  • Legal structure limiting compelled disclosure — you can’t hand over what you don’t have

But there’s a fundamental tension in item two that the industry hasn’t solved: the model has to decrypt your data in memory to run inference on it. Customer-held encryption keys protect data at rest and in transit — but during the actual moment of inference, your prompt is plaintext in GPU memory on the provider’s hardware. Confidential computing with trusted execution environments (TEEs) exists, but current implementations have known side-channel vulnerabilities and significant performance overhead. Fully homomorphic encryption (FHE) — computing on encrypted data without ever decrypting it — is the theoretical endgame, but current FHE implementations are orders of magnitude too slow for LLM inference. This isn’t a gap that gets closed with a software update. It may be an inherent limitation of the architecture.
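The tension is easy to state in code. In this toy sketch, XOR stands in for a real cipher and every name is hypothetical — but the shape of the problem is accurate: the customer holds the key, yet the provider-side inference function still has to receive plaintext:

```python
# Toy illustration of the inference-time gap: even with customer-held
# keys, the provider's inference function sees plaintext in memory.
# XOR stands in for a real cipher; all names are hypothetical.

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Symmetric toy cipher: encrypting and decrypting are the same op."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def provider_inference(plaintext_prompt: bytes) -> bytes:
    """Provider-side 'model'. The crucial point is its signature: the
    argument is plaintext, so at this moment the provider's hardware
    can read it, log it, or be compelled to disclose it."""
    return b"echo: " + plaintext_prompt

# Customer side: encrypt with a key the provider never stores.
customer_key = b"customer-held-key"
prompt = b"privileged client document"
ciphertext = xor_cipher(prompt, customer_key)

# Provider side: to run inference, the ciphertext must be decrypted
# in memory first. The key may be used only transiently, but the
# plaintext exists on provider hardware — this is exactly the gap
# that TEEs try to shield and FHE tries to eliminate.
decrypted = xor_cipher(ciphertext, customer_key)
response = provider_inference(decrypted)
```

Customer-held keys move the decryption step around; they don’t remove it. That is why the wish list above is a hardware and cryptography problem, not a contract-drafting problem.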

Which brings us right back to where we started. “We don’t train on your data” is a contractual assurance, not a verifiable technical guarantee. Even isolated tenancy — the most ambitious proposed solution — can’t fully close the gap at inference time. You are trusting a company’s policy, not its architecture. Promise, not wall.

The provider that actually solves this — truly solves it, not just markets it — gains a massive competitive advantage in the regulated-enterprise market. But anyone telling you it’s solved today is selling you a promise dressed up as a wall.

The four-tier architecture

So what do you actually do? The practical answer isn’t a binary choice between cloud and on-prem. It’s a tiered architecture that routes workloads based on data sensitivity — and is honest about the limitations of each tier.

Tier 1 — Air-gapped local inference. The most sensitive stuff. Student records, patient data, attorney-client privileged communications, classified government work. Local LLMs on organization-controlled hardware with no external connectivity for inference. One-way model updates only. This is the only tier that provides a true wall. It’s also the most expensive and operationally demanding. Non-negotiable for organizations with statutory data protection obligations.

Tier 2 — Isolated tenancy. Sensitive workloads that benefit from frontier-class models, but where the organization can’t afford full on-prem. Provider-hosted compute in a dedicated partition with the strongest available isolation — hardware separation, customer-held keys, third-party audits. This tier is the most important and the least mature. It’s where the industry needs to go. It’s also where the inference-time decryption problem lives. Organizations using Tier 2 should understand they are accepting a residual risk that the provider could access data during processing, even if contractually prohibited from doing so.

Tier 3 — Private cloud with BAA. Semi-sensitive workloads where contractual data processing agreements provide sufficient protection. Azure OpenAI with business associate agreements. The provider can technically access the data but is contractually bound not to. This is where most governed enterprise AI usage lives today. It is entirely promise-based. For many workloads, that’s an acceptable risk. For protected data, it shouldn’t be.

Tier 4 — Public endpoints. General research, content drafting, code assistance on open-source projects. Standard cloud subscriptions. No proprietary data in the prompt.

The critical capability nobody has yet is the routing layer — the organizational intelligence that classifies data and directs it to the appropriate tier. That’s where policy meets architecture, and it’s where most organizations have zero infrastructure today.
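That routing layer can start small: a classifier that maps data-sensitivity labels onto the four tiers and fails closed rather than defaulting to a public endpoint. A minimal sketch — the tier names follow this article; the labels and policy table are hypothetical examples:

```python
from enum import IntEnum

class Tier(IntEnum):
    AIR_GAPPED = 1        # Tier 1: local inference, no external connectivity
    ISOLATED_TENANCY = 2  # Tier 2: dedicated provider partition
    PRIVATE_CLOUD = 3     # Tier 3: contractual protection (e.g. BAA)
    PUBLIC = 4            # Tier 4: no proprietary data in the prompt

# Hypothetical sensitivity labels -> strictest tier that label demands.
POLICY = {
    "phi": Tier.AIR_GAPPED,          # patient data (HIPAA)
    "privileged": Tier.AIR_GAPPED,   # attorney-client material
    "student": Tier.AIR_GAPPED,      # FERPA records
    "confidential": Tier.ISOLATED_TENANCY,
    "internal": Tier.PRIVATE_CLOUD,
    "public": Tier.PUBLIC,
}

def route(labels: set[str]) -> Tier:
    """Route a workload to the most restrictive tier any label demands.

    Unknown labels fail closed to Tier 1: misclassified data should
    land behind the wall, not on a public endpoint.
    """
    tiers = [POLICY.get(label, Tier.AIR_GAPPED) for label in labels]
    return min(tiers, default=Tier.PUBLIC)  # lower number = stricter

assert route({"internal"}) == Tier.PRIVATE_CLOUD
assert route({"internal", "phi"}) == Tier.AIR_GAPPED  # strictest label wins
assert route({"unlabeled-data"}) == Tier.AIR_GAPPED   # unknown fails closed
```

The hard part, of course, is not this function — it’s producing trustworthy labels in the first place. But even a crude policy table like this is more routing infrastructure than most organizations have today.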

Where this leaves us

The gap between promises and walls is where the real work in enterprise AI is happening — whether most organizations realize it or not.

Tier 1 requires capital, expertise, and operational maturity that most organizations don’t have. Tier 2 requires the providers to solve an inference-time isolation problem that may not have a clean solution. And in the meantime, the entire industry is running on Tier 3 at best — contractual promises from providers who can technically see everything you send them, asking you to trust that they won’t look.

That’s not a reason to stop using AI. The productivity gains are real and the organizations that refuse to adopt will fall behind. But it is a reason to start building toward the tiers that offer real architectural guarantees instead of pretending that a contractual promise is the same thing as a wall.

The organizations that start building toward Tier 1 and Tier 2 now — even imperfectly — will have a structural advantage over those that wake up after the compliance audit or the data breach forces their hand. And the provider that actually solves provably isolated tenancy gains a massive advantage in the regulated-enterprise market that’s currently paralyzed between “we need AI” and “we can’t expose our data.”

Start by knowing what tier you’re on. Most of you are on Tier 3 or Tier 4 and don’t realize it. That’s the first wall to build — not a technical one, but an awareness one.

Exit code: 0.


Source material: Analysis synthesized from the All-In Podcast (Chamath Palihapitiya on enterprise data leakage through public AI endpoints) and Dan Ives’ analysis of derivative AI infrastructure beneficiaries. The tiered architecture and promises-vs-walls framework are original analysis.