On June 13, 2026, TechCrunch reported that Anthropic pulled its newly launched Fable 5 and Mythos 5 models worldwide after a US government directive barred access by foreign nationals - including its own non-US-citizen employees. Rather than enforce that restriction per user, Anthropic suspended access to the models broadly. India, named by both Anthropic and OpenAI as their second-largest market, spent the weekend re-litigating whether it can build products on AI that sits outside local control.
The geopolitics are above my pay grade. The stack lesson is not, and it applies whether you ship from Bangalore, Berlin, or Boston.
Problem
The newest model from a leading vendor behaves like infrastructure: you call an endpoint, you get tokens back, you build a product on top. Except it’s someone else’s infrastructure—you don’t control it. Access can disappear not because of a technical outage, but because of a vendor policy change, a new regulation, or a geopolitical decision. Most teams don’t price that scenario into their risk model. That’s exactly what happened: access to the newest generation of models started depending on nationality and a directive issued on a Friday.
If your content workflow, your support automation, or your product’s core feature assumes the latest model from one vendor, you’ve taken a dependency with a kill switch you don’t own and can’t predict.
Context
This matters more the deeper the model sits in your stack. Three rough tiers:
- Cosmetic - AI drafts a description you always edit. Lose access, you lose a convenience. Swap vendors in an afternoon.
- Load-bearing - AI runs a blocking quality gate or generates content you publish with light review. Lose access, your pipeline stalls until you re-tune prompts against a different model.
- Structural - your product is the model output. A cutoff isn’t a degradation, it’s an outage.
The same suspension is a shrug at the first tier and an existential event at the third. Most teams never name which tier they’re in - they just integrate whatever model scored best on a benchmark last quarter and move on.
There’s a second axis the India debate surfaces: the newest commercial model versus a good enough one. Sridhar Vembu of Zoho responded to the suspension by arguing for smaller and open-source models. That’s not patriotism, it’s architecture. A model you can run, fine-tune, and host yourself trades peak capability for the one property a suspended API can’t offer: you decide when it turns off.
That trade isn’t free, and self-hosting isn’t sovereignty for nothing. You take on GPU and inference cost, ops burden, your own update and security cadence, and license terms that may carry their own restrictions. The risk doesn’t vanish - it moves from the model vendor to you as the infrastructure operator. The question is which failure mode you’d rather own: one you can’t predict or influence, or one you can budget and staff for.
Decision
Before wiring any model into a workflow, answer two questions explicitly.
How load-bearing is this? If the honest answer is “structural,” one hosted vendor is a single point of failure dressed up as a feature. Treat it like you’d treat a payment processor - have the migration path scoped before you need it, not after.
Does this task need the newest commercial model? Most production AI work - classification, extraction, reformatting, gate-checking - does not need the absolute newest model. It needs a good enough, stable one. The newest tier is exactly the tier that just got pulled. Pinning your pipeline to “always the latest from vendor X” maximises both capability and exposure; pinning to a stable, swappable baseline gives up a little of the first to remove most of the second.
The portable version is more than a thin interface. Yes, wrap the vendor SDK behind one boundary instead of sprawling it through your codebase, and keep prompts in version control. But portability only makes migration cheaper. Redundancy requires a second provider or deployment that is already provisioned, authenticated, within quota, and reachable through a tested routing path. Self-hosting goes further by moving availability into infrastructure you operate, along with its cost and failure modes.
A fallback only counts if you can prove it: maintain an eval set with explicit quality thresholds, run the second model against it on a schedule, and exercise the production switch before an incident. “Swap in a day” is a claim you earn with a green pipeline and a tested failover path before the primary disappears, not a property the abstraction grants for free.
Consequences
You trade some capability and convenience for more control over how the system fails and recovers. Concretely:
- A slightly worse model that you control beats a slightly better one that controls you - for anything load-bearing. For cosmetic uses, keep chasing the newest models; the blast radius is bounded - you lose a convenience, not a pipeline.
- “Vendor-agnostic” stops being a buzzword and becomes three testable properties: can you migrate without rewriting the product, fail over without improvising credentials and routing, and operate the fallback within its quality and capacity limits?
- Open-weight and self-hosted models move up the shortlist - not because they’re better, but because the cost of not understanding your dependencies just got an unignorable price tag.
The proposals floating around India after the suspension - former Infosys exec Mohandas Pai’s call for a ₹500 billion annual fund, louder backing for sovereign foundation models - are the nation-state version of the same move. They remain proposals, not policy. India’s actual commitment is the ₹103.72 billion IndiaAI Mission. The version you can act on this week is smaller and entirely within reach: know which tier each AI dependency sits in, and never let a structural one ride on an access grant somebody else can revoke on a Friday.