The choice between using a closed-API model and self-hosting an open-weight model is one of the most consequential decisions in modern AI deployment. Discussions of the choice usually centre on cost, latency, and capability. The security and governance implications are at least as important and often less examined. Both options have real advantages; both have real risks. The right choice depends on threat model, regulatory environment, and operational maturity.
What follows is a direct comparison.
What "closed API" means
Closed APIs in 2026 include OpenAI (GPT-4o, GPT-5, o3, o4-mini families), Anthropic (Claude 3.5 / 4 family), Google (Gemini 1.5, 2.0, 2.5), and a smaller set of others (Cohere, Mistral closed APIs, AWS Bedrock-hosted variants, Azure OpenAI). The model weights are not provided to the customer; inference happens on the provider’s infrastructure; the customer interacts only through API calls.
The provider handles training, updating, scaling, infrastructure security, and most operational concerns. The customer handles prompt design, API integration, and usage governance.
What "open-weight" means
Open-weight models in 2026 include Meta’s Llama 4 family, Mistral’s open releases, DeepSeek-V3 and R1, Qwen 2.5, Microsoft’s Phi-4, and a long tail of smaller releases. The weights are downloadable; inference can run on the customer’s infrastructure (cloud GPU, on-prem, edge device); the customer is responsible for everything from runtime to scaling to security.
The licensing varies. Some are genuinely open (Apache 2.0, MIT). Some carry restrictions (Llama’s community licence, with its acceptable use policy and monthly-active-user threshold; use-based restrictions in some DeepSeek model licences). The "open" label has marketing latitude; reading the licence matters.
Security advantages of closed APIs
Several real benefits:
The provider runs the security operation. Frontier-model labs invest substantially in protecting the deployed model from extraction, in monitoring for abuse, in maintaining infrastructure security. Customers inherit those investments without operational cost.
Updates are managed. New model versions, security patches, and safety improvements are deployed by the provider. Customers do not need to manage model lifecycle.
Capability ceiling. The frontier closed models are, on average, more capable than the best open-weight models. The gap has narrowed substantially through 2024-2025 but persists.
Built-in safety mitigations. Closed APIs include safety filtering, content classifiers, and use-policy enforcement. The implementations are not perfect (jailbreaks happen) but they are professional and continuously improved.
Regulatory compliance handled at the provider layer. SOC 2 attestation, data-residency options, GDPR data-processing addenda, and similar are typically available from the major providers. Self-hosters do this work themselves.
Auditability. The provider’s logs and behaviour are auditable through their compliance documentation; the customer’s logs of API calls are auditable through their own infrastructure.
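The customer-side half of that auditability can be sketched as a structured log of every API call. The following is a minimal sketch, not a prescribed schema: the field names and the functions `audit_record` and `append_audit_line` are assumptions made for illustration, and the choice to store digests rather than raw text is one option among several.

```python
import hashlib
import json
import time
from typing import Any


def audit_record(user_id: str, model: str, prompt: str, response: str) -> dict[str, Any]:
    """Build one audit-log entry for an API call (hypothetical schema)."""
    return {
        "ts": time.time(),
        "user_id": user_id,
        "model": model,
        # Store digests rather than raw text so the log itself holds no sensitive content.
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "response_sha256": hashlib.sha256(response.encode("utf-8")).hexdigest(),
        "prompt_chars": len(prompt),
        "response_chars": len(response),
    }


def append_audit_line(path: str, record: dict[str, Any]) -> None:
    """Append the record as one JSON line, a format most log pipelines can ingest."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
```

Hashing keeps the audit trail verifiable (a disputed prompt can be matched against its digest) without turning the log into a second copy of the sensitive data.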
Security disadvantages of closed APIs
Real risks on the other side:
Data flow to a third party. Every API call sends potentially sensitive information to the provider. The contractual protections (no training on customer data by default, encryption in transit, data-residency commitments) are credible at the major providers, but the data flow is real.
Vendor lock-in. Migrating to a different provider means re-validating prompts, re-evaluating outputs, and accepting different behavioural quirks. Multi-vendor architectures help; they are operationally costly.
Provider can change behaviour without notice. Capability changes between model versions, deprecated endpoints, modified safety filters all happen. The customer does not have full control.
Service availability depends on the provider. Outages happen; rate limits apply; provider business decisions can constrain access.
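Attack surface includes the provider. A breach at OpenAI, Anthropic, or Google could expose customer data flowing through their systems. The March 2023 ChatGPT incident, in which a bug in an open-source Redis client library exposed some users’ chat titles and limited payment details, showed that the risk is real even without an external attacker; periodic concerns over API authentication point the same way.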
Compliance limits. Some industries (defence, certain healthcare contexts, certain government applications) cannot use commercial cloud APIs at all. Closed APIs are typically not options for these.
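Several of the risks above (lock-in, unannounced behaviour changes, outages) are usually mitigated with a thin abstraction over providers. The sketch below assumes each provider is wrapped in a callable taking `(model, prompt)`; the `Backend` type, the model identifier strings, and `complete_with_failover` are hypothetical names, not any vendor’s SDK.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Backend:
    name: str                        # label used in logs, e.g. "primary"
    model: str                       # pinned, dated model identifier (hypothetical value)
    call: Callable[[str, str], str]  # (model, prompt) -> completion text


class AllBackendsFailed(RuntimeError):
    """Raised when no configured backend returned a completion."""


def complete_with_failover(prompt: str, backends: Sequence[Backend]) -> tuple[str, str]:
    """Try each backend in order; return (backend name, completion text)."""
    errors = []
    for backend in backends:
        try:
            return backend.name, backend.call(backend.model, prompt)
        except Exception as exc:  # a real deployment would catch the SDK's specific errors
            errors.append(f"{backend.name}: {exc}")
    raise AllBackendsFailed("; ".join(errors))
```

Pinning a dated model identifier per backend limits silent behaviour changes, and the ordered list makes failover explicit. It does not remove the re-validation cost: prompts still have to be evaluated against every backend in the list.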
Security advantages of open-weight models
Self-hosting open-weight models has clear benefits:
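Data stays in the customer’s environment. No data flows to a model provider; sensitive content never leaves the customer’s infrastructure boundary. When inference runs on rented cloud GPUs, that boundary includes the cloud provider, so the guarantee is strongest on-prem or air-gapped.
Full control over the model. Versions are pinned by the customer; updates happen on the customer’s schedule; the publisher cannot retire a model whose weights the customer already holds.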
Auditability of the model itself. With weights in hand, the customer can perform their own evaluations, red-team exercises, and behavioural analyses.
No dependency on provider availability. Inference scales as the customer scales infrastructure; rate limits are self-imposed.
Customisation through fine-tuning. The customer can fine-tune the model on their own data, integrate domain-specific behaviours, and produce a derivative model with specific properties.
Compliance options that closed APIs cannot provide. Air-gapped deployments, classified-network deployments, and similar are possible.
Security disadvantages of open-weight models
The list is substantial:
The customer assumes the entire security burden: protecting weights from theft, securing the inference infrastructure, monitoring for abuse, and managing the model lifecycle.
Safety mitigations are weaker. The closed-API providers invest heavily in alignment training, content filtering, and abuse prevention. Open-weight models inherit some of this through their base training but do not have the live filtering layer. Any runtime safety filtering has to be built, deployed, and maintained by the customer.
Capability gap. The best open-weight model in 2026 is competitive with mid-tier closed models from 2023-2024. The frontier remains closed-API. For some use cases this matters; for many it does not.
Operational complexity. Running large-model inference at scale is not trivial. GPU infrastructure, batching, quantisation, performance tuning, autoscaling, and observability all require expertise.
The supply chain. The customer is responsible for verifying the integrity of the downloaded model. Publisher-provided integrity data helps (Hugging Face repositories, for example, expose SHA-256 hashes for large files and support GPG-signed commits), but only if the customer actually checks it. Weight formats matter too: pickle-based checkpoints can execute arbitrary code when loaded, a risk the safetensors format is designed to avoid.
Backdoor risk. Open-weight models could carry deliberately introduced backdoors that activate on specific inputs. Research on this is ongoing; documented cases are rare, but the risk cannot be ruled out. Closed APIs carry a similar risk in principle, but the providers have a stronger incentive to detect and prevent it.
Drift in evaluation. As the model is fine-tuned and adapted, its behaviour drifts. The customer is responsible for re-evaluating; capabilities can degrade in unexpected ways.
Licence compliance. Open-weight licences carry terms: output-attribution requirements, downstream-distribution requirements, restrictions on certain uses. Compliance is the customer’s responsibility.
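The supply-chain check mentioned above can be sketched with nothing beyond the standard library. This is a minimal illustration that assumes the customer keeps a pinned manifest mapping file names to SHA-256 digests obtained from the publisher through a trusted channel; `verify_weights` and the manifest layout are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte weight files never sit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_weights(model_dir: Path, expected: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means every pinned file matched."""
    problems = []
    for rel_name, want in expected.items():
        file_path = model_dir / rel_name
        if not file_path.is_file():
            problems.append(f"missing: {rel_name}")
        elif sha256_of_file(file_path) != want.lower():
            problems.append(f"hash mismatch: {rel_name}")
    return problems
```

Running the check at download time and again at every deployment also catches tampering with weights at rest. A hash match proves only that the files are the ones the publisher released, not that the released model is free of backdoors.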
A pragmatic decision framework
Use closed APIs when:
The use case does not involve sensitive data in user content.
The customer values capability over operational control.
Provider compliance posture meets regulatory requirements (most cases).
The team is small and lacks ML infrastructure expertise.
The application can tolerate API-level dependency and the contractual protections of major providers.
Use self-hosted open-weight models when:
Data residency requires it (regulated industries, government, classified contexts).
Deep customisation through fine-tuning is core to the application.
The application can run on smaller models that match open-weight capability.
The team has infrastructure expertise to operate inference reliably.
Cost considerations at very high volume favour self-hosting (the break-even point is moving as both API prices and inference efficiency improve).
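The break-even point in the last item is simple arithmetic once the inputs are known. The sketch below assumes a cost model with a fixed monthly self-hosting cost (hardware, staff) plus a per-token marginal cost, compared against a flat API price per million tokens; the function name and every figure in the usage note are illustrative assumptions, not market data.

```python
def break_even_mtok_per_month(
    fixed_monthly_cost: float,
    api_price_per_mtok: float,
    self_host_marginal_per_mtok: float,
) -> float:
    """Monthly volume, in millions of tokens, above which self-hosting is cheaper.

    Solves: volume * api_price = fixed + volume * marginal.
    """
    saving_per_mtok = api_price_per_mtok - self_host_marginal_per_mtok
    if saving_per_mtok <= 0:
        # Self-hosting never catches up if each token costs at least as much as the API.
        return float("inf")
    return fixed_monthly_cost / saving_per_mtok
```

With made-up inputs of 20,000 per month in fixed costs, an API price of 5.00 per million tokens, and a self-hosted marginal cost of 1.00 per million tokens, `break_even_mtok_per_month(20_000, 5.0, 1.0)` returns `5000.0`, that is, five billion tokens a month. Falling API prices raise the threshold and inference-efficiency gains lower it.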
A hybrid pattern is increasingly common: route most queries to a closed API for capability, route sensitive-data queries to a self-hosted model for compliance, and gate the routing decision on data classification. This is the pragmatic compromise.
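The routing gate in this hybrid pattern can be sketched as follows. The regular expressions are deliberately crude stand-ins for a real data-classification service, and `Route`, `looks_sensitive`, `choose_route`, and the label values are all hypothetical names chosen for illustration.

```python
import re
from enum import Enum
from typing import Optional


class Route(Enum):
    CLOSED_API = "closed_api"
    SELF_HOSTED = "self_hosted"


# Illustrative patterns only; a production classifier would be far more thorough.
_SENSITIVE_PATTERNS = (
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),                    # US SSN-shaped numbers
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),             # email addresses
    re.compile(r"\b(?:confidential|patient|diagnosis)\b", re.IGNORECASE),
)


def looks_sensitive(text: str) -> bool:
    return any(p.search(text) for p in _SENSITIVE_PATTERNS)


def choose_route(text: str, label: Optional[str] = None) -> Route:
    """Pick a backend; `label` is an optional upstream classification tag."""
    # Fail closed: any label other than "public" keeps the query in-house,
    # and so does any content that trips the pattern check.
    if label not in (None, "public"):
        return Route.SELF_HOSTED
    if looks_sensitive(text):
        return Route.SELF_HOSTED
    return Route.CLOSED_API
```

The design choice that matters is the direction of failure: a misclassified public query costs some capability, while a misclassified sensitive query leaks data, so ambiguous cases default to the self-hosted route.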
The longer trend
Open-weight capability is closing the gap with closed-API capability faster than most observers predicted in 2022. DeepSeek-R1’s release in early 2025 was a landmark: credible reasoning capability in an openly licensed, open-weight model. The trend is likely to continue.
Closed APIs will retain advantages in absolute frontier capability, integrated safety, and operational simplicity. Open-weight models will retain advantages in control, compliance, and customisation.
The right choice is application-specific. The wrong choice is making this decision purely on capability benchmarks while ignoring the security and governance dimensions. NIST’s AI Risk Management Framework, the EU AI Act and its compliance guidance, and the OWASP Top 10 for LLM Applications all address the issue. Reading them before procurement is the right path.
The deepest framing: this is not "open vs closed" as a moral choice; it is a build-vs-buy decision applied to AI infrastructure. The same engineering trade-offs that have always defined that decision apply, with the additional factor that the asset being managed is more complex and more consequential than most software.
