February 23, 2026
I built Cumbersome, an iOS and Mac AI client that talks to APIs from the leading providers all day long. I see the good, the bad, and the ugly from direct providers as well as the newer multi-providers like Vercel AI Gateway and OpenRouter. I recently adjusted the "recommended" provider in the app Settings. It says it all.
Cumbersome gives you a lot of options for adding direct AI API keys, as well as "multi-providers." After a long stretch in the trenches, we now solidly recommend that folks use OpenRouter.
Two days ago I published a balanced comparison of Vercel AI Gateway and OpenRouter. I recommended choosing based on which providers you needed ZDR for and how much you spent. After spending the weekend going deeper with both gateways (and reflecting on months of building against every major provider's raw API), I have changed my mind. I now think most people managing their own API keys should route everything through OpenRouter instead of juggling direct provider keys.
The 5.5% platform fee is worth what you get.
Why Use a Multi-Provider Gateway at All
If you are reading this, you probably already manage your own API keys for OpenAI, Anthropic, or Google AI Studio. You know the advantages over subscriptions: pay-per-use pricing, model control, no subscription trap.
But here is what I have noticed after months of daily use. The AI model landscape is a rotating cast. Last month GPT-5.2 was my default for most tasks. This month Claude Sonnet 4.6 handles certain work better. Kimi K2.5 showed up recently and it handles high-volume workloads at a fraction of the cost. The model comparison I wrote a few weeks ago is already partially outdated because new models keep shipping.
Models are becoming commodities. SOTA changes week to week. If you are like me, you are constantly swapping between providers as new models drop and benchmarks shift. That means managing separate API keys for each provider, separate credit balances, separate usage dashboards, and separate billing cycles.
A multi-provider gateway gives you one API key that covers all of them. You get a single balance, a single dashboard, and a single place to manage spending. Cumbersome already makes it easy to switch between providers mid-conversation, but using a gateway underneath means you configure one key and access everything.
Why OpenRouter Specifically
I have used both OpenRouter and Vercel AI Gateway. Both are legitimate gateways. But OpenRouter wins on the features that matter in daily use:
- Zero Data Retention that actually covers OpenAI. No enterprise agreement required.
- Per-key spending limits with daily reset. Cap your exposure if a key leaks or a bug runs away.
- Guardrails and provider restrictions. Control which models and providers each key can access.
- Unified web search across providers. One integration, every model family (with caveats).
- Standardized thinking and reasoning. Extended thinking works across OpenAI and Anthropic through a single code path.
- One dashboard for all spending. Every provider, every model, one view.
Zero Data Retention That Actually Covers OpenAI
This is the big one.
OpenRouter offers Zero Data Retention across a broad set of endpoints, including OpenAI. Vercel AI Gateway's ZDR list does not include OpenAI. If you use GPT-5.2 and want ZDR, OpenRouter is the only gateway option.
Stored Data Is Discoverable Data
Most providers say they will not train on your API data and will not store it beyond some retention period (often 30 days, sometimes longer). That sounds reasonable until you think about what "stored for 30 days" actually means in practice.
In December 2025, a federal judge ordered OpenAI to hand over 20 million ChatGPT chat logs to the New York Times in a copyright lawsuit. OpenAI fought to keep them secret and lost. The logs existed because OpenAI stores them. "We delete after 30 days" does not protect you when a court orders preservation before those 30 days are up. Stored data is discoverable data, regardless of what the privacy policy promises.
Zero Data Retention is fundamentally different. ZDR means no storage beyond the brief in-memory caching needed to process your request. There is nothing to subpoena because nothing was ever saved. It is not "we will delete it soon." It is "we never stored it."
You could get ZDR from providers directly, but that usually requires an enterprise agreement with legal review and enough API volume to justify the special treatment. Consumer apps and direct API usage without that deal still mean 30 days or more of retention. Enterprise ZDR agreements require the kind of leverage that individual developers and small teams do not have. OpenRouter negotiated these agreements on your behalf. You enable ZDR once at the account level and every request routes only to compliant endpoints. In Cumbersome, you set your OpenRouter key, enable ZDR in your OpenRouter account, and it just works.
OpenRouter's ZDR toggle. Enable it once at the account level and every request from Cumbersome routes to zero data retention endpoints only. No enterprise agreement required.
Your Prompts Are Not as Private as You Think
"What do you have to hide?" is the wrong question. The risk is not just about what you are doing with AI today. It is about what happens to stored data tomorrow. Police are already obtaining warrants for reverse keyword searches, asking Google to reveal everyone who searched for specific terms in a given time window. Courts are upholding this practice. It is not a stretch to imagine the same approach applied to AI prompts. You searched for information about a topic that later became part of an investigation. Your "deleted" data turns out to have been preserved on a backup somewhere. Now you are explaining yourself.
And it is not just law enforcement. Courts are ruling that AI prompts are discoverable in litigation and that conversations with consumer AI tools are not protected by attorney-client privilege. A federal judge in the Southern District of New York held this month that a defendant's AI-generated documents were neither privileged nor work product, in part because the AI provider's privacy policy reserved the right to collect inputs and share data with third parties. If the provider stores your prompts, those prompts can be subpoenaed, discovered, and used against you.
Beyond legal exposure, stored data is a target. Breaches happen. Provider databases get compromised. Data ends up on the dark web where adversarial actors can profile you, craft targeted scams, impersonate you, or use your own words to social-engineer access to your accounts. The less data that exists about your AI usage, the smaller your attack surface. ZDR does not just protect your privacy from the provider. It eliminates an entire category of risk.
Per-Key Spending Limits with Daily Reset
If you have ever worried about an API key leaking (or a client app going haywire and burning through credits), OpenRouter has a practical answer. When you create an API key, you can set a credit limit that resets on a schedule: daily, weekly, or monthly.
Set a $5 daily limit. If the key gets compromised or a bug sends a runaway loop of requests, the damage caps at $5 before the key stops working. Next day, it resets and you are back to normal.
Per-key spending limits with daily reset. If a key leaks, the damage is capped.
Guardrails go further. You can restrict specific keys to specific models, set budget caps, and control which providers a key can access. This matters when you are working with expensive models. Some reasoning models cost over $100 per million output tokens. An accidental loop against one of those gets expensive fast. Guardrails put a ceiling on it.
Guardrails restrict which models a key can access, set budget caps, and control routing. A firewall for your AI spending.
Lock Down Your Providers
Guardrails also let you lock down which providers handle your requests. OpenRouter does not always call the model owner directly. It routes through third-party providers: some are gold standard (Azure, Google Vertex, Amazon Bedrock) with SOC 2 and trust centers; others are smaller and harder to verify. ZDR only means something if the provider on the other end actually follows through. With open-source models (DeepSeek, Qwen, etc.), anyone can host. Commercial models go through vetted providers; open-weight models can be served by anyone. Lock down your provider list. OpenRouter lets you do this at the account level under Provider Restrictions, or per guardrail. I selected only US-based, large, SOC 2-compliant providers with published trust centers.
My provider allowlist on OpenRouter.
This list is separate from ZDR. The ZDR toggle routes to zero-retention endpoints regardless. When I use models that do not offer ZDR (some open-source or newer releases), I still want trusted providers. This allowlist does that. Feel free to crib from it. SOC 2 can feel like process theater, but I want providers that have documented controls and something to lose if they cut corners. Net effect: requests go to US providers with demonstrated data-handling practices, and lower latency if you are stateside.
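If you would rather pin providers per request instead of relying only on the account-level setting, OpenRouter's chat completions endpoint accepts a provider preferences object in the request body. Here is a minimal TypeScript sketch; the field names (only, allow_fallbacks) follow OpenRouter's provider routing docs at the time of writing, so verify against the current API reference before relying on them:

```typescript
// Hypothetical helper: attach a provider allowlist to an OpenRouter
// chat-completions request body. Field names follow OpenRouter's
// provider-routing docs; check the current reference.
type ChatBody = {
  model: string;
  messages: { role: string; content: string }[];
  provider?: { only: string[]; allow_fallbacks: boolean };
};

function withProviderAllowlist(body: ChatBody, providers: string[]): ChatBody {
  return {
    ...body,
    // Restrict routing to these providers, and fail rather than fall
    // back to an unvetted host if none of them can serve the model.
    provider: { only: providers, allow_fallbacks: false },
  };
}

const allowlisted = withProviderAllowlist(
  {
    model: "deepseek/deepseek-chat",
    messages: [{ role: "user", content: "Hello" }],
  },
  ["azure", "google-vertex", "amazon-bedrock"],
);
// `allowlisted` is then POSTed to /api/v1/chat/completions as usual,
// with your OpenRouter key in the Authorization header.
```

The account-level restriction is still the safer default, since it applies even when a client forgets to set the per-request preference.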
How ZDR Routes Through Providers
For top commercial models like GPT-5.2 and Claude Sonnet 4.6, ZDR through OpenRouter is often not directly through OpenAI or Anthropic themselves. OpenRouter does not appear to have established direct ZDR agreements with those companies. Instead, ZDR routing for these models typically goes through Azure or Google Vertex, where OpenRouter has ZDR agreements in place. This is another reason locking down your provider list matters: you want to make sure ZDR requests land on providers that actually have those agreements, not on a third party that might be serving the model without one.
Web Search That Works Across Providers (With a Big Caveat)
This is a developer concern that translates into a user benefit, but the user experience has real limitations you should know about.
The good part.
AI providers all implement web search differently. OpenAI has their own tool calling format. Anthropic has theirs. Google has Grounding. Each requires different code paths, different parameter handling, and different response parsing. OpenRouter standardized this with their web search plugin. It works consistently across model families. I added support for it in Cumbersome and it worked on the first try. When I tried implementing Perplexity web search through Vercel AI Gateway, I had to revert it because of compatibility issues with different provider APIs.
OpenRouter also gives you two search engine options. For OpenAI, Anthropic, Perplexity, and xAI models, it can use their native built-in search. For everything else (or by your choice), it uses Exa, an independent search engine that combines keyword and embedding-based search. This separation of concerns is genuinely useful. It means models like Kimi K2.5, which have no native web search, get search capabilities through OpenRouter. And you can force Exa even on models that have native search if you want a different perspective on the results.
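As a concrete sketch, here is roughly what a request body with the web plugin attached looks like. The plugin shape (id: "web", engine, max_results) follows OpenRouter's web search docs at the time of writing, and the model slug is illustrative; check the current reference before copying this:

```typescript
// Sketch of an OpenRouter chat-completions body with the web search
// plugin attached. Field names follow OpenRouter's web search docs;
// verify against the current API reference.
const requestWithSearch = {
  model: "moonshotai/kimi-k2.5", // illustrative slug; Kimi has no native search
  messages: [
    { role: "user", content: "What changed in the EU AI Act this month?" },
  ],
  plugins: [
    {
      id: "web",
      engine: "exa",  // force Exa even on models with native search
      max_results: 5, // cap injected results to limit token cost
    },
  ],
};
```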
The not-so-good part.
The way OpenRouter implements search has a fundamental design problem that affects daily use. Whether you choose native search or Exa, OpenRouter forces a search on every request. The model never decides whether a search is needed. A preprocessor runs before the model sees your message, searches based on your most recent text, injects the results into the context, and then hands everything to the model. Every single time.
This defeats the purpose of native search for models that have it. GPT-5.2, Claude, Gemini: these models can decide on their own when a web search would help and when it would not. When you use them through their own APIs, the model sees your message first and only triggers a search if the question warrants one. Through OpenRouter, that judgment is stripped away. The preprocessor searches regardless, which means you are paying for search tokens on every request, including ones where the model would have known search was unnecessary. Ask "tell me a story" and OpenRouter searches the web for "tell me a story" before the model even sees your prompt.
This causes three real problems:
- Search always fires, even when it should not. If you are mid-conversation and type a follow-up like "tell me more" or "crazy story," OpenRouter searches the web for those exact words. It does not consider the conversation context. It does not know you are referring to something discussed three messages ago.
- The model gets flooded with irrelevant data. Because the search results are injected before the model processes anything, the AI has to integrate a pile of web results that may have nothing to do with what you actually asked. This confuses the model and degrades response quality.
- It slows everything down. The web search runs on every request, adding latency even when you do not need fresh information from the web. There is no way for the model to skip the search step.
Here is a concrete example from Cumbersome. I asked GPT-5.2 (via OpenRouter) to summarize the plot of the TV show 56 Days. With search enabled, it nailed it. Then I followed up with "crazy story" (meaning the plot I just read about). In the first screenshot, search is off for the follow-up. The AI correctly continues the conversation about 56 Days, calling it a "body found, who did it" setup with identity deception.
Search off for the follow-up. The AI stays on topic and discusses the 56 Days plot.
In the second screenshot, I used Cumbersome's "replay from here" to test search-on from the same conversation point. OpenRouter's preprocessor sees "crazy story," searches the web for those words, and injects results about King Von's rap single "Crazy Story." The model dutifully summarizes the song instead of continuing the conversation.
Search on for the follow-up. The preprocessor searches for "crazy story" out of context, finds a rap song, and the AI runs with it. The conversation about 56 Days is gone.
This is not a rare edge case. It happens any time a follow-up message is short or ambiguous. The search preprocessor has no awareness of conversation history. It treats every message as a standalone query.
The fix in Cumbersome.
I built a feature called "Only Search When Requested" that solves this. When enabled, web search only activates when your message explicitly mentions "search" or "crawl." Follow-up messages like "tell me more" or "crazy story" go straight to the model without triggering a web search. You get the full power of OpenRouter's unified search when you want it, and clean model responses when you do not.
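Cumbersome's actual implementation is not public, but the idea is simple enough to sketch. This hypothetical shouldSearch helper gates the web plugin on explicit trigger words, so ambiguous follow-ups never fire a search:

```typescript
// Minimal sketch of keyword-triggered search, in the spirit of
// "Only Search When Requested". The trigger words and helper names
// are illustrative, not Cumbersome's real code.
const SEARCH_TRIGGERS = ["search", "crawl"];

function shouldSearch(message: string): boolean {
  const lower = message.toLowerCase();
  return SEARCH_TRIGGERS.some((t) => lower.includes(t));
}

function buildBody(model: string, message: string) {
  return {
    model,
    messages: [{ role: "user", content: message }],
    // Attach the web plugin only when the user explicitly asked for it.
    ...(shouldSearch(message) ? { plugins: [{ id: "web" }] } : {}),
  };
}
```

With this gate in place, "crazy story" goes straight to the model, while "search for 56 Days reviews" gets the full web plugin treatment.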
"Only Search When Requested" in Cumbersome. Web search stays available but only fires when you explicitly ask for it. No more King Von interrupting your TV show conversations.
OpenAI and Anthropic are already agentic: their native tool-calling search lets the model decide when a web search adds value. Oddly, OpenRouter seems to bypass that. It either forces Exa when you want native search, or forces native providers to search on every request instead of letting the model choose. Until OpenRouter's plugin approach supports true agentic search, keyword triggering in Cumbersome is a practical workaround that eliminates the worst failure mode.
This is still one of OpenRouter's stronger features relative to managing provider APIs yourself. The unified search interface saved me weeks of integration work. And with Cumbersome's keyword triggering on top, you get the benefits without the noise.
Thinking and Reasoning Across Providers
API standardization is where OpenRouter consistently delivers. Extended thinking (sometimes called "reasoning") is another case where every provider does things differently. OpenAI's o3, Anthropic's Claude with extended thinking, and other reasoning models each have different APIs for how they stream thinking content.
OpenRouter standardized this too. I added reasoning support for OpenRouter in Cumbersome and it works across OpenAI and Anthropic reasoning models through a single code path. Supporting each provider's native thinking API separately is significantly more complex, and it means users wait longer for new reasoning models to be supported.
With OpenRouter handling the API differences, Cumbersome users get thinking and reasoning support across more models, faster.
The rough edges.
OpenRouter's reasoning abstraction works great until you try to turn it off. We hit a bug where Claude Opus 4.6 kept returning <thinking> tags even with AI Reasoning toggled off. We were sending reasoning: { effort: "none" }, which disables reasoning for OpenAI models. It does nothing for Anthropic.

The issue: effort is only defined for OpenAI and Grok. Anthropic models need reasoning: { enabled: false }, which OpenRouter maps to the native thinking: { type: "disabled" }. The docs do not make this clear. We now detect the anthropic/ prefix and send the right payload for each model family.

If your "off" signal is being ignored, check that you are using enabled: false for Anthropic models and effort: "none" for everything else. The unified API is convenient until it silently fails.
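Putting that fix into code, a small helper that picks the right "reasoning off" payload per model family might look like this. This is a sketch based on our findings, not OpenRouter's official guidance; the function name and type are hypothetical:

```typescript
// Hypothetical helper: build the "reasoning off" payload for the
// OpenRouter reasoning parameter, per model family.
type ReasoningPayload = { effort: string } | { enabled: boolean };

function reasoningOff(model: string): ReasoningPayload {
  if (model.startsWith("anthropic/")) {
    // Anthropic models ignore `effort`; they need an explicit disable,
    // which OpenRouter maps to the native thinking: { type: "disabled" }.
    return { enabled: false };
  }
  // `effort` is defined for OpenAI- and Grok-style models.
  return { effort: "none" };
}
```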
One Dashboard for All Spending
Instead of checking OpenAI's usage page, then Anthropic's, then Google's, you see everything in one place: spend by model, request counts, and token usage over time.
All spending, requests, and token usage across every provider in one view. This month: $3.59 across Kimi K2.5, GPT-5.2, and Claude Sonnet.
The Cost: 5.5% Platform Fee
OpenRouter charges a 5.5% service fee on top of provider token prices. That is transparent and visible when you purchase credits.
$100 in credits costs $105.50. The 5.5% is the price of ZDR, spending limits, guardrails, and a unified API.
For context: if you spend $10/month on AI (typical for many API key users), the OpenRouter fee is 55 cents. Less than a dollar for ZDR, spending limits, and a consolidated dashboard. At $100/month, it is $5.50.
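The arithmetic is simple enough to write down, since the fee is a flat percentage on top of what you put in:

```typescript
// OpenRouter's platform fee as described above: 5.5% on top of
// provider token prices when you purchase credits.
const FEE_RATE = 0.055;

function creditCost(providerSpend: number): number {
  return providerSpend * (1 + FEE_RATE);
}

// $10 of provider spend costs $10.55; $100 costs $105.50.
```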
Vercel AI Gateway is cheaper at roughly 3% in payment processing fees, but it does not cover OpenAI for ZDR, lacks spending limits and guardrails, and the API standardization is not as mature for features like web search and reasoning.
Is 5.5% nothing? No. For someone spending $500/month, that is $27.50. But consider what you get: ZDR that would otherwise require an enterprise contract, automatic spending caps that protect you from runaway costs, and an API layer that handles the growing complexity of multi-provider integration. For most individual users and small teams, the math works.
Why You Can Trust Me on This
I built Cumbersome, an iOS and Mac app that connects directly to AI provider APIs. Every day I am in the trenches integrating with OpenAI, Anthropic, Google AI Studio, OpenRouter, and Vercel AI Gateway. I debug streaming responses, implement new model features, and test across providers. It is my full-time job to understand how these APIs actually work in practice.
I have no relationship with OpenRouter. They do not pay me. I do not get a referral fee. Two days ago I published a post that recommended choosing between Vercel and OpenRouter based on your needs. After more hands-on time, I am updating that recommendation because the evidence convinced me.
OpenRouter is the better default gateway for most people. The broader ZDR coverage, spending limits, guardrails, and API standardization are worth the 5.5%.
How to Set It Up in Cumbersome
- Create an OpenRouter account at openrouter.ai.
- Purchase credits or start with their free tier.
- Create an API key with a spending limit. I recommend a daily cap as a safety net.
- Enable ZDR in your privacy settings if you want zero data retention.
- Add the key in Cumbersome under Settings. It sits alongside your direct provider keys.
That is it. One key, hundreds of models, ZDR, and spending limits. You can still keep your direct OpenAI and Anthropic keys configured for comparison or for the rare case where you want the absolute cheapest token cost. But the OpenRouter key covers everything.
The Bottom Line
The case for using your own API keys instead of AI subscriptions has not changed. Pay per use, pick your model, keep your data private.
What has changed is my recommendation for how to manage those keys. Instead of juggling separate keys for every provider, route everything through OpenRouter. The 5.5% fee buys you zero data retention (including on OpenAI models), per-key spending limits that reset daily, guardrails that cap your exposure to expensive models, and a unified API that handles the messy differences between providers.
For the handful of cases where you need the absolute cheapest token cost and nothing else matters, keep a direct key. For everything else, OpenRouter is the better default.
Try It
Cumbersome is free for iPhone, iPad, and Mac. Add your OpenRouter key and you have access to hundreds of models with one key. Enable ZDR for privacy. Set spending limits for peace of mind. You pay the providers (plus 5.5%), not us.
Bless up! 🙏✨