009 - BYOK via Vercel AI Gateway

Status

Accepted (v1)

Date

2026-04-21

Context

Organization plan tiers cap monthly AI token usage. Some orgs need to exceed those caps (high-volume workloads) or bypass them entirely (enterprise customers paying providers directly). At the same time, the platform must keep full usage visibility for support, abuse detection, and future billing changes.

The AI module already routes all model traffic through the Vercel AI Gateway. The gateway natively supports Bring Your Own Key in two shapes:

  1. providerOptions.gateway.byok — forwards a customer-supplied credential to the underlying provider (Anthropic, OpenAI, Azure, Vertex, Bedrock).
  2. createGateway({ apiKey }) — swaps the gateway API key itself, used when the customer has their own Vercel AI Gateway team.

Three design axes were evaluated:

  • Integration layer: gateway-native BYOK vs per-provider SDK clients.
  • Binding granularity: one key per agent vs one key per (agent, provider).
  • Fallback behavior: fail-hard vs silent fallback to the system key.

Decision

Integration layer — Gateway-native BYOK

Use providerOptions.gateway.byok for Anthropic, OpenAI, Azure, Vertex, and Bedrock. Use createGateway({ apiKey }) for the vercel-ai-gateway provider slug where the customer supplies their own gateway key.

Rejected: per-provider SDK clients (one AnthropicClient, one OpenAIClient, etc.). That path would require re-implementing the model abstraction the AI SDK already gives us, and would drift out of sync with the gateway's model registry.
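The split between the two gateway mechanisms can be sketched as a small planning function. This is a minimal illustration, not the module's actual code: StoredKey, GatewayPlan, and planGatewayCall are hypothetical names, and the shape of the byok payload (a list of { apiKey } entries per provider slug) is an assumption that should be checked against the gateway documentation. Azure, Vertex, and Bedrock credentials usually carry more fields than a single key.

```typescript
// Hypothetical types: none of these names come from the codebase.
type DirectProvider = 'anthropic' | 'openai' | 'azure' | 'vertex' | 'bedrock';
type ProviderSlug = DirectProvider | 'vercel-ai-gateway';

interface StoredKey {
  id: string;
  provider: ProviderSlug;
  secret: string; // decrypted credential; assumed here to be a single API key string
}

type GatewayPlan =
  | {
      kind: 'provider-byok';
      // Assumed payload shape for providerOptions.gateway.byok.
      providerOptions: { gateway: { byok: Record<string, Array<{ apiKey: string }>> } };
    }
  | { kind: 'gateway-key'; gatewayApiKey: string };

// Decide which of the two BYOK mechanisms a stored key maps to.
function planGatewayCall(key: StoredKey): GatewayPlan {
  if (key.provider === 'vercel-ai-gateway') {
    // The customer's own gateway team: the caller would pass this value
    // to createGateway({ apiKey }).
    return { kind: 'gateway-key', gatewayApiKey: key.secret };
  }
  // A provider credential: forwarded per request through providerOptions.
  return {
    kind: 'provider-byok',
    providerOptions: {
      gateway: { byok: { [key.provider]: [{ apiKey: key.secret }] } },
    },
  };
}
```

A caller would branch on kind: build a dedicated gateway instance for 'gateway-key', or merge providerOptions into the request for 'provider-byok'.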

Binding granularity — One key per agent

Each agent has at most one bound apiKeyId. If an agent is reconfigured to a model whose provider does not match the provider of the bound key, the bind is rejected until the operator picks a compatible key or switches to a vercel-ai-gateway key.

Rejected: one key per (agent, provider) binding. Provider changes for an agent are rare, and the extra UX bloat (a binding matrix per agent) does not justify the flexibility in v1.
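The compatibility rule for the single-key binding can be sketched as follows. The names ApiKeyRef, providerOf, and canBind are hypothetical, and the sketch assumes model identifiers use a "provider/model" form whose prefix equals the key's provider slug. That is a simplification; a real mapping from model to provider may need a lookup table.

```typescript
// Hypothetical types and helpers, for illustration only.
type ProviderSlug =
  | 'anthropic' | 'openai' | 'azure' | 'vertex' | 'bedrock' | 'vercel-ai-gateway';

interface ApiKeyRef {
  id: string;
  provider: ProviderSlug;
}

type BindResult = { ok: true } | { ok: false; reason: string };

// Assumption: model ids look like "provider/model-name".
function providerOf(modelId: string): string {
  const slash = modelId.indexOf('/');
  return slash === -1 ? modelId : modelId.slice(0, slash);
}

// One key per agent: the key must match the model's provider,
// unless it is a gateway-level key, which is compatible with any model.
function canBind(modelId: string, key: ApiKeyRef): BindResult {
  if (key.provider === 'vercel-ai-gateway') return { ok: true };
  const needed = providerOf(modelId);
  if (needed === key.provider) return { ok: true };
  return {
    ok: false,
    reason: `key ${key.id} is for ${key.provider}, but the model requires ${needed}`,
  };
}
```

Running the same check when an agent's model changes, not only when a key is first bound, would cover the reconfiguration case described above.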

Fallback behavior — Fail-hard

Requests built for a BYOK-bound agent pass gateway.only pointing at the customer credential. If the credential is invalid, expired, or rate-limited, the request fails. The system never silently falls back to the platform's system key.

Rejected: silent fallback to the system key. That would charge the platform for traffic the customer intended to pay for, and would mask credential issues until the plan cap was breached.
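The fail-hard rule amounts to never wrapping the BYOK call in a fallback path. A minimal sketch, with hypothetical names (Agent, dispatch, viaByok, viaSystem) and with the actual gateway call abstracted behind callbacks:

```typescript
// Hypothetical shape: an agent with an optional bound key id.
interface Agent {
  id: string;
  apiKeyId?: string;
}

// Route a request for an agent. T may be a plain value or a Promise;
// in both cases a failure on the BYOK path reaches the caller unchanged.
function dispatch<T>(
  agent: Agent,
  viaByok: (apiKeyId: string) => T,
  viaSystem: () => T,
): T {
  if (agent.apiKeyId !== undefined) {
    // Deliberately no try/catch and no retry with viaSystem():
    // an invalid, expired, or rate-limited credential fails the request.
    return viaByok(agent.apiKeyId);
  }
  // Only agents without a bound key use the platform's system key.
  return viaSystem();
}
```

The design choice is that the routing decision depends only on whether a key is bound, never on whether the BYOK call succeeded.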

Usage tracking — Source differentiator

Every TokenUsageLog includes source in {'system', 'byok'} and, for BYOK entries, the apiKeyId used. The org_token_usage_summary aggregate keys on (orgId, month, source). Plan-limit enforcement filters on source='system', so BYOK usage never counts against the plan cap, yet it remains visible in usage dashboards as a separate breakdown.
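A sketch of how the source differentiator could drive both the aggregate and the cap check. The field names tokens and month (as a 'YYYY-MM' string), the in-memory Map standing in for org_token_usage_summary, and the function names are assumptions made for illustration.

```typescript
type UsageSource = 'system' | 'byok';

// Assumed log shape: only source, apiKeyId, orgId, and month are named in this ADR.
interface TokenUsageLog {
  orgId: string;
  month: string; // assumed 'YYYY-MM'
  source: UsageSource;
  tokens: number;
  apiKeyId?: string; // present for BYOK entries
}

// Aggregate on (orgId, month, source), mirroring the summary key.
function summarize(logs: TokenUsageLog[]): Map<string, number> {
  const summary = new Map<string, number>();
  for (const log of logs) {
    const key = `${log.orgId}|${log.month}|${log.source}`;
    summary.set(key, (summary.get(key) ?? 0) + log.tokens);
  }
  return summary;
}

// Plan-limit enforcement reads only the 'system' bucket,
// so BYOK tokens never count against the cap.
function hasReachedPlanCap(
  summary: Map<string, number>,
  orgId: string,
  month: string,
  cap: number,
): boolean {
  const systemTokens = summary.get(`${orgId}|${month}|system`) ?? 0;
  return systemTokens >= cap;
}
```

Dashboards can read the 'byok' bucket from the same summary to show the separate breakdown.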

Consequences

Positive

  • Orgs can exceed or bypass plan token caps by supplying their own provider credentials.
  • Full usage tracking is preserved — BYOK is visible, auditable, and reportable exactly like system usage.
  • No silent fallback means credential issues surface immediately and platform credits cannot be accidentally consumed by BYOK traffic.
  • Gateway-native BYOK keeps the AI SDK abstraction intact; adding a new provider is a config change rather than a new client implementation.

Negative

  • Deleting a bound key requires unbinding it from each agent first (AI.API_KEY_IN_USE returns the blocking agentIds).
  • In-place credential rotation is not supported in v1 — operators must delete and recreate. This was accepted because rotation events are rare and the delete-recreate flow is well-defined.
  • Master-key rotation for the AES-256-GCM wrapping key is not tooled in v1.
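The delete guard from the first bullet above can be sketched as a pure check. Only the AI.API_KEY_IN_USE code and the agentIds field are taken from this ADR; the Agent shape, the result type, and the function name are hypothetical.

```typescript
// Hypothetical shape: an agent with an optional bound key id.
interface Agent {
  id: string;
  apiKeyId?: string;
}

type DeleteCheck =
  | { ok: true }
  | { ok: false; code: 'AI.API_KEY_IN_USE'; agentIds: string[] };

// A key can be deleted only when no agent is still bound to it.
function checkKeyDeletable(apiKeyId: string, agents: Agent[]): DeleteCheck {
  const agentIds = agents
    .filter((agent) => agent.apiKeyId === apiKeyId)
    .map((agent) => agent.id);
  if (agentIds.length === 0) return { ok: true };
  // Report every blocking agent so the operator can unbind them in one pass.
  return { ok: false, code: 'AI.API_KEY_IN_USE', agentIds };
}
```

Under the v1 delete-and-recreate flow, a rotation would run this check, unbind the listed agents, delete the key, create the replacement, and rebind.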

Revisit Conditions

Reopen when any of the following is true:

  • An org requests per-provider keys on a single agent (multi-model agent that spans providers with different BYOK keys).
  • The number of rotation events per month exceeds ~10 per org and operators report that delete-and-recreate is a burden.
  • A provider ships a BYOK mechanism incompatible with the gateway's providerOptions.gateway.byok surface.