Gateway Generation Logging

Every successful LLM call publishes a TokenConsumedEvent. The TokenConsumedHandler persists the basic usage data from the AI SDK result and, when a generationId is available, fetches richer metadata from the Vercel AI Gateway REST API (GET /v1/generation?id={id}) to enrich the log row before it is written.

What Gets Logged

The ai.token_usage_logs table gains three new columns from this feature:

| Column | Type | Description |
| --- | --- | --- |
| generation_id | TEXT | Vercel AI Gateway generation ID, extracted from providerMetadata.gateway.generationId |
| latency_ms | INTEGER | Inference latency in milliseconds, as reported by the gateway |
| gateway_info | JSONB | Full raw generation object from the gateway REST API, forward-compatible via .passthrough() |

A partial index on generation_id WHERE generation_id IS NOT NULL keeps lookups efficient while avoiding index overhead for rows without a gateway ID (for example, embedding requests and other non-chat actions).

IGatewayGenerationInfoClient Port

The fetch logic is abstracted behind a hexagonal port so tests can inject a stub and the adapter can be swapped without touching business logic:

```typescript
// apps/api/src/modules/ai/application/ports/gateway-generation-info-client.interface.ts

export const GATEWAY_GENERATION_INFO_CLIENT = Symbol('IGatewayGenerationInfoClient');

export interface IGatewayGenerationInfoClient {
  getGenerationInfo(params: {
    generationId: string;
    apiKey: string;
  }): Promise<GatewayGenerationInfo | null>;
}
```

VercelGatewayGenerationInfoClient is the production adapter. It calls https://ai-gateway.vercel.sh/v1/generation?id={id} with a Bearer token, validates the response body with the GatewayGenerationInfoSchema (Zod 4, .passthrough()), and implements exponential backoff with four total attempts and delays of 500 ms, 1500 ms, and 3000 ms before giving up.
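The retry policy can be sketched as a small generic helper. The helper name and shape here are illustrative, not the actual adapter code; only the attempt count and delay schedule come from the description above:

```typescript
// Illustrative sketch of the adapter's retry policy: four total attempts,
// waiting 500 ms, 1500 ms, and 3000 ms between failures before giving up.
// withRetry is a hypothetical name, not the real adapter method.
const RETRY_DELAYS_MS = [500, 1500, 3000];

async function withRetry<T>(
  fn: () => Promise<T>,
  delays: number[] = RETRY_DELAYS_MS,
): Promise<T | null> {
  for (let attempt = 0; attempt <= delays.length; attempt++) {
    try {
      return await fn();
    } catch {
      if (attempt === delays.length) {
        return null; // all attempts exhausted; caller degrades gracefully
      }
      await new Promise((resolve) => setTimeout(resolve, delays[attempt]));
    }
  }
  return null; // unreachable
}
```

Returning null rather than rethrowing matches the port's `Promise<GatewayGenerationInfo | null>` contract, so the handler never has to distinguish "gave up" from "not found".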

BYOK Cost Rule

When a BYOK-bound agent (provider vercel-ai-gateway) calls the gateway, the gateway reports two cost fields:

  • total_cost — includes any Vercel AI Gateway markup
  • upstream_inference_cost — the raw provider cost with no platform markup

The platform always records the cost the organization actually pays:

| BYOK binding | Recorded cost | Reason |
| --- | --- | --- |
| vercel-ai-gateway | upstream_inference_cost | Org pays the provider directly; Vercel markup is irrelevant |
| Any other / none | total_cost | Platform is billed by Vercel; full gateway cost applies |
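The branching in the table reduces to a few lines. This is a sketch with a hypothetical helper name; the real logic lives inside TokenConsumedHandler.buildLog():

```typescript
// Sketch of the BYOK cost rule. resolveRecordedCost and GatewayCost are
// illustrative names; only the branching mirrors the documented rule.
interface GatewayCost {
  total_cost: number | null;
  upstream_inference_cost: number | null;
}

function resolveRecordedCost(
  byokProvider: string | null,
  cost: GatewayCost,
): number | null {
  if (byokProvider === 'vercel-ai-gateway') {
    // Org pays the provider directly; Vercel's markup in total_cost
    // is irrelevant to what the organization actually spends.
    return cost.upstream_inference_cost;
  }
  // Platform is billed by Vercel, so the full gateway cost applies.
  return cost.total_cost;
}
```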

This rule is applied in TokenConsumedHandler.buildLog() and requires the handler to resolve the caller's own Vercel gateway API key so the fetch authenticates with the right account.

API Key Resolution for Generation Fetch

To fetch generation metadata the handler must present a valid Vercel AI Gateway API key:

  1. System / non-Vercel BYOK: Use the platform's own gateway key (AppConfigService.ai.gatewayApiKey).
  2. BYOK with vercel-ai-gateway provider: Decrypt the organization's saved credential via ApiKeyCredentialService.decrypt() and use that key. This ensures the fetch hits the correct account where the generation is stored.
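The two-branch strategy above can be sketched as follows. The dependency shapes are assumptions for illustration (the real code goes through AppConfigService and ApiKeyCredentialService); only the branching reflects the documented behavior:

```typescript
// Sketch of gateway API key selection. KeyResolutionDeps and
// resolveGatewayApiKey are hypothetical names.
interface KeyResolutionDeps {
  platformGatewayKey: string; // stands in for AppConfigService.ai.gatewayApiKey
  decryptOrgCredential: (orgId: string) => Promise<string>; // stands in for ApiKeyCredentialService.decrypt()
}

async function resolveGatewayApiKey(
  byokProvider: string | null,
  orgId: string,
  deps: KeyResolutionDeps,
): Promise<string> {
  if (byokProvider === 'vercel-ai-gateway') {
    // The generation is stored in the org's own gateway account, so
    // the fetch must authenticate with the org's decrypted credential.
    return deps.decryptOrgCredential(orgId);
  }
  // System traffic and non-Vercel BYOK use the platform's own key.
  return deps.platformGatewayKey;
}
```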

Graceful Degradation

All enrichment logic is best-effort. If the gateway client returns null (network error, 404, parse failure), the handler falls back to persisting only the data available in the original TokenConsumedEvent (input/output tokens, cost from the AI SDK result). The log row is always written — the enrichment columns are NULL in the degraded case.

TokenConsumedEvent received

  ├─ generationId present AND feature flag enabled?
  │     YES → fetch gateway info (with retry/backoff)
  │              │
  │              ├─ success → apply BYOK cost rule, map latency + gateway_info
  │              └─ error   → warn + continue with event data only

  └─ NO → persist event data only (generation_id = NULL, latency_ms = NULL)

No exception propagates from fetchGenerationInfo. Any error is logged at WARN level and discarded.

Feature Flag

AI_GATEWAY_HYDRATE_GENERATION_INFO is a boolean env var (default true) that controls whether the handler calls the gateway REST API at all. Setting it to false (or 0, no, off) disables generation fetching entirely — the event data is persisted as-is. This enables fast rollback without redeploying code.

```bash
# Disable generation hydration
AI_GATEWAY_HYDRATE_GENERATION_INFO=false
```
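The parsing rule (default true; false, 0, no, and off all disable) can be sketched as a pure function. The function name is illustrative:

```typescript
// Sketch of the flag semantics: unset means enabled; the listed
// falsy spellings disable hydration. Hypothetical helper name.
function isGenerationHydrationEnabled(raw: string | undefined): boolean {
  if (raw === undefined) return true; // default on
  return !['false', '0', 'no', 'off'].includes(raw.trim().toLowerCase());
}
```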

generationId Propagation Path

The generationId travels from the AI SDK response through several layers before reaching the handler:

VercelAiGatewayAdapter.run() / mapResultToGatewayResponse()
  → extracts providerMetadata.gateway.generationId
  → LlmGatewayGenerateResponse.generationId

DefaultAgentRunner.run()
  → AgentRunResult.generationId

Title / Guardrail sub-agents
  → TokenConsumedEvent.generationId

CreateMessageStreamCommandHandler
  → TokenConsumedEvent.generationId + byokProvider

The byokProvider field on TokenConsumedEvent carries the BYOK provider slug (for example vercel-ai-gateway) so the handler can select the correct API key resolution strategy.
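The extraction step at the top of this path can be sketched as a defensive lookup. The types here are simplified assumptions about the AI SDK's providerMetadata shape:

```typescript
// Sketch of pulling providerMetadata.gateway.generationId, returning
// null when the gateway block or the id is absent. Illustrative name.
type ProviderMetadata = Record<string, Record<string, unknown>> | undefined;

function extractGenerationId(providerMetadata: ProviderMetadata): string | null {
  const id = providerMetadata?.gateway?.generationId;
  return typeof id === 'string' ? id : null;
}
```

A null result here is what routes the handler down the "persist event data only" branch of the flow above.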

Schema Forward-Compatibility

GatewayGenerationInfoSchema is defined with .passthrough():

```typescript
export const GatewayGenerationInfoSchema = z
  .object({
    id: z.string(),
    total_cost: z.number().nullable().optional(),
    upstream_inference_cost: z.number().nullable().optional(),
    is_byok: z.boolean().optional(),
    tokens_prompt: z.number().optional(),
    tokens_completion: z.number().optional(),
    // ... other known fields
  })
  .passthrough();
```

Unknown fields from future gateway API additions are preserved and stored in the gateway_info JSONB column rather than being stripped. This avoids a deploy requirement when the gateway adds new fields.

Database Migration

The migration at 1745350000000-add-gateway-generation-logging.ts adds the three columns and the partial index to ai.token_usage_logs.

Migrations are run manually

Do not run migrations automatically. The team runs pnpm typeorm migration:run in each environment as part of the deploy process.

down() removes the index first and then drops the three columns in reverse order, which restores the table to its pre-migration state without data loss (the columns are all nullable).
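The DDL implied by the column list and index description can be sketched as follows. The statements and the index name are inferred for illustration; the actual migration file may differ:

```typescript
// Sketch of the migration's forward and reverse DDL. The index name
// idx_token_usage_logs_generation_id is an assumption.
const UP_STATEMENTS = [
  `ALTER TABLE ai.token_usage_logs ADD COLUMN generation_id TEXT`,
  `ALTER TABLE ai.token_usage_logs ADD COLUMN latency_ms INTEGER`,
  `ALTER TABLE ai.token_usage_logs ADD COLUMN gateway_info JSONB`,
  `CREATE INDEX idx_token_usage_logs_generation_id
     ON ai.token_usage_logs (generation_id)
     WHERE generation_id IS NOT NULL`,
];

// down() reverses the order: drop the index first, then the columns.
const DOWN_STATEMENTS = [
  `DROP INDEX ai.idx_token_usage_logs_generation_id`,
  `ALTER TABLE ai.token_usage_logs DROP COLUMN gateway_info`,
  `ALTER TABLE ai.token_usage_logs DROP COLUMN latency_ms`,
  `ALTER TABLE ai.token_usage_logs DROP COLUMN generation_id`,
];
```

Because every added column is nullable, running down() loses only the enrichment data in those columns, never the base usage rows.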

See Also