Org Usage Analytics API

Five read-only endpoints that aggregate token consumption data for the AI Usage Dashboard. All paths are under the /ai/org-usage prefix.

Pre-existing endpoints: GET /ai/org-usage/summary, GET /ai/org-usage/daily, and GET /ai/org-usage/by-agent predate this feature. This document covers only the five endpoints added with the AI Usage Dashboard change.

Authentication

All endpoints require:

  • A valid Authorization: Bearer <jwt> session token.
  • A valid x-org-id header (or the orgId embedded in the JWT).
  • A CASL policy that allows read on the ai.agent subject.

The JWT's agencyId claim must match the organization making the request. Mismatches return 403 Forbidden.


Common query parameters

The following parameters appear on all five endpoints:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| from | ISO-8601 datetime | required | Start of the analysis window (inclusive). |
| to | ISO-8601 datetime | required | End of the analysis window (exclusive). from must be earlier than to. |
| source | system \| byok \| all | all (or system for cache-savings) | Which token pool to include. system = platform tokens; byok = org-supplied provider keys; all = both. |
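
As a rough illustration of how a client might assemble these requests, the TypeScript sketch below builds the URL and headers from the common parameters and the authentication requirements above. The helper name buildOrgUsageRequest, its argument shape, and the base URL passed to it are assumptions made for this example; they are not part of the API.

```typescript
// Hypothetical client-side helper; names and shapes are illustrative only.
type TokenSource = "system" | "byok" | "all";

interface CommonParams {
  from: Date;
  to: Date;
  source?: TokenSource; // omitted: the server applies the endpoint default
}

interface OrgUsageRequest {
  url: string;
  headers: Record<string, string>;
}

function buildOrgUsageRequest(
  baseUrl: string,
  endpoint: string,
  params: CommonParams,
  jwt: string,
  orgId: string,
): OrgUsageRequest {
  // The API requires `from` to be strictly earlier than `to`.
  if (params.from.getTime() >= params.to.getTime()) {
    throw new RangeError("from must be earlier than to");
  }
  const pairs: string[] = [
    `from=${encodeURIComponent(params.from.toISOString())}`,
    `to=${encodeURIComponent(params.to.toISOString())}`,
  ];
  if (params.source !== undefined) {
    pairs.push(`source=${params.source}`);
  }
  return {
    url: `${baseUrl}/ai/org-usage/${endpoint}?${pairs.join("&")}`,
    headers: { Authorization: `Bearer ${jwt}`, "x-org-id": orgId },
  };
}
```

Validating the window on the client avoids a round trip that would otherwise end in a 400 response.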

GET /ai/org-usage/time-series

Returns a time-bucketed series of token consumption and cost, with an optional breakdown by action type or model.

Query parameters

| Parameter | Type | Default | Constraints | Description |
| --- | --- | --- | --- | --- |
| from | ISO-8601 datetime | required | | Window start. |
| to | ISO-8601 datetime | required | | Window end. |
| source | system \| byok \| all | all | | Token pool. |
| granularity | hour \| day \| week \| month | auto | optional | Bucket width. When omitted, derived automatically: ≤7 d → hour; 7–90 d → day; 90–365 d → week; >365 d → month. |
| groupBy | action \| model \| none | none | | Breakdown dimension. When none, the series map on each bucket is absent. |
| timezone | IANA timezone string | UTC | | Timezone for bucket alignment. The frontend sends Intl.DateTimeFormat().resolvedOptions().timeZone. |
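
The automatic granularity rule can be sketched as a small function. This is an illustration only: the function name is hypothetical, and the handling of windows that fall exactly on a boundary (for example exactly 90 days) is an assumption, since the ranges listed above overlap at their endpoints.

```typescript
type BucketWidth = "hour" | "day" | "week" | "month";

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper mirroring the documented thresholds.
// Assumption: a boundary value belongs to the finer granularity
// (7 d -> hour, 90 d -> day, 365 d -> week).
function deriveGranularity(from: Date, to: Date): BucketWidth {
  const days = (to.getTime() - from.getTime()) / MS_PER_DAY;
  if (days <= 7) return "hour";
  if (days <= 90) return "day";
  if (days <= 365) return "week";
  return "month";
}
```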

Response — 200 OK

```json
{
  "granularity": "day",
  "timezone": "America/Mexico_City",
  "buckets": [
    {
      "bucket": "2026-04-01T00:00:00.000Z",
      "totalTokens": 142500,
      "totalCostUsd": 0.427,
      "series": {
        "user_interaction": { "totalTokens": 95000, "totalCostUsd": 0.285 },
        "guardrail_check":  { "totalTokens": 47500, "totalCostUsd": 0.142 }
      }
    }
  ]
}
```

| Field | Type | Description |
| --- | --- | --- |
| granularity | string | Effective bucket width. When granularity is omitted from the request, this is the auto-derived value. |
| timezone | string | IANA zone used for alignment (echoes the request value). |
| buckets | array | Ordered list of time buckets, oldest first. |
| buckets[].bucket | ISO-8601 datetime | Bucket start timestamp. |
| buckets[].totalTokens | integer ≥ 0 | Total tokens across all sources in this bucket. |
| buckets[].totalCostUsd | number ≥ 0 | Total cost in USD. |
| buckets[].series | record (optional) | Present when groupBy is not none. Keys are dimension values (action names or model IDs). |
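
For charting, a client will often want one array per dimension key rather than one map per bucket. The sketch below shows one way to pivot the response. The interface and function names are illustrative assumptions, and filling a missing key with 0 is a client-side choice rather than documented API behavior.

```typescript
interface SeriesPoint {
  totalTokens: number;
  totalCostUsd: number;
}

interface TimeBucket {
  bucket: string;
  totalTokens: number;
  totalCostUsd: number;
  series?: Record<string, SeriesPoint>; // absent when groupBy is none
}

interface TimeSeriesResponse {
  granularity: string;
  timezone: string;
  buckets: TimeBucket[];
}

// Returns, for each dimension key, a token array aligned with resp.buckets.
// A key that is missing from a bucket contributes 0 at that position.
function pivotTokensByKey(resp: TimeSeriesResponse): Record<string, number[]> {
  const out: Record<string, number[]> = {};
  resp.buckets.forEach((b, i) => {
    const series: Record<string, SeriesPoint> = b.series ?? {};
    for (const key of Object.keys(series)) {
      const point = series[key];
      if (point === undefined) continue;
      let row = out[key];
      if (row === undefined) {
        row = resp.buckets.map(() => 0);
        out[key] = row;
      }
      row[i] = point.totalTokens;
    }
  });
  return out;
}
```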

Cache policy

Redis key: ai:org-usage:time-series:{orgId}:{from}:{to}:{source}:{groupBy}:{timezone}:{granularity}. TTL: 300 s.
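
The key layout can be illustrated with a small builder. This is a sketch under assumptions: the function name is hypothetical, and how the server serializes each parameter (for example an omitted granularity) is not specified here.

```typescript
// TTL stated in this document for all five endpoints.
const CACHE_TTL_SECONDS = 300;

// Hypothetical helper: joins the fixed prefix, the endpoint name, the org ID,
// and the endpoint-specific parameters in the documented order.
function orgUsageCacheKey(endpoint: string, orgId: string, parts: string[]): string {
  return ["ai", "org-usage", endpoint, orgId, ...parts].join(":");
}
```

Because every query parameter is part of the key, two requests share a cache entry only when all of their parameters are identical.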

Errors

| Status | When |
| --- | --- |
| 400 | Invalid query parameters (e.g., from not earlier than to, unknown granularity). |
| 401 | Missing or invalid JWT. |
| 403 | Caller does not have read on ai.agent. |

Example

```bash
curl -X GET \
  "https://api.daramex.app/ai/org-usage/time-series?from=2026-04-01T00:00:00Z&to=2026-04-08T00:00:00Z&granularity=day&groupBy=action&timezone=America/Mexico_City" \
  -H "Authorization: Bearer <jwt>" \
  -H "x-org-id: 01927f3e-0000-7000-8000-000000000002"
```

GET /ai/org-usage/projection

Projects the month-end total cost using ordinary least-squares regression on daily spend. Requires at least 7 daily data points; returns a sentinel when the window is too short.
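
To make the idea concrete, here is a minimal sketch of a least-squares projection. It is not the service's implementation. It assumes the line is fitted to per-day spend (not cumulative spend), that days are contiguous and indexed from 0, and that negative predictions are clamped to zero. It also omits the prediction interval entirely. All names are hypothetical.

```typescript
type ProjectionSketch =
  | { status: "ok"; actual: number; projected: number }
  | { status: "insufficient_data"; minDaysRequired: 7; daysAvailable: number };

// dailySpend[i] is the cost in USD on day i of the period (0-based, no gaps).
function projectMonthEnd(dailySpend: number[], daysInMonth: number): ProjectionSketch {
  const n = dailySpend.length;
  if (n < 7) {
    return { status: "insufficient_data", minDaysRequired: 7, daysAvailable: n };
  }
  const actual = dailySpend.reduce((sum, y) => sum + y, 0);
  const meanX = (n - 1) / 2;
  const meanY = actual / n;
  let sxy = 0;
  let sxx = 0;
  dailySpend.forEach((y, x) => {
    sxy += (x - meanX) * (y - meanY);
    sxx += (x - meanX) * (x - meanX);
  });
  // Ordinary least squares: slope = Sxy / Sxx (Sxx > 0 because n >= 7).
  const slope = sxy / sxx;
  const intercept = meanY - slope * meanX;
  // Add the fitted spend for each remaining day, clamped at zero.
  let projected = actual;
  for (let x = n; x < daysInMonth; x++) {
    projected += Math.max(0, intercept + slope * x);
  }
  return { status: "ok", actual, projected };
}
```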

Query parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| from | ISO-8601 datetime | required | Window start. |
| to | ISO-8601 datetime | required | Window end. |
| source | system \| byok \| all | all | Token pool. |

Response — 200 OK

The response is a discriminated union on the status field.

When projection succeeds (status: "ok"):

```json
{
  "status": "ok",
  "actual": 12.50,
  "projected": 38.75,
  "lowerBound": 32.10,
  "upperBound": 45.40,
  "confidencePct": 80
}
```

| Field | Type | Description |
| --- | --- | --- |
| actual | number | Cumulative cost already incurred in the window (USD). |
| projected | number | Projected end-of-period total (USD). |
| lowerBound | number | Lower bound of the 80% prediction interval (USD). |
| upperBound | number | Upper bound of the 80% prediction interval (USD). |
| confidencePct | 80 | Fixed literal; always 80. |

When data is insufficient (status: "insufficient_data"):

```json
{
  "status": "insufficient_data",
  "minDaysRequired": 7,
  "daysAvailable": 3
}
```

| Field | Type | Description |
| --- | --- | --- |
| minDaysRequired | 7 | Fixed minimum number of daily data points required. |
| daysAvailable | integer ≥ 0 | Number of distinct days with data in the requested window. |

Note: The frontend should render an empty/informational state ("Datos insuficientes") when status is insufficient_data.
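
On the client, the discriminated union maps naturally onto a TypeScript union narrowed by status. The sketch below is illustrative: the type and function names and the label wording are assumptions, while the field names come from the tables above.

```typescript
type ProjectionResponse =
  | {
      status: "ok";
      actual: number;
      projected: number;
      lowerBound: number;
      upperBound: number;
      confidencePct: 80;
    }
  | { status: "insufficient_data"; minDaysRequired: 7; daysAvailable: number };

// Hypothetical formatter: narrows on `status` before touching variant fields.
function projectionLabel(resp: ProjectionResponse): string {
  switch (resp.status) {
    case "ok":
      return (
        `$${resp.projected.toFixed(2)} ` +
        `(${resp.confidencePct}% interval: $${resp.lowerBound.toFixed(2)} to $${resp.upperBound.toFixed(2)})`
      );
    case "insufficient_data":
      return `Datos insuficientes (${resp.daysAvailable}/${resp.minDaysRequired} days)`;
  }
}
```

Switching on status lets the compiler reject any access to projected or lowerBound in the insufficient_data branch.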

Cache policy

Redis key: ai:org-usage:projection:{orgId}:{from}:{to}:{source}. TTL: 300 s.

Errors

| Status | When |
| --- | --- |
| 400 | Invalid query parameters. |
| 401 | Missing or invalid JWT. |
| 403 | Caller does not have read on ai.agent. |

Example

```bash
curl -X GET \
  "https://api.daramex.app/ai/org-usage/projection?from=2026-04-01T00:00:00Z&to=2026-04-30T23:59:59Z&source=all" \
  -H "Authorization: Bearer <jwt>" \
  -H "x-org-id: 01927f3e-0000-7000-8000-000000000002"
```

GET /ai/org-usage/model-efficiency

Returns per-model aggregates (cost, tokens, latency) for the top N models sorted by cost descending. Used to render the models bar chart and cost-vs-latency scatter.

Query parameters

| Parameter | Type | Default | Constraints | Description |
| --- | --- | --- | --- | --- |
| from | ISO-8601 datetime | required | | Window start. |
| to | ISO-8601 datetime | required | | Window end. |
| source | system \| byok \| all | all | | Token pool. |
| action | user_interaction \| guardrail_check \| title_generation \| rag_query_embedding \| document_indexing | | optional | Filter to a single token action type. When absent, all actions are included. |
| limit | integer | 20 | 1–50 | Maximum number of models returned. |

Response — 200 OK

An array of model entries sorted by totalCostUsd descending.

```json
[
  {
    "modelId": "anthropic/claude-sonnet-4",
    "avgLatencyMs": 1240.5,
    "p95LatencyMs": 3800.0,
    "totalCostUsd": 24.75,
    "totalTokens": 8250000,
    "executionsCount": 3410
  }
]
```

| Field | Type | Description |
| --- | --- | --- |
| modelId | string | Vercel AI Gateway model identifier. |
| avgLatencyMs | number \| null | Average execution latency in ms. null when no latency data is available (never 0). |
| p95LatencyMs | number \| null | 95th-percentile latency in ms. null when no latency data is available. |
| totalCostUsd | number | Total cost for this model in the window (USD). |
| totalTokens | integer | Total tokens consumed by this model. |
| executionsCount | integer | Number of individual executions (rows in token_usage_logs). |
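
Because latency fields are null and never 0 when data is missing, a client should skip those models in the cost-vs-latency scatter instead of plotting them at the origin. A minimal sketch, with hypothetical type and function names and an assumed axis assignment (latency on x, cost on y):

```typescript
interface ModelEfficiencyEntry {
  modelId: string;
  avgLatencyMs: number | null;
  p95LatencyMs: number | null;
  totalCostUsd: number;
  totalTokens: number;
  executionsCount: number;
}

interface ScatterPoint {
  modelId: string;
  x: number; // average latency in ms
  y: number; // total cost in USD
}

// Drops entries without latency data; the bar chart can still use them.
function toScatterPoints(entries: ModelEfficiencyEntry[]): ScatterPoint[] {
  const points: ScatterPoint[] = [];
  for (const entry of entries) {
    if (entry.avgLatencyMs === null) continue;
    points.push({ modelId: entry.modelId, x: entry.avgLatencyMs, y: entry.totalCostUsd });
  }
  return points;
}
```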

Cache policy

Redis key: ai:org-usage:model-efficiency:{orgId}:{from}:{to}:{source}:{action}:{limit}. TTL: 300 s.

Errors

| Status | When |
| --- | --- |
| 400 | Invalid query parameters (e.g., limit out of range, unknown action). |
| 401 | Missing or invalid JWT. |
| 403 | Caller does not have read on ai.agent. |

Example

```bash
curl -X GET \
  "https://api.daramex.app/ai/org-usage/model-efficiency?from=2026-04-01T00:00:00Z&to=2026-04-30T23:59:59Z&limit=10" \
  -H "Authorization: Bearer <jwt>" \
  -H "x-org-id: 01927f3e-0000-7000-8000-000000000002"
```

GET /ai/org-usage/heatmap

Returns a 7 × 24 heatmap of token consumption by day-of-week and hour-of-day. Cells with zero usage are omitted; the client fills absent positions with zeros.

Query parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| from | ISO-8601 datetime | required | Window start. |
| to | ISO-8601 datetime | required | Window end. |
| source | system \| byok \| all | all | Token pool. |
| timezone | IANA timezone string | UTC | Timezone for hour-of-day bucketing. The frontend sends Intl.DateTimeFormat().resolvedOptions().timeZone. Cross-module resolution (e.g. from the Identity module) is intentionally avoided to prevent feature-module coupling. |

Response — 200 OK

```json
{
  "timezone": "America/Mexico_City",
  "cells": [
    { "dayOfWeek": 1, "hour": 9,  "totalTokens": 51200, "totalCostUsd": 0.154 },
    { "dayOfWeek": 1, "hour": 10, "totalTokens": 72400, "totalCostUsd": 0.217 }
  ]
}
```

| Field | Type | Description |
| --- | --- | --- |
| timezone | string | IANA zone used for bucketing (echoes the request value; defaults to UTC). |
| cells | array | Sparse list of cells with non-zero usage. |
| cells[].dayOfWeek | integer 0–6 | Day of week: 0 = Sunday … 6 = Saturday (matches EXTRACT(dow ...) SQL semantics). |
| cells[].hour | integer 0–23 | Wall-clock hour in the specified timezone. |
| cells[].totalTokens | integer ≥ 0 | Total tokens in this (day, hour) cell. |
| cells[].totalCostUsd | number ≥ 0 | Total cost in this cell (USD). |
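
Since zero-usage cells are omitted, the client has to expand the sparse list into a full 7 × 24 grid before rendering. One possible way to do that is sketched below. The names are hypothetical, and silently ignoring out-of-range cells is a defensive choice of this example rather than documented behavior.

```typescript
interface HeatmapCell {
  dayOfWeek: number; // 0 = Sunday … 6 = Saturday
  hour: number; // 0–23
  totalTokens: number;
  totalCostUsd: number;
}

// Returns grid[dayOfWeek][hour] = totalTokens, with 0 for absent cells.
function toDenseGrid(cells: HeatmapCell[]): number[][] {
  const grid: number[][] = [];
  for (let day = 0; day < 7; day++) {
    grid.push(new Array<number>(24).fill(0));
  }
  for (const cell of cells) {
    const row = grid[cell.dayOfWeek];
    const hourOk = Number.isInteger(cell.hour) && cell.hour >= 0 && cell.hour <= 23;
    if (row === undefined || !hourOk) continue; // skip malformed cells
    row[cell.hour] = cell.totalTokens;
  }
  return grid;
}
```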

Cache policy

Redis key: ai:org-usage:heatmap:{orgId}:{from}:{to}:{source}:{timezone}. TTL: 300 s.

Note: The frontend renders the heatmap with a logarithmic color scale for better visual distribution when usage is concentrated in a few cells.
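
The exact color mapping used by the frontend is not specified here. As an assumption-laden sketch, a logarithmic scale can normalize each cell to a value in [0, 1] using log1p, so that zero maps to zero and the busiest cell maps to one. The function name is hypothetical.

```typescript
// Maps a cell value to a color intensity in [0, 1] on a log scale.
// log1p(x) = ln(1 + x), which is defined at x = 0.
function logIntensity(value: number, max: number): number {
  if (max <= 0 || value <= 0) return 0;
  return Math.min(1, Math.log1p(value) / Math.log1p(max));
}
```

Compared with a linear scale, small but non-zero cells receive a visibly higher intensity, which keeps them distinguishable from empty cells when a few hours dominate usage.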

Errors

| Status | When |
| --- | --- |
| 400 | Invalid query parameters. |
| 401 | Missing or invalid JWT. |
| 403 | Caller does not have read on ai.agent. |

Example

```bash
curl -X GET \
  "https://api.daramex.app/ai/org-usage/heatmap?from=2026-04-01T00:00:00Z&to=2026-04-30T23:59:59Z&timezone=America/Mexico_City" \
  -H "Authorization: Bearer <jwt>" \
  -H "x-org-id: 01927f3e-0000-7000-8000-000000000002"
```

GET /ai/org-usage/cache-savings

Returns a summary of prompt-cache savings: how many tokens were served from cache, the estimated cost savings compared to regular input pricing, and the cache hit rate.

Note: This endpoint defaults source to system because BYOK calls go directly to the user's provider and do not produce platform-reported cache metrics.

Query parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| from | ISO-8601 datetime | required | Window start. |
| to | ISO-8601 datetime | required | Window end. |
| source | system \| byok \| all | system | Token pool. BYOK usage typically yields 0 savings since the platform has no visibility into the provider's cache metrics. |

Response — 200 OK

```json
{
  "estimatedSavingsUsd": 1.85,
  "hitRate": 0.34,
  "totalCacheReadTokens": 1020000,
  "totalInputTokens": 3000000,
  "skippedModelsCount": 0
}
```

| Field | Type | Description |
| --- | --- | --- |
| estimatedSavingsUsd | number ≥ 0 | SUM(cacheReadTokens × (input_price − cache_read_price)) across all models with pricing data (USD). |
| hitRate | number in [0, 1] | SUM(cacheReadTokens) / SUM(totalInputTokens). |
| totalCacheReadTokens | integer ≥ 0 | Total tokens served from the prompt cache in the window. |
| totalInputTokens | integer ≥ 0 | Total input tokens (cache reads + regular inputs). |
| skippedModelsCount | integer ≥ 0 | Number of distinct models excluded from the savings formula because their pricing row is absent in gateway_models. |

Graceful degradation: when skippedModelsCount > 0, estimatedSavingsUsd is a partial figure. The frontend does not show an error — it surfaces skippedModelsCount as a footnote (future work).
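
The formulas above can be sketched as follows. This is an illustration, not the service's code. It assumes one usage row per model, prices expressed in USD per token, a hit rate of 0 when there are no input tokens, and that models without pricing still count toward the token totals. The type and function names are hypothetical.

```typescript
interface ModelCacheUsage {
  modelId: string;
  cacheReadTokens: number;
  totalInputTokens: number;
}

interface ModelPricing {
  inputPrice: number; // USD per token (assumed unit)
  cacheReadPrice: number; // USD per token (assumed unit)
}

interface CacheSavingsSummary {
  estimatedSavingsUsd: number;
  hitRate: number;
  totalCacheReadTokens: number;
  totalInputTokens: number;
  skippedModelsCount: number;
}

function summarizeCacheSavings(
  usage: ModelCacheUsage[],
  pricing: Record<string, ModelPricing | undefined>,
): CacheSavingsSummary {
  let savings = 0;
  let cacheRead = 0;
  let input = 0;
  let skipped = 0;
  for (const row of usage) {
    cacheRead += row.cacheReadTokens;
    input += row.totalInputTokens;
    const price = pricing[row.modelId];
    if (price === undefined) {
      skipped += 1; // no pricing row: excluded from savings only
      continue;
    }
    savings += row.cacheReadTokens * (price.inputPrice - price.cacheReadPrice);
  }
  return {
    estimatedSavingsUsd: savings,
    hitRate: input > 0 ? cacheRead / input : 0,
    totalCacheReadTokens: cacheRead,
    totalInputTokens: input,
    skippedModelsCount: skipped,
  };
}
```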

Cache policy

Redis key: ai:org-usage:cache-savings:{orgId}:{from}:{to}:{source}. TTL: 300 s.

Errors

| Status | When |
| --- | --- |
| 400 | Invalid query parameters. |
| 401 | Missing or invalid JWT. |
| 403 | Caller does not have read on ai.agent. |

Example

```bash
curl -X GET \
  "https://api.daramex.app/ai/org-usage/cache-savings?from=2026-04-01T00:00:00Z&to=2026-04-30T23:59:59Z&source=system" \
  -H "Authorization: Bearer <jwt>" \
  -H "x-org-id: 01927f3e-0000-7000-8000-000000000002"
```