Org Usage Analytics API

This API exposes five read-only endpoints that aggregate token-consumption data for the AI Usage Dashboard. All paths share the /ai/org-usage prefix.

Pre-existing endpoints: GET /ai/org-usage/summary, GET /ai/org-usage/daily, and GET /ai/org-usage/by-agent existed before this feature. This document covers only the five new endpoints added with the AI Usage Dashboard change.

Authentication

All endpoints require:

A valid Authorization: Bearer <jwt> session token.
A valid x-org-id header (or the orgId embedded in the JWT).
A CASL policy that allows read on the ai.agent subject.

The JWT's agencyId claim must match the organization making the request. Mismatches return 403 Forbidden.
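As a rough illustration of the requirements above, a client could assemble the two request headers as follows. The helper name `buildOrgUsageHeaders` and its error message are hypothetical and are not part of the API.

```typescript
// Hypothetical helper: assembles the headers listed in the Authentication section.
// The function name and error text are illustrative assumptions.
function buildOrgUsageHeaders(jwt: string, orgId?: string): Record<string, string> {
  if (jwt.trim() === "") {
    throw new Error("A session JWT is required");
  }
  const headers: Record<string, string> = {
    Authorization: `Bearer ${jwt}`,
  };
  // x-org-id may be left out when the JWT already embeds the orgId.
  if (orgId !== undefined) {
    headers["x-org-id"] = orgId;
  }
  return headers;
}
```

The CASL policy check and the organization match happen on the server, so a client can only detect them through a 403 response.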

Common query parameters

The following parameters appear on all five endpoints:

Parameter	Type	Default	Description
`from`	ISO-8601 datetime	required	Start of the analysis window (inclusive).
`to`	ISO-8601 datetime	required	End of the analysis window (exclusive). `from` must be earlier than `to`.
`source`	`system` \| `byok` \| `all`	`all` (or `system` for cache-savings)	Which token pool to include. `system` = platform tokens; `byok` = org-supplied provider keys; `all` = both.
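A minimal sketch of how a client might validate the shared window before sending a request, so that an obvious 400 is caught locally. The names `UsageWindow` and `buildCommonQuery` are assumptions made for this example.

```typescript
type UsageSource = "system" | "byok" | "all";

interface UsageWindow {
  from: string; // ISO-8601 datetime, inclusive
  to: string; // ISO-8601 datetime, exclusive
  source?: UsageSource;
}

// Hypothetical client-side check mirroring the documented rule `from` < `to`.
// Date.parse is lenient, so the server remains the authority on format validity.
function buildCommonQuery(w: UsageWindow): string {
  const fromMs = Date.parse(w.from);
  const toMs = Date.parse(w.to);
  if (Number.isNaN(fromMs) || Number.isNaN(toMs)) {
    throw new Error("from and to must be ISO-8601 datetimes");
  }
  if (fromMs >= toMs) {
    throw new Error("from must be earlier than to");
  }
  const parts = [
    `from=${encodeURIComponent(w.from)}`,
    `to=${encodeURIComponent(w.to)}`,
  ];
  // Omitting source lets the server apply its per-endpoint default.
  if (w.source !== undefined) {
    parts.push(`source=${encodeURIComponent(w.source)}`);
  }
  return parts.join("&");
}
```

Leaving `source` out of the query matters for cache-savings, whose default is `system` rather than `all`.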

GET /ai/org-usage/time-series

Returns a time-bucketed series of token consumption and cost, with an optional breakdown by action type or model.

Query parameters

Parameter	Type	Default	Constraints	Description
`from`	ISO-8601 datetime	required	—	Window start.
`to`	ISO-8601 datetime	required	—	Window end.
`source`	`system` \| `byok` \| `all`	`all`	—	Token pool.
`granularity`	`hour` \| `day` \| `week` \| `month`	auto	optional	Bucket width. When omitted, derived automatically: ≤7 d → `hour`; 7–90 d → `day`; 90–365 d → `week`; >365 d → `month`.
`groupBy`	`action` \| `model` \| `none`	`none`	—	Breakdown dimension. When `none`, the `series` map on each bucket is absent.
`timezone`	IANA timezone string	`UTC`	—	Timezone for bucket alignment. The frontend sends `Intl.DateTimeFormat().resolvedOptions().timeZone`.
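The automatic granularity rule in the table can be sketched as a small function. This is an approximation under stated assumptions: the window length is measured in elapsed 24-hour days, and a window of exactly 90 days falls into `day` (upper bounds treated as inclusive, following the documented `≤7 d` and `>365 d` edges). The server may resolve those edges differently.

```typescript
type Granularity = "hour" | "day" | "week" | "month";

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Sketch of the documented auto-derivation of `granularity`.
// Assumption: upper bounds are inclusive (7 d -> hour, 90 d -> day, 365 d -> week).
function deriveGranularity(from: string, to: string): Granularity {
  const days = (Date.parse(to) - Date.parse(from)) / MS_PER_DAY;
  if (days <= 7) return "hour";
  if (days <= 90) return "day";
  if (days <= 365) return "week";
  return "month";
}
```

A client that needs a specific bucket width should pass `granularity` explicitly rather than rely on this derivation.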

Response — 200 OK

json

{
  "granularity": "day",
  "timezone": "America/Mexico_City",
  "buckets": [
    {
      "bucket": "2026-04-01T00:00:00.000Z",
      "totalTokens": 142500,
      "totalCostUsd": 0.427,
      "series": {
        "user_interaction": { "totalTokens": 95000, "totalCostUsd": 0.285 },
        "guardrail_check":  { "totalTokens": 47500, "totalCostUsd": 0.142 }
      }
    }
  ]
}

Field	Type	Description
`granularity`	string	Effective bucket width: the requested value, or the auto-derived one when `granularity` is omitted.
`timezone`	string	IANA zone used for alignment (echoes the request value).
`buckets`	array	Ordered list of time buckets, oldest first.
`buckets[].bucket`	ISO-8601 datetime	Bucket start timestamp.
`buckets[].totalTokens`	integer ≥ 0	Total tokens in this bucket for the selected `source`.
`buckets[].totalCostUsd`	number ≥ 0	Total cost in USD.
`buckets[].series`	record (optional)	Present when `groupBy` is not `none`. Keys are dimension values (action names or model IDs).
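Because `series` is optional and a dimension value may be missing from individual buckets, a chart layer usually has to normalize the response first. The following sketch shows one way to do that. The interface and function names are assumptions for illustration; only the field names come from the table above.

```typescript
interface SeriesPoint {
  totalTokens: number;
  totalCostUsd: number;
}

interface TimeBucket {
  bucket: string;
  totalTokens: number;
  totalCostUsd: number;
  series?: Record<string, SeriesPoint>; // absent when groupBy is `none`
}

interface TimeSeriesResponse {
  granularity: string;
  timezone: string;
  buckets: TimeBucket[];
}

// Collects every dimension value seen in any bucket, sorted for a stable legend order.
function seriesKeys(res: TimeSeriesResponse): string[] {
  const keys = new Set<string>();
  for (const b of res.buckets) {
    for (const k of Object.keys(b.series ?? {})) {
      keys.add(k);
    }
  }
  return Array.from(keys).sort();
}

// Returns one token count per bucket for a dimension value, using 0 where it is absent.
function tokensFor(res: TimeSeriesResponse, key: string): number[] {
  return res.buckets.map((b) => b.series?.[key]?.totalTokens ?? 0);
}
```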

Cache policy

Redis key: ai:org-usage:time-series:{orgId}:{from}:{to}:{source}:{groupBy}:{timezone}:{granularity}. TTL: 300 s.

Errors

Status	When
400	Invalid query parameters (e.g., `from` ≥ `to`, unknown `granularity`).
401	Missing or invalid JWT.
403	Caller does not have `read` on `ai.agent`.

Example

bash

curl -X GET \
  "https://api.daramex.app/ai/org-usage/time-series?from=2026-04-01T00:00:00Z&to=2026-04-08T00:00:00Z&granularity=day&groupBy=action&timezone=America/Mexico_City" \
  -H "Authorization: Bearer <jwt>" \
  -H "x-org-id: 01927f3e-0000-7000-8000-000000000002"

GET /ai/org-usage/projection

Projects the month-end total cost using ordinary least-squares regression on daily spend. Requires at least 7 daily data points; when fewer are available, the endpoint returns an `insufficient_data` response instead of a projection.
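To make the regression idea concrete, here is one plausible reading of it, not the server's confirmed implementation: fit a straight line to daily spend by ordinary least squares, then add the fitted values for the remaining days to the spend already incurred. The function names, the `daysInPeriod` parameter, and the clamping of negative predictions to zero are all assumptions. The prediction interval is deliberately left out.

```typescript
interface LinearFit {
  intercept: number;
  slope: number;
}

// Ordinary least squares for y = intercept + slope * x, with x = 0, 1, ..., n - 1.
function fitDailySpend(daily: number[]): LinearFit {
  const n = daily.length;
  if (n < 2) throw new Error("need at least two points");
  const meanX = (n - 1) / 2;
  const meanY = daily.reduce((s, y) => s + y, 0) / n;
  let sxy = 0;
  let sxx = 0;
  daily.forEach((y, x) => {
    sxy += (x - meanX) * (y - meanY);
    sxx += (x - meanX) ** 2;
  });
  const slope = sxy / sxx;
  return { intercept: meanY - slope * meanX, slope };
}

type ProjectionSketch =
  | { status: "ok"; actual: number; projected: number }
  | { status: "insufficient_data"; minDaysRequired: 7; daysAvailable: number };

// Assumed approach: projected = spend so far + fitted spend for each remaining day.
function projectTotal(daily: number[], daysInPeriod: number): ProjectionSketch {
  if (daily.length < 7) {
    return { status: "insufficient_data", minDaysRequired: 7, daysAvailable: daily.length };
  }
  const { intercept, slope } = fitDailySpend(daily);
  const actual = daily.reduce((s, y) => s + y, 0);
  let remaining = 0;
  for (let x = daily.length; x < daysInPeriod; x++) {
    remaining += Math.max(0, intercept + slope * x); // assumption: spend is never negative
  }
  return { status: "ok", actual, projected: actual + remaining };
}
```

With ten days of constant spend at 1 USD and a 30-day period, the fitted slope is zero and the sketch projects 30 USD in total.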

Query parameters

Parameter	Type	Default	Description
`from`	ISO-8601 datetime	required	Window start.
`to`	ISO-8601 datetime	required	Window end.
`source`	`system` \| `byok` \| `all`	`all`	Token pool.

Response — 200 OK

The response is a discriminated union on the status field.

When projection succeeds (status: "ok"):

json

{
  "status": "ok",
  "actual": 12.50,
  "projected": 38.75,
  "lowerBound": 32.10,
  "upperBound": 45.40,
  "confidencePct": 80
}

Field	Type	Description
`actual`	number	Cumulative cost already incurred in the window (USD).
`projected`	number	Projected end-of-period total (USD).
`lowerBound`	number	Lower bound of the 80% prediction interval (USD).
`upperBound`	number	Upper bound of the 80% prediction interval (USD).
`confidencePct`	`80`	Fixed literal — always 80.

When data is insufficient (status: "insufficient_data"):

json

{
  "status": "insufficient_data",
  "minDaysRequired": 7,
  "daysAvailable": 3
}

Field	Type	Description
`minDaysRequired`	`7`	Fixed minimum number of daily data points required.
`daysAvailable`	integer ≥ 0	Number of distinct days with data in the requested window.

Note: The frontend should render an empty, informational state ("Datos insuficientes", Spanish for "Insufficient data") when status is insufficient_data.
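Since the response is a discriminated union on `status`, a TypeScript client can narrow it with a `switch`. The type below restates the two documented shapes. The function name `describeProjection` and the message formats are assumptions for illustration.

```typescript
type ProjectionResponse =
  | {
      status: "ok";
      actual: number;
      projected: number;
      lowerBound: number;
      upperBound: number;
      confidencePct: 80;
    }
  | { status: "insufficient_data"; minDaysRequired: 7; daysAvailable: number };

// Narrows on `status`; each branch only sees the fields of its own variant.
function describeProjection(p: ProjectionResponse): string {
  switch (p.status) {
    case "ok":
      return (
        `Projected $${p.projected.toFixed(2)} ` +
        `(${p.confidencePct}% interval: $${p.lowerBound.toFixed(2)} to $${p.upperBound.toFixed(2)})`
      );
    case "insufficient_data":
      return `Datos insuficientes (${p.daysAvailable}/${p.minDaysRequired})`;
  }
}
```

Narrowing this way makes the compiler reject any attempt to read `projected` on the `insufficient_data` variant.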

Cache policy

Redis key: ai:org-usage:projection:{orgId}:{from}:{to}:{source}. TTL: 300 s.

Errors

Status	When
400	Invalid query parameters.
401	Missing or invalid JWT.
403	Caller does not have `read` on `ai.agent`.

Example

bash

curl -X GET \
  "https://api.daramex.app/ai/org-usage/projection?from=2026-04-01T00:00:00Z&to=2026-04-30T23:59:59Z&source=all" \
  -H "Authorization: Bearer <jwt>" \
  -H "x-org-id: 01927f3e-0000-7000-8000-000000000002"

GET /ai/org-usage/model-efficiency

Returns per-model aggregates (cost, tokens, latency) for the top N models sorted by cost descending. Used to render the models bar chart and cost-vs-latency scatter.

Query parameters

Parameter	Type	Default	Constraints	Description
`from`	ISO-8601 datetime	required	—	Window start.
`to`	ISO-8601 datetime	required	—	Window end.
`source`	`system` \| `byok` \| `all`	`all`	—	Token pool.
`action`	`user_interaction` \| `guardrail_check` \| `title_generation` \| `rag_query_embedding` \| `document_indexing`	—	optional	Filter to a single token action type. When absent, all actions are included.
`limit`	integer	`20`	1–50	Maximum number of models returned.

Response — 200 OK

An array of model entries sorted by totalCostUsd descending.

json

[
  {
    "modelId": "anthropic/claude-sonnet-4",
    "avgLatencyMs": 1240.5,
    "p95LatencyMs": 3800.0,
    "totalCostUsd": 24.75,
    "totalTokens": 8250000,
    "executionsCount": 3410
  }
]

Field	Type	Description
`modelId`	string	Vercel AI Gateway model identifier.
`avgLatencyMs`	number \| null	Average execution latency in ms. `null` when no latency data is available (never `0`).
`p95LatencyMs`	number \| null	95th-percentile latency in ms. `null` when no latency data is available.
`totalCostUsd`	number	Total cost for this model in the window (USD).
`totalTokens`	integer	Total tokens consumed by this model.
`executionsCount`	integer	Number of individual executions (rows in `token_usage_logs`).
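The `null` latency convention matters for the cost-vs-latency scatter: a model without latency data should be left off the plot rather than drawn at 0 ms. A minimal sketch of that filtering follows. The `ScatterPoint` shape, the function name, and the derived cost-per-execution metric are assumptions, not part of the response.

```typescript
interface ModelEfficiencyEntry {
  modelId: string;
  avgLatencyMs: number | null;
  p95LatencyMs: number | null;
  totalCostUsd: number;
  totalTokens: number;
  executionsCount: number;
}

interface ScatterPoint {
  modelId: string;
  costPerExecutionUsd: number; // derived client-side; not returned by the API
  avgLatencyMs: number;
}

// Keeps only models that have latency data and at least one execution.
function toScatterPoints(entries: ModelEfficiencyEntry[]): ScatterPoint[] {
  const points: ScatterPoint[] = [];
  for (const e of entries) {
    // null means "no latency data", so the model is skipped instead of plotted at 0.
    if (e.avgLatencyMs === null || e.executionsCount === 0) continue;
    points.push({
      modelId: e.modelId,
      costPerExecutionUsd: e.totalCostUsd / e.executionsCount,
      avgLatencyMs: e.avgLatencyMs,
    });
  }
  return points;
}
```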

Cache policy

Redis key: ai:org-usage:model-efficiency:{orgId}:{from}:{to}:{source}:{action}:{limit}. TTL: 300 s.

Errors

Status	When
400	Invalid query parameters (e.g., `limit` out of range, unknown `action`).
401	Missing or invalid JWT.
403	Caller does not have `read` on `ai.agent`.

Example

bash

curl -X GET \
  "https://api.daramex.app/ai/org-usage/model-efficiency?from=2026-04-01T00:00:00Z&to=2026-05-01T00:00:00Z&limit=10" \
  -H "Authorization: Bearer <jwt>" \
  -H "x-org-id: 01927f3e-0000-7000-8000-000000000002"

GET /ai/org-usage/heatmap

Returns a 7 × 24 heatmap of token consumption by day-of-week and hour-of-day. Cells with zero usage are omitted; the client fills absent positions with zeros.

Query parameters

Parameter	Type	Default	Description
`from`	ISO-8601 datetime	required	Window start.
`to`	ISO-8601 datetime	required	Window end.
`source`	`system` \| `byok` \| `all`	`all`	Token pool.
`timezone`	IANA timezone string	`UTC`	Timezone for hour-of-day bucketing. The frontend sends `Intl.DateTimeFormat().resolvedOptions().timeZone`. Cross-module resolution (e.g. from the Identity module) is intentionally avoided to prevent feature-module coupling.

Response — 200 OK

json

{
  "timezone": "America/Mexico_City",
  "cells": [
    { "dayOfWeek": 1, "hour": 9,  "totalTokens": 51200, "totalCostUsd": 0.154 },
    { "dayOfWeek": 1, "hour": 10, "totalTokens": 72400, "totalCostUsd": 0.217 }
  ]
}

Field	Type	Description
`timezone`	string	IANA zone used for bucketing (echoes the request value; defaults to `UTC`).
`cells`	array	Sparse list of cells with non-zero usage.
`cells[].dayOfWeek`	integer 0–6	Day of week: 0 = Sunday … 6 = Saturday (matches `EXTRACT(dow ...)` SQL semantics).
`cells[].hour`	integer 0–23	Wall-clock hour in the specified timezone.
`cells[].totalTokens`	integer ≥ 0	Total tokens in this (day, hour) cell.
`cells[].totalCostUsd`	number ≥ 0	Total cost in this cell (USD).
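Because the response is sparse, the client has to expand it into the full 7 × 24 grid before rendering. One way to do this is sketched below. The function name `toDenseGrid` and the choice to ignore out-of-range cells silently are assumptions.

```typescript
interface HeatmapCell {
  dayOfWeek: number; // 0 = Sunday ... 6 = Saturday
  hour: number; // 0-23, wall-clock hour in the requested timezone
  totalTokens: number;
  totalCostUsd: number;
}

// Expands the sparse cell list into grid[dayOfWeek][hour], filling gaps with 0.
function toDenseGrid(cells: HeatmapCell[]): number[][] {
  const grid: number[][] = Array.from({ length: 7 }, () => new Array<number>(24).fill(0));
  for (const c of cells) {
    const row = grid[c.dayOfWeek];
    // Skip anything outside the documented ranges instead of growing the grid.
    if (row === undefined || !Number.isInteger(c.hour) || c.hour < 0 || c.hour > 23) continue;
    row[c.hour] = c.totalTokens;
  }
  return grid;
}
```

For the sample response above, `grid[1][9]` holds 51200 (Monday, 09:00) and every position not listed stays at 0.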

Cache policy

Redis key: ai:org-usage:heatmap:{orgId}:{from}:{to}:{source}:{timezone}. TTL: 300 s.

Note: The frontend renders the heatmap with a logarithmic color scale for better visual distribution when usage is concentrated in a few cells.
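The exact scale the frontend uses is not specified here. As an illustration of the idea, a logarithmic intensity can be computed with `log1p`, which keeps zero cells at zero and compresses large values. The function name and the normalization by the maximum cell are assumptions.

```typescript
// Maps a cell value to an intensity in [0, 1] on a logarithmic scale.
// Assumption: intensity is normalized against the largest cell in the grid.
function logIntensity(value: number, max: number): number {
  if (max <= 0 || value <= 0) return 0;
  return Math.min(1, Math.log1p(value) / Math.log1p(max));
}
```

With a maximum of 100, a cell holding 10 tokens gets an intensity of roughly 0.52 instead of the 0.10 that a linear scale would give, so low-usage cells remain visible.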

Errors

Status	When
400	Invalid query parameters.
401	Missing or invalid JWT.
403	Caller does not have `read` on `ai.agent`.

Example

bash

curl -X GET \
  "https://api.daramex.app/ai/org-usage/heatmap?from=2026-04-01T00:00:00Z&to=2026-05-01T00:00:00Z&timezone=America/Mexico_City" \
  -H "Authorization: Bearer <jwt>" \
  -H "x-org-id: 01927f3e-0000-7000-8000-000000000002"

GET /ai/org-usage/cache-savings

Returns a summary of prompt-cache savings: how many tokens were served from cache, the estimated cost savings compared to regular input pricing, and the cache hit rate.

Note: This endpoint defaults source to system because BYOK calls go directly to the user's provider and do not produce platform-reported cache metrics.

Query parameters

Parameter	Type	Default	Description
`from`	ISO-8601 datetime	required	Window start.
`to`	ISO-8601 datetime	required	Window end.
`source`	`system` \| `byok` \| `all`	`system`	Token pool. BYOK usage typically yields `0` savings since the platform has no visibility into the provider's cache metrics.

Response — 200 OK

json

{
  "estimatedSavingsUsd": 1.85,
  "hitRate": 0.34,
  "totalCacheReadTokens": 1020000,
  "totalInputTokens": 3000000,
  "skippedModelsCount": 0
}

Field	Type	Description
`estimatedSavingsUsd`	number ≥ 0	`SUM(cacheReadTokens × (input_price − cache_read_price))` across all models with pricing data (USD).
`hitRate`	number [0, 1]	`SUM(cacheReadTokens) / SUM(totalInputTokens)`.
`totalCacheReadTokens`	integer ≥ 0	Total tokens served from the prompt cache in the window.
`totalInputTokens`	integer ≥ 0	Total input tokens (cache reads + regular inputs).
`skippedModelsCount`	integer ≥ 0	Number of distinct models excluded from the savings formula because their pricing row is absent in `gateway_models`.

Graceful degradation: when skippedModelsCount > 0, estimatedSavingsUsd is a partial figure that covers only the models with pricing data. The frontend does not show an error; surfacing skippedModelsCount as a footnote is planned as future work.
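The formulas in the field table, together with the degradation rule, can be sketched as follows. This is an illustration under assumptions: prices are taken to be USD per token, models without pricing still count toward the token totals and the hit rate, and the names `ModelCacheUsage`, `ModelPricing`, and `summarizeCacheSavings` are invented for the example.

```typescript
interface ModelCacheUsage {
  modelId: string;
  cacheReadTokens: number;
  totalInputTokens: number;
}

interface ModelPricing {
  inputPrice: number; // assumed unit: USD per token
  cacheReadPrice: number; // assumed unit: USD per token
}

interface CacheSavingsSummary {
  estimatedSavingsUsd: number;
  hitRate: number;
  totalCacheReadTokens: number;
  totalInputTokens: number;
  skippedModelsCount: number;
}

function summarizeCacheSavings(
  usage: ModelCacheUsage[],
  pricing: Map<string, ModelPricing>,
): CacheSavingsSummary {
  let savings = 0;
  let cacheRead = 0;
  let input = 0;
  const skipped = new Set<string>();
  for (const u of usage) {
    cacheRead += u.cacheReadTokens;
    input += u.totalInputTokens;
    const p = pricing.get(u.modelId);
    if (p === undefined) {
      // No pricing row: the model is left out of the savings sum and counted as skipped.
      skipped.add(u.modelId);
      continue;
    }
    savings += u.cacheReadTokens * (p.inputPrice - p.cacheReadPrice);
  }
  return {
    estimatedSavingsUsd: savings,
    hitRate: input > 0 ? cacheRead / input : 0, // avoid dividing by zero on empty windows
    totalCacheReadTokens: cacheRead,
    totalInputTokens: input,
    skippedModelsCount: skipped.size,
  };
}
```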

Cache policy

Redis key: ai:org-usage:cache-savings:{orgId}:{from}:{to}:{source}. TTL: 300 s.

Errors

Status	When
400	Invalid query parameters.
401	Missing or invalid JWT.
403	Caller does not have `read` on `ai.agent`.

Example

bash

curl -X GET \
  "https://api.daramex.app/ai/org-usage/cache-savings?from=2026-04-01T00:00:00Z&to=2026-05-01T00:00:00Z&source=system" \
  -H "Authorization: Bearer <jwt>" \
  -H "x-org-id: 01927f3e-0000-7000-8000-000000000002"
