
AI Usage Dashboard

The AI Usage Dashboard gives platform administrators a unified view of their organization's AI token consumption, costs, and efficiency metrics across all agents, models, and action types.

Overview

The dashboard aggregates data from token_usage_logs and surfaces it through seven sections:

  1. Global filters — scope all sections to a time window, token source, and granularity.
  2. KPI cards — four top-level summary numbers (cost, tokens, cache hit rate, projection).
  3. Trend chart — dual-axis USD + tokens time series with projection overlay.
  4. Action breakdown — stacked area + donut split by the five action types.
  5. Models — top-10 bar chart (stacked by token type) + cost-vs-latency scatter.
  6. Agents table — per-agent ranking by cost with sparklines and CSV export.
  7. Advanced section (collapsible) — heatmap, system vs BYOK cost split, cache savings.

Accessing the Dashboard

  1. Open the Panel at your organization's subdomain.
  2. Navigate to IA in the left sidebar.
  3. Select the Consumo (usage) tab.

The tab is visible to any user with the ai.agent read policy. It is scoped to the active organization (x-org-id).
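All dashboard endpoints are scoped the same way. As a sketch, a client request might be built like this; the x-org-id header and endpoint path come from this document, while the base URL, bearer-token auth, and function name are illustrative assumptions:

```typescript
// Builds an org-scoped request descriptor for the usage API.
// x-org-id scopes the request to the active organization.
function buildUsageRequest(baseUrl: string, orgId: string, token: string) {
  return {
    url: `${baseUrl}/ai/org-usage/summary`,
    headers: {
      Authorization: `Bearer ${token}`, // auth scheme assumed for illustration
      "x-org-id": orgId,
    },
  };
}

// Usage with fetch:
//   const { url, headers } = buildUsageRequest(base, orgId, token);
//   const summary = await (await fetch(url, { headers })).json();
```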


Global Filters

The filter bar at the top of the dashboard scopes all charts and cards simultaneously.

Time presets

| Preset | Window |
|---|---|
| 7d | Last 7 days |
| 30d | Last 30 days |
| 90d | Last 90 days |
| YTD | January 1 of the current year to today |
| Custom | Date-range picker (any ISO-8601 range) |

Source filter

| Value | Description |
|---|---|
| All | System tokens + BYOK tokens combined (default). |
| Sistema | Platform tokens only (billed against the plan). |
| BYOK | Tokens from org-supplied provider keys. |

Granularity

When set to Auto (default), the backend derives the bucket width from the window length:

| Window | Auto granularity |
|---|---|
| ≤ 7 days | Hour |
| > 7 and ≤ 90 days | Day |
| > 90 and ≤ 365 days | Week |
| > 365 days | Month |

The granularity selector allows manual override to any of hour, day, week, month.
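The Auto rule above can be sketched as a simple threshold function; the function name is illustrative, not the actual backend implementation:

```typescript
type Granularity = "hour" | "day" | "week" | "month";

// Derives the bucket width from the window length in days,
// following the Auto-granularity table above.
function autoGranularity(windowDays: number): Granularity {
  if (windowDays <= 7) return "hour";
  if (windowDays <= 90) return "day";
  if (windowDays <= 365) return "week";
  return "month";
}
```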


KPI Cards

Four summary cards appear at the top of the dashboard.

Total USD

Cumulative cost in US dollars for all AI activity in the selected window. Sourced from GET /ai/org-usage/summary.

Total Tokens

Total token count (input + output + cache-read) consumed in the window, formatted with compact suffixes (e.g., 1.4M, 820K). Sourced from GET /ai/org-usage/summary.
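The compact formatting could be implemented along these lines; this hand-rolled sketch makes the rounding explicit (the actual dashboard may instead rely on Intl.NumberFormat's compact notation):

```typescript
// Formats a token count with compact suffixes, e.g. 1.4M, 820K.
function formatTokens(n: number): string {
  if (n >= 1_000_000) {
    // One decimal for millions, dropping a trailing ".0" (2.0M → 2M).
    return `${(n / 1_000_000).toFixed(1).replace(/\.0$/, "")}M`;
  }
  if (n >= 1_000) return `${Math.round(n / 1_000)}K`;
  return String(n);
}
```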

Cache Hit Rate

Percentage of input tokens served from the prompt cache, plus an estimated savings figure in USD.

  • Hit rate = totalCacheReadTokens / totalInputTokens (as a percentage).
  • Estimated savings = SUM(cacheReadTokens × (input_price − cache_read_price)).

Sourced from GET /ai/org-usage/cache-savings. The card shows "N/A" when no cache data is available. If some models lack pricing data, the savings figure is partial (skippedModelsCount > 0).
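The two formulas, including the partial-savings behavior when pricing is missing, can be sketched as follows. Field names are illustrative assumptions; the cache-savings endpoint returns pre-aggregated totals rather than this per-model shape:

```typescript
interface ModelCacheUsage {
  cacheReadTokens: number;
  inputTokens: number;
  inputPricePerToken?: number;     // USD per token; may be unknown
  cacheReadPricePerToken?: number; // USD per token; may be unknown
}

function cacheStats(models: ModelCacheUsage[]) {
  const cacheRead = models.reduce((s, m) => s + m.cacheReadTokens, 0);
  const input = models.reduce((s, m) => s + m.inputTokens, 0);
  let savings = 0;
  let skippedModelsCount = 0;
  for (const m of models) {
    if (m.inputPricePerToken == null || m.cacheReadPricePerToken == null) {
      skippedModelsCount++; // no pricing data: savings figure is partial
      continue;
    }
    savings += m.cacheReadTokens * (m.inputPricePerToken - m.cacheReadPricePerToken);
  }
  const hitRate = input > 0 ? (cacheRead / input) * 100 : null; // null renders as "N/A"
  return { hitRate, savings, skippedModelsCount };
}
```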

Projection

Month-end cost forecast using ordinary least-squares regression on daily spend.

  • Displays the projected total alongside the 80% prediction interval bounds.
  • Requires at least 7 daily data points in the selected window. When fewer are available, the card shows "Datos insuficientes" ("insufficient data") instead of a number.
  • Sourced from GET /ai/org-usage/projection.
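A minimal sketch of this forecast, with the ≥ 7-point guard, is below. It shows only the point forecast (the 80% prediction interval is omitted for brevity), and all names are illustrative, not the backend's actual implementation:

```typescript
// Month-end forecast: ordinary least-squares on daily spend, then
// observed spend plus predicted spend for the remaining days.
function projectMonthEnd(dailySpend: number[], daysInMonth: number) {
  if (dailySpend.length < 7) return { status: "insufficient_data" as const };
  const n = dailySpend.length;
  const xs = dailySpend.map((_, i) => i + 1); // day index 1..n
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = dailySpend.reduce((a, b) => a + b, 0) / n;
  let sxy = 0;
  let sxx = 0;
  for (let i = 0; i < n; i++) {
    sxy += (xs[i] - meanX) * (dailySpend[i] - meanY);
    sxx += (xs[i] - meanX) ** 2;
  }
  const slope = sxy / sxx;
  const intercept = meanY - slope * meanX;
  const observed = dailySpend.reduce((a, b) => a + b, 0);
  let projected = observed;
  for (let d = n + 1; d <= daysInMonth; d++) {
    projected += Math.max(0, intercept + slope * d); // clamp negative predictions
  }
  return { status: "ok" as const, projected };
}
```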

Charts

Trend Chart

A dual-axis time series chart:

  • Left axis (USD) — cost rendered as a line.
  • Right axis (tokens) — token volume rendered as a shaded area.
  • Dashed projection overlay — extends the cost line to the end of the month when the projection endpoint reports status: ok.

The time resolution matches the active granularity setting. Sourced from GET /ai/org-usage/time-series.

Action Breakdown

Displays consumption split across the five action types:

| Action type | Description |
|---|---|
| user_interaction | Tokens consumed by direct chat messages. |
| guardrail_check | Sub-agent calls for content safety checks. |
| title_generation | Sub-agent calls for auto-generating chat titles. |
| rag_query_embedding | Embedding model calls for RAG retrieval. |
| document_indexing | Embedding calls during knowledge-base ingestion. |

Rendered as a stacked area chart (time series) and a donut chart (aggregate breakdown). Sourced from GET /ai/org-usage/time-series?groupBy=action.

Models

Two complementary views of model consumption:

Bar chart — top 10 models ranked by cost, stacked bars for input vs output token types.

Cost vs Latency scatter — one bubble per model:

  • X axis: average latency (ms).
  • Y axis: total cost (USD).
  • Bubble size: execution count.

Sourced from GET /ai/org-usage/model-efficiency.


Agents Table

A ranked table of agents sorted by total cost in the selected window. Columns:

| Column | Description |
|---|---|
| Agent | Agent name and avatar. |
| Costo | Total cost (USD). |
| Tokens | Total tokens. |
| Ejecuciones | Number of executions. |
| Sparkline | Mini trend line for the window. |

Sortable by any column. Includes a CSV export button that downloads the current view.

Sourced from GET /ai/org-usage/by-agent.
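The CSV export could be sketched as below. The column headers mirror the table above; the row shape and function name are illustrative assumptions, and quoting follows the common CSV convention for commas and quotes in agent names:

```typescript
interface AgentRow {
  agent: string;
  cost: number;
  tokens: number;
  executions: number;
}

// Serializes the current table view to CSV, quoting fields that
// contain commas, quotes, or newlines.
function agentsToCsv(rows: AgentRow[]): string {
  const esc = (v: string | number) => {
    const s = String(v);
    return /[",\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
  };
  const header = ["Agent", "Costo", "Tokens", "Ejecuciones"].join(",");
  const lines = rows.map((r) =>
    [esc(r.agent), r.cost.toFixed(2), r.tokens, r.executions].join(","),
  );
  return [header, ...lines].join("\n");
}
```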


Advanced Section (collapsible)

The advanced section is collapsed by default and contains three subsections.

Heatmap

A 7 × 24 heatmap showing usage intensity by day of week and hour of day.

  • Columns: days of the week (Sunday = 0 … Saturday = 6).
  • Rows: hours of the day (0 – 23) in the browser's local timezone.
  • Color scale: logarithmic, for better visual distribution when usage is concentrated in a few cells.
  • Cells with zero usage render as empty: the backend omits them from the response, and the client backfills them as zero.

The browser sends Intl.DateTimeFormat().resolvedOptions().timeZone as the timezone query parameter. If the parameter is absent, the backend defaults to UTC.

Sourced from GET /ai/org-usage/heatmap.
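A sketch of how the client might assemble that request; the endpoint path and timezone parameter name come from this document, while the base URL and function name are illustrative:

```typescript
// Builds the heatmap URL with the browser's IANA timezone
// (e.g. "Europe/Madrid"); the backend falls back to UTC if absent.
function heatmapUrl(baseUrl: string): string {
  const tz = Intl.DateTimeFormat().resolvedOptions().timeZone;
  const params = new URLSearchParams({ timezone: tz });
  return `${baseUrl}/ai/org-usage/heatmap?${params}`;
}
```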

System vs BYOK

A cost breakdown comparing platform (system) tokens against org-supplied key (BYOK) tokens.

  • Useful for organizations that use both channels and want to understand cost distribution.
  • BYOK costs may be 0 or inaccurate when the org's provider does not expose per-request pricing through the Vercel AI Gateway (see Limitations).

Cache Savings

A static summary of prompt-cache savings for the selected window:

| Metric | Description |
|---|---|
| Tokens served from cache | totalCacheReadTokens |
| Hit rate | totalCacheReadTokens / totalInputTokens (%) |
| Estimated savings | USD saved by serving from cache vs re-processing input |

Sourced from GET /ai/org-usage/cache-savings (default source: system).

A timeline view of cache savings over time is planned for a future iteration.


Limitations / Known Issues

  • BYOK cost accuracy: BYOK costs may be 0 or lower than actual when the user's provider does not report per-request pricing back through the Vercel AI Gateway. This is a provider-level limitation, not a platform bug.
  • Projection minimum data: the projection card requires at least 7 daily data points. A window shorter than 7 days, or a window with sparse data, will display "Datos insuficientes".
  • KPI cards and multi-month ranges: when the selected range spans multiple months, some KPI cards (specifically the summary endpoint cards) reflect only the starting month. This is a known limitation tracked as a future improvement.
  • Heatmap color scale: the logarithmic scale can make very low-usage cells look more prominent than expected. This is intentional — it improves legibility when usage is highly concentrated.
  • Cache savings for BYOK: because BYOK calls go directly to the provider, the platform has no visibility into the provider's cache metrics. The cache-savings endpoint defaults to source=system to avoid misleading zeros in the BYOK column.

Architecture Decisions

The key decisions made during this feature's design are documented in: