
AI Usage Dashboard

The AI Usage Dashboard gives platform administrators a unified view of their organization's AI token consumption, costs, and efficiency metrics across all agents, models, and action types.

Overview

The dashboard aggregates data from token_usage_logs and surfaces it through seven sections:

  1. Global filters — scope all sections to a time window, token source, and granularity.
  2. KPI cards — four top-level summary numbers (cost, tokens, cache hit rate, projection).
  3. Trend chart — dual-axis USD + tokens time series with projection overlay.
  4. Action breakdown — stacked area + donut split by the five action types.
  5. Models — top-10 bar chart (stacked by token type) + cost-vs-latency scatter.
  6. Agents table — per-agent ranking by cost with sparklines and CSV export.
  7. Advanced section (collapsible) — heatmap, system vs BYOK cost split, cache savings.

Accessing the Dashboard

  1. Open the Panel at your organization's subdomain.
  2. Navigate to IA in the left sidebar.
  3. Select the Consumo (usage) tab.

The tab is visible to any user with the ai.agent read policy. It is scoped to the active organization (x-org-id).
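All dashboard endpoints are scoped the same way. As a sketch, a client request might be built like this; the x-org-id header and endpoint path come from this document, while the base URL, bearer-token auth, and function name are illustrative assumptions:

```typescript
// Builds an org-scoped request descriptor for the usage API.
// x-org-id scopes the request to the active organization.
function buildUsageRequest(baseUrl: string, orgId: string, token: string) {
  return {
    url: `${baseUrl}/ai/org-usage/summary`,
    headers: {
      Authorization: `Bearer ${token}`, // auth scheme assumed for illustration
      "x-org-id": orgId,
    },
  };
}

// Usage with fetch:
//   const { url, headers } = buildUsageRequest(base, orgId, token);
//   const summary = await (await fetch(url, { headers })).json();
```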


Global Filters

The filter bar at the top of the dashboard scopes all charts and cards simultaneously.

Time presets

| Preset | Window |
|---|---|
| 7d | Last 7 days |
| 30d | Last 30 days |
| 90d | Last 90 days |
| YTD | January 1 of the current year to today |
| Custom | Date-range picker (any ISO-8601 range) |

Source filter

| Value | Description |
|---|---|
| All | System tokens + BYOK tokens combined (default). |
| Sistema | Platform tokens only (billed against the plan). |
| BYOK | Tokens from org-supplied provider keys. |

Granularity

When set to Auto (default), the backend derives the bucket width from the window length:

| Window | Auto granularity |
|---|---|
| ≤ 7 days | Hour |
| > 7 and ≤ 90 days | Day |
| > 90 and ≤ 365 days | Week |
| > 365 days | Month |

The granularity selector allows manual override to any of hour, day, week, month.
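The Auto rule above can be sketched as a simple threshold function; the function name is illustrative, not the actual backend implementation:

```typescript
type Granularity = "hour" | "day" | "week" | "month";

// Derives the bucket width from the window length in days,
// following the Auto-granularity table above.
function autoGranularity(windowDays: number): Granularity {
  if (windowDays <= 7) return "hour";
  if (windowDays <= 90) return "day";
  if (windowDays <= 365) return "week";
  return "month";
}
```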


KPI Cards

Four summary cards appear at the top of the dashboard.

Total USD

Cumulative cost in US dollars for all AI activity in the selected window. Sourced from GET /ai/org-usage/summary.

Total Tokens

Total token count (input + output + cache-read) consumed in the window, formatted with compact suffixes (e.g., 1.4M, 820K). Sourced from GET /ai/org-usage/summary.
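The compact formatting could be implemented along these lines; this hand-rolled sketch makes the rounding explicit (the actual dashboard may instead rely on Intl.NumberFormat's compact notation):

```typescript
// Formats a token count with compact suffixes, e.g. 1.4M, 820K.
function formatTokens(n: number): string {
  if (n >= 1_000_000) {
    // One decimal for millions, dropping a trailing ".0" (2.0M → 2M).
    return `${(n / 1_000_000).toFixed(1).replace(/\.0$/, "")}M`;
  }
  if (n >= 1_000) return `${Math.round(n / 1_000)}K`;
  return String(n);
}
```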

Cache Hit Rate

Percentage of input tokens served from the prompt cache, plus an estimated savings figure in USD.

  • Hit rate = totalCacheReadTokens / totalInputTokens (as a percentage).
  • Estimated savings = SUM(cacheReadTokens × (input_price − cache_read_price)).

Sourced from GET /ai/org-usage/cache-savings. The card shows "N/A" when no cache data is available. If some models lack pricing data, the savings figure is partial (skippedModelsCount > 0).
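The two formulas, including the partial-savings behavior when pricing is missing, can be sketched as follows. Field names are illustrative assumptions; the cache-savings endpoint returns pre-aggregated totals rather than this per-model shape:

```typescript
interface ModelCacheUsage {
  cacheReadTokens: number;
  inputTokens: number;
  inputPricePerToken?: number;     // USD per token; may be unknown
  cacheReadPricePerToken?: number; // USD per token; may be unknown
}

function cacheStats(models: ModelCacheUsage[]) {
  const cacheRead = models.reduce((s, m) => s + m.cacheReadTokens, 0);
  const input = models.reduce((s, m) => s + m.inputTokens, 0);
  let savings = 0;
  let skippedModelsCount = 0;
  for (const m of models) {
    if (m.inputPricePerToken == null || m.cacheReadPricePerToken == null) {
      skippedModelsCount++; // no pricing data: savings figure is partial
      continue;
    }
    savings += m.cacheReadTokens * (m.inputPricePerToken - m.cacheReadPricePerToken);
  }
  const hitRate = input > 0 ? (cacheRead / input) * 100 : null; // null renders as "N/A"
  return { hitRate, savings, skippedModelsCount };
}
```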

Projection

Month-end cost forecast using ordinary least-squares regression on daily spend.

  • Displays the projected total alongside the 80% prediction interval bounds.
  • Requires at least 7 daily data points in the selected window. When fewer are available, the card shows "Datos insuficientes" ("insufficient data") instead of a number.
  • Sourced from GET /ai/org-usage/projection.
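A minimal sketch of this forecast, with the ≥ 7-point guard, is below. It shows only the point forecast (the 80% prediction interval is omitted for brevity), and all names are illustrative, not the backend's actual implementation:

```typescript
// Month-end forecast: ordinary least-squares on daily spend, then
// observed spend plus predicted spend for the remaining days.
function projectMonthEnd(dailySpend: number[], daysInMonth: number) {
  if (dailySpend.length < 7) return { status: "insufficient_data" as const };
  const n = dailySpend.length;
  const xs = dailySpend.map((_, i) => i + 1); // day index 1..n
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = dailySpend.reduce((a, b) => a + b, 0) / n;
  let sxy = 0;
  let sxx = 0;
  for (let i = 0; i < n; i++) {
    sxy += (xs[i] - meanX) * (dailySpend[i] - meanY);
    sxx += (xs[i] - meanX) ** 2;
  }
  const slope = sxy / sxx;
  const intercept = meanY - slope * meanX;
  const observed = dailySpend.reduce((a, b) => a + b, 0);
  let projected = observed;
  for (let d = n + 1; d <= daysInMonth; d++) {
    projected += Math.max(0, intercept + slope * d); // clamp negative predictions
  }
  return { status: "ok" as const, projected };
}
```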

Charts

Trend Chart

A dual-axis time series chart:

  • Left axis (USD) — cost rendered as a line.
  • Right axis (tokens) — token volume rendered as a shaded area.
  • Dashed projection overlay — extends the cost line to the end of the month when the projection endpoint reports status: ok.

The time resolution matches the active granularity setting. Sourced from GET /ai/org-usage/time-series.

Action Breakdown

Displays consumption split across the five action types:

| Action type | Description |
|---|---|
| user_interaction | Tokens consumed by direct chat messages. |
| guardrail_check | Sub-agent calls for content safety checks. |
| title_generation | Sub-agent calls for auto-generating chat titles. |
| rag_query_embedding | Embedding model calls for RAG retrieval. |
| document_indexing | Embedding calls during knowledge-base ingestion. |

Rendered as a stacked area chart (time series) and a donut chart (aggregate breakdown). Sourced from GET /ai/org-usage/time-series?groupBy=action.

Models

Two complementary views of model consumption:

Bar chart — top 10 models ranked by cost, stacked bars for input vs output token types.

Cost vs Latency scatter — one bubble per model:

  • X axis: average latency (ms).
  • Y axis: total cost (USD).
  • Bubble size: execution count.

Sourced from GET /ai/org-usage/model-efficiency.


Agents Table

A ranked table of agents sorted by total cost in the selected window. Columns:

| Column | Description |
|---|---|
| Agent | Agent name and avatar. |
| Costo | Total cost (USD). |
| Tokens | Total tokens. |
| Ejecuciones | Number of executions. |
| Sparkline | Mini trend line for the window. |

Sortable by any column. Includes a CSV export button that downloads the current view.

Sourced from GET /ai/org-usage/by-agent.
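The CSV export could be sketched as below. The column headers mirror the table above; the row shape and function name are illustrative assumptions, and quoting follows the common CSV convention for commas and quotes in agent names:

```typescript
interface AgentRow {
  agent: string;
  cost: number;
  tokens: number;
  executions: number;
}

// Serializes the current table view to CSV, quoting fields that
// contain commas, quotes, or newlines.
function agentsToCsv(rows: AgentRow[]): string {
  const esc = (v: string | number) => {
    const s = String(v);
    return /[",\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
  };
  const header = ["Agent", "Costo", "Tokens", "Ejecuciones"].join(",");
  const lines = rows.map((r) =>
    [esc(r.agent), r.cost.toFixed(2), r.tokens, r.executions].join(","),
  );
  return [header, ...lines].join("\n");
}
```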


Advanced Section (collapsible)

The advanced section is collapsed by default and contains three subsections.

Heatmap

A 7 × 24 heatmap showing usage intensity by day of week and hour of day.

  • Columns: days of the week (Sunday = 0 … Saturday = 6).
  • Rows: hours of the day (0 – 23) in the browser's local timezone.
  • Color scale: logarithmic, for better visual distribution when usage is concentrated in a few cells.
  • Cells with zero usage render as empty: the backend omits them from the response, and the client backfills them as zero.

The browser sends Intl.DateTimeFormat().resolvedOptions().timeZone as the timezone query parameter. If the parameter is absent, the backend defaults to UTC.

Sourced from GET /ai/org-usage/heatmap.
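A sketch of how the client might assemble that request; the endpoint path and timezone parameter name come from this document, while the base URL and function name are illustrative:

```typescript
// Builds the heatmap URL with the browser's IANA timezone
// (e.g. "Europe/Madrid"); the backend falls back to UTC if absent.
function heatmapUrl(baseUrl: string): string {
  const tz = Intl.DateTimeFormat().resolvedOptions().timeZone;
  const params = new URLSearchParams({ timezone: tz });
  return `${baseUrl}/ai/org-usage/heatmap?${params}`;
}
```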

System vs BYOK

A cost breakdown comparing platform (system) tokens against org-supplied key (BYOK) tokens.

  • Useful for organizations that use both channels and want to understand cost distribution.
  • BYOK costs may be 0 or inaccurate when the org's provider does not expose per-request pricing through the Vercel AI Gateway (see Limitations).

Cache Savings

A static summary of prompt-cache savings for the selected window:

| Metric | Description |
|---|---|
| Tokens served from cache | totalCacheReadTokens |
| Hit rate | totalCacheReadTokens / totalInputTokens (%) |
| Estimated savings | USD saved by serving from cache vs re-processing input |

Sourced from GET /ai/org-usage/cache-savings (default source: system).

A timeline view of cache savings over time is planned for a future iteration.


Limitations / Known Issues

  • BYOK cost accuracy: BYOK costs may be 0 or lower than actual when the user's provider does not report per-request pricing back through the Vercel AI Gateway. This is a provider-level limitation, not a platform bug.
  • Projection minimum data: the projection card requires at least 7 daily data points. A window shorter than 7 days, or a window with sparse data, will display "Datos insuficientes".
  • KPI cards and multi-month ranges: when the selected range spans multiple months, some KPI cards (specifically the summary endpoint cards) reflect only the starting month. This is a known limitation tracked as a future improvement.
  • Heatmap color scale: the logarithmic scale can make very low-usage cells look more prominent than expected. This is intentional — it improves legibility when usage is highly concentrated.
  • Cache savings for BYOK: because BYOK calls go directly to the provider, the platform has no visibility into the provider's cache metrics. The cache-savings endpoint defaults to source=system to avoid misleading zeros in the BYOK column.

Architecture Decisions

The key decisions made during this feature's design are documented in: