010 - AI Usage Dashboard Analytics Endpoints
Status
Accepted (v1)
Date
2026-04-23
Context
The AI Usage Dashboard required exposing aggregated token consumption data to the panel frontend. Five new analytical read endpoints were added under /ai/org-usage. Seven design questions arose during implementation.
Decision 1 — 5 separate endpoints vs a single mega endpoint
Decision: implement five focused endpoints (time-series, projection, model-efficiency, heatmap, cache-savings) instead of one endpoint with a type discriminator.
Rationale:
- Each endpoint has a distinct query parameter set, cache key, and response shape. A single endpoint would require a complex discriminated union in both the Zod schema and the TypeScript types.
- The panel can load sections independently and in parallel, improving perceived performance.
- Individual endpoints are easier to cache at different granularities if TTLs need to diverge in the future.
- Follows the existing pattern in org-usage.controller.ts (summary, daily, by-agent).
Alternative considered: a single GET /ai/org-usage/analytics?type=... endpoint. Rejected because it creates a hidden discriminated union that is harder to validate, document, and cache per-type.
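To make the rejected alternative concrete, a single `?type=...` endpoint would force a discriminated union onto every validator and client. The sketch below is illustrative only; the field names are hypothetical, not the real response schemas.

```typescript
// Illustrative: the response union a single type-discriminated endpoint
// would require. All field names here are hypothetical.
type AnalyticsResponse =
  | { type: "time-series"; points: { date: string; tokens: number }[] }
  | { type: "projection"; projectedCost: number; lower: number; upper: number }
  | { type: "model-efficiency"; models: { modelId: string; costPerTask: number }[] }
  | { type: "heatmap"; cells: { dow: number; hour: number; tokens: number }[] }
  | { type: "cache-savings"; savedTokens: number; savedCost: number };

// Every consumer has to narrow on `type`; five focused endpoints avoid this.
function describe(r: AnalyticsResponse): string {
  switch (r.type) {
    case "projection":
      return `projected ${r.projectedCost}`;
    default:
      return r.type;
  }
}
```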
Decision 2 — JS projection math (least squares + prediction interval in service layer)
Decision: implement the cost projection using ordinary least-squares regression with an 80% prediction interval entirely in the application service (projection-calculator.ts in TypeScript), not in the database.
Rationale:
- The dataset (daily aggregates for up to 365 days, scoped to one org) is small enough that in-process computation is orders of magnitude faster than a PostgreSQL regr_slope query with a round-trip.
- SQL regression functions are database-specific; moving the math to JS keeps the repository layer generic (PostgreSQL-agnostic aggregation).
- The 7-point minimum guard (daysAvailable < 7 → insufficient_data) is cleaner to express in application code than in SQL.
Minimum data guard: fewer than 7 daily data points returns { status: "insufficient_data", minDaysRequired: 7, daysAvailable: N }.
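A minimal sketch of the math described above: ordinary least squares plus an 80% prediction interval and the 7-point guard. The names (projectCost, DailyPoint) are illustrative, not the real projection-calculator.ts API, and the z ≈ 1.2816 constant is an assumed normal approximation of the t quantile.

```typescript
interface DailyPoint { day: number; cost: number } // day = index since window start

type Projection =
  | { status: "insufficient_data"; minDaysRequired: number; daysAvailable: number }
  | { status: "ok"; slope: number; intercept: number; projected: number; lower: number; upper: number };

function projectCost(points: DailyPoint[], targetDay: number): Projection {
  const n = points.length;
  // Minimum data guard, mirroring the sentinel response in the decision text.
  if (n < 7) return { status: "insufficient_data", minDaysRequired: 7, daysAvailable: n };

  // Ordinary least squares: slope = Σ(x−x̄)(y−ȳ) / Σ(x−x̄)²
  const xBar = points.reduce((s, p) => s + p.day, 0) / n;
  const yBar = points.reduce((s, p) => s + p.cost, 0) / n;
  let sxx = 0, sxy = 0;
  for (const p of points) {
    sxx += (p.day - xBar) ** 2;
    sxy += (p.day - xBar) * (p.cost - yBar);
  }
  const slope = sxy / sxx;
  const intercept = yBar - slope * xBar;

  // Residual standard error with n − 2 degrees of freedom.
  let sse = 0;
  for (const p of points) sse += (p.cost - (intercept + slope * p.day)) ** 2;
  const se = Math.sqrt(sse / (n - 2));

  // 80% prediction interval at targetDay; z ≈ 1.2816 is a normal
  // approximation (the real code may use a proper t quantile).
  const z = 1.2816;
  const half = z * se * Math.sqrt(1 + 1 / n + ((targetDay - xBar) ** 2) / sxx);
  const projected = intercept + slope * targetDay;
  return { status: "ok", slope, intercept, projected, lower: projected - half, upper: projected + half };
}
```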
Decision 3 — Redis TTL strategy (300 s baseline for all endpoints)
Decision: all five endpoints use a 300-second Redis TTL.
Rationale:
- Usage data is not real-time; a 5-minute cache window is acceptable for an analytics dashboard.
- The heatmap data is heavier to compute (SQL EXTRACT(dow) grouping), but its access pattern is similar to the other endpoints: an operator opening the advanced section once per session. A uniform TTL simplifies operations and monitoring.
- Cache keys include orgId, from, to, source, and endpoint-specific parameters (e.g., timezone, granularity, limit) so different filter combinations do not collide.
Note: the original proposal suggested a longer TTL for the heatmap (600 s or 1800 s). After implementation, a uniform 300 s was chosen for simplicity. This can be raised per-endpoint without a schema change.
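The cache-key scheme above can be sketched as follows. The key prefix and builder name are illustrative, not the real format in the codebase; sorting the parameters makes equivalent filter combinations map to the same key.

```typescript
// Illustrative cache-key builder for the scheme described above.
// The "ai:org-usage:" prefix is a placeholder, not the real key format.
function usageCacheKey(
  endpoint: string,
  orgId: string,
  params: Record<string, string | number | undefined>,
): string {
  const parts = Object.entries(params)
    .filter(([, v]) => v !== undefined)      // omit unset filters
    .sort(([a], [b]) => a.localeCompare(b))  // stable order → stable key
    .map(([k, v]) => `${k}=${v}`);
  return `ai:org-usage:${endpoint}:${orgId}:${parts.join("&")}`;
}

// The resulting key would be stored with the uniform 300 s TTL,
// e.g. SET <key> <json> EX 300 in Redis.
```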
Decision 4 — Reuse ai.agent CASL subject for authorization
Decision: all five endpoints use ability.can('read', 'ai.agent') — no new CASL subject is introduced.
Rationale:
- Usage analytics is a read-only view of AI agent activity. Any user who can read agents should be able to read aggregate usage metrics.
- Introducing a new subject (e.g., ai.usage) would require new policy seeds and RBAC UI changes with no meaningful security benefit at v1.
- The ai.agent subject is already org-scoped via @OrgId(), which enforces that the caller can only see their own organization's data.
Decision 5 — JS-interpolated heatmap palette (no new design tokens)
Decision: the heatmap uses a logarithmic color scale interpolated in JavaScript from the existing neutral and brand palette, rather than introducing new design tokens for heat colors.
Rationale:
- Adding heatmap-specific tokens to the design system requires cross-team coordination and a design review cycle.
- The existing 5-stop palette (white → brand) is sufficient for a log-scale single-hue heatmap at v1.
- Easier to iterate: the interpolation function can be updated without a design token PR.
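A minimal sketch of the log-scale interpolation described above. The brand RGB value is a placeholder, not a real design token; the real function may interpolate across all five palette stops rather than two endpoints.

```typescript
// Placeholder palette endpoints: white → brand (brand RGB is assumed).
const LIGHT = [255, 255, 255];
const BRAND = [79, 70, 229];

// Map a cell value onto [0, 1] logarithmically, then mix channels linearly.
// log1p keeps zero-valued cells at the white end of the scale.
function heatColor(value: number, max: number): string {
  const t = max > 0 && value > 0 ? Math.log1p(value) / Math.log1p(max) : 0;
  const ch = LIGHT.map((l, i) => Math.round(l + (BRAND[i] - l) * t));
  return `rgb(${ch[0]}, ${ch[1]}, ${ch[2]})`;
}
```

Because the scale lives in one function, tuning it (different curve, more stops) is a code change only, which is the iteration benefit the rationale points to.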
Decision 6 — No React.lazy for the Consumo tab (deferred)
Decision: the AI Usage Dashboard tab is not lazy-loaded — it uses the same eager import pattern as all other tabs in the AI panel.
Rationale:
- Other panel tabs are not lazy-loaded. Introducing React.lazy for a single tab would create an inconsistent pattern without a clear performance win, as the panel bundle is already code-split at the route level.
- Deferred for a future iteration when lazy-loading is adopted consistently across the panel.
Decision 7 — Composite indexes declared in TypeORM entity (user runs migration)
Decision: composite indexes needed for the analytics queries (agency_id + created_at, agency_id + model_id + action_type) are declared on the TokenUsageLog TypeORM entity using @Index decorators. The migration is generated and applied by the operator, not automated by the feature.
Rationale:
- The project convention (see CLAUDE.md) is that migrations are run manually by the developer/operator. The feature code declares the intent; the operator executes it.
- Declaring indexes in the entity keeps schema intent co-located with the domain model and visible in code review.
- The analytics queries are read-heavy and would degrade significantly at scale without these indexes.
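The declarations described above would look roughly like this on the entity. This is a hedged sketch: the column decorators and property names are assumptions inferred from the ADR, not the real TokenUsageLog source.

```ts
import { Entity, Index, PrimaryGeneratedColumn, Column } from "typeorm";

// Composite indexes for the analytics read paths (names assumed):
// agency_id + created_at, and agency_id + model_id + action_type.
@Entity("token_usage_logs")
@Index(["agencyId", "createdAt"])
@Index(["agencyId", "modelId", "actionType"])
export class TokenUsageLog {
  @PrimaryGeneratedColumn("uuid")
  id: string;

  @Column({ name: "agency_id" })
  agencyId: string;

  @Column({ name: "model_id" })
  modelId: string;

  @Column({ name: "action_type" })
  actionType: string;

  @Column({ name: "created_at" })
  createdAt: Date;
}
```

Per the convention above, the operator would then generate the matching migration with TypeORM's CLI (e.g. `typeorm migration:generate`) and run it manually.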
Consequences
- Five new endpoints are added to OrgUsageController under /ai/org-usage.
- All endpoints are cached in Redis with a 300-second TTL per unique parameter combination.
- The projection card in the frontend shows a sentinel state when fewer than 7 daily points are available.
- BYOK cost accuracy is limited by provider visibility (documented in the feature guide).
- No new CASL subjects or design tokens are introduced in v1.
- Composite indexes must be applied to token_usage_logs before production load to avoid slow queries.