
010 - AI Usage Dashboard Analytics Endpoints

Status

Accepted (v1)

Date

2026-04-23

Context

The AI Usage Dashboard required exposing aggregated token consumption data to the panel frontend. Five new analytical read endpoints were added under /ai/org-usage. Seven design questions arose during implementation.


Decision 1 — 5 separate endpoints vs a single mega endpoint

Decision: implement five focused endpoints (time-series, projection, model-efficiency, heatmap, cache-savings) instead of one endpoint with a type discriminator.

Rationale:

  • Each endpoint has a distinct query parameter set, cache key, and response shape. A single endpoint would require a complex discriminated union in both the Zod schema and the TypeScript types.
  • The panel can load sections independently and in parallel, improving perceived performance.
  • Individual endpoints are easier to cache at different granularities if TTLs need to diverge in the future.
  • Follows the existing pattern in org-usage.controller.ts (summary, daily, by-agent).

Alternative considered: a single GET /ai/org-usage/analytics?type=... endpoint. Rejected because it creates a hidden discriminated union that is harder to validate, document, and cache per-type.
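As a sketch, the five focused routes could sit side by side on the existing controller. This is illustrative only — the handler names, DTOs, and guards of the real OrgUsageController are not shown here:

```typescript
// Illustrative sketch; handler names are assumptions, not the real code.
import { Controller, Get } from "@nestjs/common";

@Controller("ai/org-usage")
export class OrgUsageController {
  // Each route gets its own Zod-validated query DTO, cache key, and
  // response shape, so no discriminated union is needed anywhere.
  @Get("time-series")
  timeSeries() { /* ... */ }

  @Get("projection")
  projection() { /* ... */ }

  @Get("model-efficiency")
  modelEfficiency() { /* ... */ }

  @Get("heatmap")
  heatmap() { /* ... */ }

  @Get("cache-savings")
  cacheSavings() { /* ... */ }
}
```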


Decision 2 — JS projection math (least squares + prediction interval in service layer)

Decision: implement the cost projection using ordinary least-squares regression with an 80% prediction interval entirely in the application service (projection-calculator.ts in TypeScript), not in the database.

Rationale:

  • The dataset (daily aggregates for up to 365 days, scoped to one org) is small enough that in-process computation is orders of magnitude faster than a PostgreSQL regr_slope query with a round-trip.
  • SQL regression functions are database-specific; moving the math to JS keeps the repository layer generic (PostgreSQL-agnostic aggregation).
  • The 7-point minimum guard (daysAvailable < 7 → insufficient_data) is cleaner to express in application code than in SQL.

Minimum data guard: fewer than 7 daily data points returns { status: "insufficient_data", minDaysRequired: 7, daysAvailable: N }.
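The math above can be sketched in plain TypeScript. This is a minimal, dependency-free approximation of what projection-calculator.ts does — the real module may differ in shape, and it uses a normal-approximation z value here rather than the exact Student-t quantile:

```typescript
// Sketch of OLS projection with an ~80% prediction interval and the
// 7-point minimum guard. Shapes are assumptions, not the real module.

type ProjectionResult =
  | { status: "insufficient_data"; minDaysRequired: number; daysAvailable: number }
  | { status: "ok"; projectedCost: number; lower: number; upper: number };

const MIN_DAYS = 7;
// Two-sided 80% interval -> z ~ 1.2816 (normal approximation, not Student-t).
const Z_80 = 1.2816;

function projectCost(dailyCosts: number[], daysAhead: number): ProjectionResult {
  const n = dailyCosts.length;
  if (n < MIN_DAYS) {
    return { status: "insufficient_data", minDaysRequired: MIN_DAYS, daysAvailable: n };
  }
  // Ordinary least squares: x = day index (0..n-1), y = daily cost.
  const xMean = (n - 1) / 2;
  const yMean = dailyCosts.reduce((a, b) => a + b, 0) / n;
  let sxx = 0;
  let sxy = 0;
  dailyCosts.forEach((y, x) => {
    sxx += (x - xMean) ** 2;
    sxy += (x - xMean) * (y - yMean);
  });
  const slope = sxy / sxx;
  const intercept = yMean - slope * xMean;

  // Residual standard error of the fit.
  const sse = dailyCosts.reduce(
    (acc, y, x) => acc + (y - (intercept + slope * x)) ** 2, 0);
  const se = Math.sqrt(sse / (n - 2));

  // Prediction interval at the future point x0.
  const x0 = n - 1 + daysAhead;
  const margin = Z_80 * se * Math.sqrt(1 + 1 / n + (x0 - xMean) ** 2 / sxx);
  const projected = intercept + slope * x0;
  return { status: "ok", projectedCost: projected, lower: projected - margin, upper: projected + margin };
}
```

Because the dataset is at most 365 points per org, this runs in microseconds in-process, which is the core of the Decision 2 rationale.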


Decision 3 — Redis TTL strategy (300 s baseline for all endpoints)

Decision: all five endpoints use a 300-second Redis TTL.

Rationale:

  • Usage data is not real-time; a 5-minute cache window is acceptable for an analytics dashboard.
  • The heatmap data is heavier to compute (SQL EXTRACT(dow) grouping), but its access pattern is similar to the other endpoints — an operator opening the advanced section once per session. A uniform TTL simplifies operations and monitoring.
  • Cache keys include orgId, from, to, source, and endpoint-specific parameters (e.g., timezone, granularity, limit) so different filter combinations do not collide.

Note: the original proposal suggested a longer TTL for the heatmap (600 s or 1800 s). After implementation, a uniform 300 s was chosen for simplicity. This can be raised per-endpoint without a schema change.
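A possible shape for the cache key scheme is sketched below. The exact key format in the codebase may differ — the point is that every filter parameter participates in the key, and endpoint-specific parameters are sorted so semantically identical requests collide on the same key:

```typescript
// Hypothetical cache key builder; the real key format may differ.

const TTL_SECONDS = 300; // uniform TTL across all five endpoints

interface AnalyticsKeyParts {
  endpoint: string; // e.g. "heatmap", "time-series"
  orgId: string;
  from: string;     // ISO date
  to: string;
  source?: string;
  // Endpoint-specific params: timezone, granularity, limit, ...
  extra?: Record<string, string | number>;
}

function buildCacheKey(p: AnalyticsKeyParts): string {
  // Sort extras so { a, b } and { b, a } produce the same key.
  const extras = Object.entries(p.extra ?? {})
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([k, v]) => `${k}=${v}`)
    .join(":");
  return ["ai-usage", p.endpoint, p.orgId, p.from, p.to, p.source ?? "all", extras].join(":");
}
```

Raising the heatmap TTL later only means passing a different TTL alongside the same key — no key or schema change is needed.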


Decision 4 — Reuse ai.agent CASL subject for authorization

Decision: all five endpoints use ability.can('read', 'ai.agent') — no new CASL subject is introduced.

Rationale:

  • Usage analytics is a read-only view of AI agent activity. Any user who can read agents should be able to read aggregate usage metrics.
  • Introducing a new subject (e.g., ai.usage) would require new policy seeds and RBAC UI changes with no meaningful security benefit at v1.
  • The ai.agent subject is already org-scoped via @OrgId(), which enforces that the caller can only see their own organization's data.
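Conceptually, the shared guard condition reduces to a single CASL check. The `Ability` interface below is a hypothetical stand-in for the project's real CASL ability type, shown only to make the condition concrete:

```typescript
// Hypothetical reduction of the CASL check; not the project's real types.
type Action = "read" | "manage";
type Subject = "ai.agent";

interface Ability {
  can(action: Action, subject: Subject): boolean;
}

// All five analytics handlers share this exact condition.
function canReadUsageAnalytics(ability: Ability): boolean {
  return ability.can("read", "ai.agent");
}
```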

Decision 5 — JS-interpolated heatmap palette (no new design tokens)

Decision: the heatmap uses a logarithmic color scale interpolated in JavaScript from the existing neutral and brand palette, rather than introducing new design tokens for heat colors.

Rationale:

  • Adding heatmap-specific tokens to the design system requires cross-team coordination and a design review cycle.
  • The existing 5-stop palette (white → brand) is sufficient for a log-scale single-hue heatmap at v1.
  • Easier to iterate: the interpolation function can be updated without a design token PR.
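The interpolation can be sketched as follows. The stop colors here are placeholders, not the real design-token values, and the real function may normalize differently:

```typescript
// Sketch of a log-scale, single-hue heatmap color. Palette stops are
// placeholder RGB values, not the actual design tokens.
type RGB = [number, number, number];

const STOPS: RGB[] = [
  [255, 255, 255], // white
  [219, 234, 254],
  [147, 197, 253],
  [59, 130, 246],
  [29, 78, 216],   // "brand" end of the ramp (placeholder)
];

function heatColor(value: number, max: number): RGB {
  if (max <= 0 || value <= 0) return STOPS[0];
  // Log-normalize into [0, 1] so a few hot cells don't wash out the rest.
  const t = Math.min(1, Math.log1p(value) / Math.log1p(max));
  // Linearly interpolate between the two surrounding palette stops.
  const scaled = t * (STOPS.length - 1);
  const i = Math.min(STOPS.length - 2, Math.floor(scaled));
  const f = scaled - i;
  const [r1, g1, b1] = STOPS[i];
  const [r2, g2, b2] = STOPS[i + 1];
  return [
    Math.round(r1 + (r2 - r1) * f),
    Math.round(g1 + (g2 - g1) * f),
    Math.round(b1 + (b2 - b1) * f),
  ];
}
```

Swapping the ramp or the normalization later is a one-file change, which is the "easier to iterate" point above.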

Decision 6 — No React.lazy for the Consumo tab (deferred)
("Consumo" is the tab's UI label — "Consumption".)

Decision: the AI Usage Dashboard tab is not lazy-loaded — it uses the same eager import pattern as all other tabs in the AI panel.

Rationale:

  • Other panel tabs are not lazy-loaded. Introducing React.lazy for a single tab would create an inconsistent pattern without a clear performance win, as the panel bundle is already code-split at the route level.
  • Deferred for a future iteration when lazy-loading is adopted consistently across the panel.

Decision 7 — Composite indexes declared in TypeORM entity (user runs migration)

Decision: composite indexes needed for the analytics queries (agency_id + created_at, agency_id + model_id + action_type) are declared on the TokenUsageLog TypeORM entity using @Index decorators. The migration is generated and applied by the operator, not automated by the feature.

Rationale:

  • The project convention (see CLAUDE.md) is that migrations are run manually by the developer/operator. The feature code declares the intent; the operator executes it.
  • Declaring indexes in the entity keeps schema intent co-located with the domain model and visible in code review.
  • The analytics queries are read-heavy and would degrade significantly at scale without these indexes.
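The declarations look roughly like the fragment below. Column and entity details are assumed for illustration — the real TokenUsageLog entity has more fields — but the `@Index` decorator usage is the standard TypeORM pattern:

```typescript
// Sketch only: the real TokenUsageLog entity has additional columns.
import { Entity, Column, Index, PrimaryGeneratedColumn, CreateDateColumn } from "typeorm";

@Entity("token_usage_logs")
@Index(["agencyId", "createdAt"])               // agency_id + created_at
@Index(["agencyId", "modelId", "actionType"])   // agency_id + model_id + action_type
export class TokenUsageLog {
  @PrimaryGeneratedColumn("uuid")
  id: string;

  @Column({ name: "agency_id" })
  agencyId: string;

  @Column({ name: "model_id" })
  modelId: string;

  @Column({ name: "action_type" })
  actionType: string;

  @CreateDateColumn({ name: "created_at" })
  createdAt: Date;
}
```

With the decorators in place, TypeORM's `migration:generate` emits the `CREATE INDEX` statements, and the operator reviews and applies the migration manually, per the project convention.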

Consequences

  • Five new endpoints are added to OrgUsageController under /ai/org-usage.
  • All endpoints are cached in Redis with a 300-second TTL per unique parameter combination.
  • The projection card in the frontend shows a sentinel state when fewer than 7 daily points are available.
  • BYOK cost accuracy is limited by provider visibility (documented in the feature guide).
  • No new CASL subjects or design tokens are introduced in v1.
  • Composite indexes must be applied to token_usage_logs before production load to avoid slow queries.