Agency Custom Domains
Overview
Agencies on DaraMex are identified by a native subdomain (agency-slug.daramex.org). This feature extends the platform so that agency admins can connect their own branded subdomain — for example, booking.kfc.com — so their end-users book appointments without ever seeing a daramex.org URL.
From the platform's perspective, a custom domain is an additional alias for an existing agency. All existing multi-tenancy mechanics (JWT-embedded agencyId, x-agency-id header, per-agency data isolation) remain unchanged. The only new concern is: at first page load, the frontend must resolve an arbitrary hostname to an agencyId. This is handled by a new lookup path on the existing public endpoint GET /identity/agency/id, backed by a dedicated table and a Redis cache.
Infrastructure TLS is provisioned by the platform operator in Dokploy (see Operator Runbook — Provision TLS for Custom Domain). The application layer is TLS-agnostic: the data model, verification flow, and API are identical regardless of which TLS strategy is used.
v1 constraints:
- One custom domain per agency (hard limit enforced in the service layer)
- Subdomains only — apex domains (kfc.com) are not supported in v1
- No wildcard custom domains
- TLS provisioning is a manual operator step
End-to-End Flow
1. Agency admin enters subdomain (e.g. booking.kfc.com)
   └── POST /identity/agency/custom-domain
       └── Validates hostname (RFC 1123, rejects apex/wildcard/reserved)
       └── Creates record with status = pending_dns
       └── Returns DNS instructions to the admin:
             Step 1 (verify ownership):
               TXT _daramex-verify.booking.kfc.com → dmx_<64hex>
             Step 2 (route traffic):
               CNAME booking.kfc.com → proxy.daramex.org
2. Admin adds both DNS records at their DNS provider
   (DNS propagation can take minutes to 48 hours)
3. Admin clicks "Verify Now" in the dashboard
   └── POST /identity/agency/custom-domain/:domainId/verify
       └── API fans out 4 DNS lookups in parallel (Promise.allSettled, 5 s overall budget):
           ├── TXT at _daramex-verify.booking.kfc.com (ownership token)
           ├── CNAME at booking.kfc.com (routing record)
           ├── NS at kfc.com (provider detection)
           └── A at booking.kfc.com (conflict / proxy check)
           │
           ├── NS result → DnsProviderDetector → dnsProvider persisted
           │
           ├── TXT missing / mismatch → status = failed, failedReason set
           └── TXT ok → CNAME evaluated:
               ├── CNAME missing → cname_missing
               ├── CNAME wrong target → cname_wrong_target
               ├── CNAME correct + CF proxy → cname_proxied
               ├── CNAME correct + A conflict → conflicting_a
               └── All ok → status = verified, Redis cache set (TTL 300 s)
4. Operator provisions TLS in Dokploy
   └── (Manual step — see runbook linked above)
       Traefik obtains a Let's Encrypt cert for booking.kfc.com automatically
5. End-user visits booking.kfc.com
   └── Panel AgencyProvider detects non-daramex hostname
       └── GET /identity/agency/id?hostname=booking.kfc.com
           ├── Redis hit → returns agencyId immediately (<5 ms)
           └── Redis miss → DB lookup → cache population → returns agencyId
       └── Agency context bootstraps; all existing features work normally
6. Admin removes domain
   └── DELETE /identity/agency/custom-domain/:domainId
       └── status = removed, removed_at set
           Redis key deleted immediately
           48-hour cooldown before hostname can be reclaimed by any agency
API Endpoints
All endpoints are under the /identity/ prefix. Except for the public GET /identity/agency/id, they require JWT authentication and the agency-admin role on the authenticated agency. The target agency is derived from the session context, so no agencyId path parameter is needed.
| Method | Path | Purpose | Auth |
|---|---|---|---|
| POST | /identity/agency/custom-domain | Register a new custom domain; returns record + DNS instructions | Agency admin |
| POST | /identity/agency/custom-domain/:domainId/verify | Trigger on-demand DNS verification; rate-limited (5/hr per domain, 10/hr per agency) | Agency admin |
| DELETE | /identity/agency/custom-domain/:domainId | Remove domain; triggers cache invalidation and 48-h cooldown | Agency admin |
| GET | /identity/agency/custom-domain | Fetch current custom domain record for dashboard display | Agency admin |
| GET | /identity/agency/id?hostname=<hostname> | Resolve hostname → agencyId (cache-through; public endpoint, extended from existing ?slug=) | Public |
The GET /identity/agency/id endpoint accepts at most one of ?slug= or ?hostname= (omit both to resolve the default agency). Query validation uses the shared Zod schema getAgencyIdQuerySchema in @repo/schemas; providing both parameters or an invalid hostname shape returns 400 Bad Request with Zod issue details (see API Zod request validation).
Data Model
Table: identity.agency_custom_domains
| Column | Type | Constraints | Notes |
|---|---|---|---|
| id | UUID (v7) | PK | Sortable by creation time |
| agency_id | UUID | NOT NULL, FK → organizations.id | Cascade delete |
| hostname | VARCHAR(253) | NOT NULL | Lowercased, trailing-dot stripped |
| status | ENUM | NOT NULL, default pending_dns | See state machine below |
| verification_token | VARCHAR(72) | NOT NULL | dmx_ + 64-char hex; generated on creation |
| verified_at | TIMESTAMPTZ | NULL | Set when status → verified |
| failed_reason | VARCHAR(512) | NULL | Typed enum — see failedReason values below |
| dns_provider | VARCHAR(64) | NULL | Detected from NS records; see DnsProvider enum |
| removed_at | TIMESTAMPTZ | NULL | Set on soft delete; used for cooldown enforcement |
| created_at | TIMESTAMPTZ | NOT NULL, default now() | |
| updated_at | TIMESTAMPTZ | NOT NULL, default now() | |
Indexes:
| Index | Expression | Where clause | Purpose |
|---|---|---|---|
| uq_agency_custom_domain_hostname_active | hostname (UNIQUE) | status <> 'removed' | Prevents two active agencies from claiming the same hostname |
| uq_agency_custom_domain_agency_active | agency_id (UNIQUE) | status <> 'removed' | Enforces one-per-agency v1 limit at the DB level |
| ix_agency_custom_domain_status | status | — | Status filter queries |
| ix_agency_custom_domain_hostname_removed | (hostname, removed_at) | status = 'removed' | Cooldown check queries |
Both uniqueness constraints use partial indexes: removing a domain frees the hostname for re-registration (after the 48-hour cooldown) and allows the same agency to add a new domain.
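The 48-hour cooldown check that these indexes support can be sketched as a small pure function. This is a minimal illustration, not the service's actual code: `checkCooldown`, `CooldownResult`, and the reading of `retry_after` as "the instant the cooldown ends" are assumptions made for the example.

```typescript
// Hypothetical helper: names and result shape are illustrative only.
const COOLDOWN_MS = 48 * 60 * 60 * 1000; // 48-hour cooldown from the v1 rules

type CooldownResult =
  | { active: false }
  | { active: true; retryAfter: Date }; // assumption: retry_after = end of cooldown

function checkCooldown(removedAt: Date | null, now: Date): CooldownResult {
  // No prior removal for this hostname: nothing to wait for.
  if (removedAt === null) return { active: false };

  const endsAt = new Date(removedAt.getTime() + COOLDOWN_MS);
  // "Removed less than 48 hours ago" means strictly before the end instant.
  return now.getTime() < endsAt.getTime()
    ? { active: true, retryAfter: endsAt }
    : { active: false };
}
```

A registration handler could call this with the most recent removed_at for the hostname and map an active result to HOSTNAME_COOLDOWN_ACTIVE.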
State Machine
              [create]
                  │
                  ▼
             pending_dns ◄──────────────────────────┐
                  │                                 │
     [verify click — DNS lookup]                    │
                  │                                 │
  ┌───────────────┴──────────────────┐              │
  │ TXT found + token matches        │ TXT missing  │
  ▼                                  ▼ or mismatch  │
verified                           failed ──────────┘
  │                                  │  [retry → pending_dns]
  │ [remove]                         │ [remove]
  ▼                                  ▼
removed ◄────────────────────────────┘
  ▲
  │ [remove while pending_dns]
pending_dns
Allowed transitions:
| From | To | Trigger |
|---|---|---|
| pending_dns | verified | Verify click + DNS TXT matches token |
| pending_dns | failed | Verify click + DNS missing/mismatch |
| pending_dns | removed | Admin clicks Remove |
| failed | pending_dns | Admin clicks Retry |
| failed | verified | Retry click + DNS TXT matches token |
| failed | removed | Admin clicks Remove |
| verified | removed | Admin clicks Remove (with confirmation) |
Removed domains are not addressable. Any attempt to transition a removed record returns 404 Not Found.
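The transition rules above can be encoded as a lookup table plus a guard. The following is a sketch under assumptions: `ALLOWED_TRANSITIONS` and `checkTransition` are hypothetical names, and the real module may structure this differently.

```typescript
type DomainStatus = "pending_dns" | "verified" | "failed" | "removed";
type TransitionError = "CUSTOM_DOMAIN_NOT_FOUND" | "CUSTOM_DOMAIN_INVALID_STATE";

// Allowed targets per source status, transcribed from the transitions table.
const ALLOWED_TRANSITIONS: Record<DomainStatus, readonly DomainStatus[]> = {
  pending_dns: ["verified", "failed", "removed"],
  failed: ["pending_dns", "verified", "removed"],
  verified: ["removed"],
  removed: [], // removed records are not addressable
};

// Returns null when the transition is allowed, otherwise the error code to raise.
function checkTransition(from: DomainStatus, to: DomainStatus): TransitionError | null {
  // Any transition attempted on a removed record surfaces as 404.
  if (from === "removed") return "CUSTOM_DOMAIN_NOT_FOUND";
  // Everything else not in the table is a 409 invalid-state error.
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1 ? null : "CUSTOM_DOMAIN_INVALID_STATE";
}
```

Keeping the table as data makes it easy to compare against the documented transitions when either one changes.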
Error Catalog
All errors follow the standard AppError / Result pattern used throughout the identity module. HTTP status codes map from the error type as shown.
| errorCode | HTTP | Meaning |
|---|---|---|
| CUSTOM_DOMAIN_INVALID_HOSTNAME | 400 | Input does not pass RFC 1123 validation (empty, bad chars, >253 chars) |
| APEX_DOMAIN_NOT_SUPPORTED | 400 | Input is an apex domain (too few dots) — not supported in v1 |
| WILDCARD_NOT_SUPPORTED | 400 | Input contains * |
| RESERVED_HOSTNAME | 400 | Input matches a reserved platform hostname (daramex.org, localhost, etc.) |
| AGENCY_ALREADY_HAS_CUSTOM_DOMAIN | 409 | Agency already has a non-removed custom domain (v1 limit: 1) |
| HOSTNAME_ALREADY_REGISTERED | 409 | Hostname is claimed by another agency in a non-removed state |
| HOSTNAME_COOLDOWN_ACTIVE | 409 | Hostname was removed less than 48 hours ago; includes retry_after |
| CUSTOM_DOMAIN_NOT_FOUND | 404 | Domain record does not exist or is in removed state |
| CUSTOM_DOMAIN_INVALID_STATE | 409 | Attempted transition not allowed by the state machine |
| CUSTOM_DOMAIN_DNS_LOOKUP_FAILED | 422 | DNS lookup returned timeout, NXDOMAIN, SERVFAIL, or transient error |
| CUSTOM_DOMAIN_TOKEN_MISMATCH | 422 | TXT record found but value does not match the stored token |
| CUSTOM_DOMAIN_VERIFY_RATE_LIMITED | 429 | Rate limit exceeded (5/hr per domain or 10/hr per agency) |
Invalid query combinations for GET /identity/agency/id (e.g. both slug and hostname, or a malformed hostname) are rejected at the controller with 400 and Zod validation issues — they do not emit a dedicated IDENTITY.* AppError code.
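The four 400-level hostname errors in the catalog can be illustrated with a single validation function. This is a simplified sketch, not the actual Hostname value object: the reserved list is limited to the two names the catalog mentions, treating subdomains of reserved names as reserved is an assumption, and the apex check is the naive "too few dots" rule (fewer than three labels), which would misclassify names under multi-label public suffixes such as co.uk.

```typescript
type HostnameError =
  | "CUSTOM_DOMAIN_INVALID_HOSTNAME"
  | "APEX_DOMAIN_NOT_SUPPORTED"
  | "WILDCARD_NOT_SUPPORTED"
  | "RESERVED_HOSTNAME";

type HostnameResult =
  | { ok: true; hostname: string }
  | { ok: false; errorCode: HostnameError };

// Assumption: the real reserved list is longer; the catalog only names these two.
const RESERVED = ["daramex.org", "localhost"];

// One RFC 1123 label: 1-63 chars, alphanumeric at both ends, hyphens allowed inside.
const LABEL = /^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$/;

function validateHostname(input: string): HostnameResult {
  // Normalize before any check: lowercase and strip one trailing dot.
  const hostname = input.trim().toLowerCase().replace(/\.$/, "");
  const fail = (errorCode: HostnameError): HostnameResult => ({ ok: false, errorCode });

  if (hostname.indexOf("*") !== -1) return fail("WILDCARD_NOT_SUPPORTED");
  if (hostname.length === 0 || hostname.length > 253) return fail("CUSTOM_DOMAIN_INVALID_HOSTNAME");

  const labels = hostname.split(".");
  if (!labels.every((label) => LABEL.test(label))) return fail("CUSTOM_DOMAIN_INVALID_HOSTNAME");

  // Assumption: subdomains of a reserved name are reserved as well.
  const isReserved = RESERVED.some(
    (r) => hostname === r || hostname.slice(-(r.length + 1)) === "." + r,
  );
  if (isReserved) return fail("RESERVED_HOSTNAME");

  // Naive apex detection: "kfc.com" has two labels, "booking.kfc.com" has three.
  if (labels.length < 3) return fail("APEX_DOMAIN_NOT_SUPPORTED");

  return { ok: true, hostname };
}
```

The order of the checks decides which error wins when several apply; the order shown here is a choice made for the example.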
failedReason Enum Values
The failedReason field is a typed enum (Zod z.enum). Any value outside this list is rejected at the schema boundary.
| Value | When set |
|---|---|
| missing_txt | TXT record not found (NODATA or NXDOMAIN) |
| token_mismatch | TXT record present but value does not match the stored token |
| dns_timeout | Any DNS resolver call timed out (overall 5 s budget exceeded) |
| dns_error | DNS server error (SERVFAIL or unknown DNS error) |
| cname_missing | CNAME record not found after TXT validates |
| cname_wrong_target | CNAME exists but points to an unexpected target hostname |
| cname_proxied | CNAME target resolves to Cloudflare proxy IPs (suspected orange-cloud) |
| conflicting_a | A record exists alongside the CNAME (DNS misconfiguration) |
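How lookup results map to these values can be sketched as a pure decision function that runs after the parallel lookups have settled. Everything here is illustrative: `Lookup`, `VerifyInput`, and `evaluateVerification` are hypothetical names, the A-record checks are reduced to two precomputed booleans because the source does not say how they are derived, and the precedence of timeouts over other outcomes is an assumption.

```typescript
type FailedReason =
  | "missing_txt" | "token_mismatch" | "dns_timeout" | "dns_error"
  | "cname_missing" | "cname_wrong_target" | "cname_proxied" | "conflicting_a";

// Simplified outcome of one settled DNS lookup.
type Lookup =
  | { kind: "ok"; records: string[] }
  | { kind: "not_found" } // NODATA or NXDOMAIN
  | { kind: "timeout" }
  | { kind: "error" };    // SERVFAIL or unknown

interface VerifyInput {
  expectedToken: string;
  expectedCnameTarget: string;        // e.g. "proxy.daramex.org"
  txt: Lookup;
  cname: Lookup;
  resolvesToCloudflareProxy: boolean; // assumption: derived elsewhere from the A lookup
  hasConflictingA: boolean;           // assumption: derived elsewhere from the A lookup
}

type Verdict = { status: "verified" } | { status: "failed"; failedReason: FailedReason };

function normalizeHost(h: string): string {
  return h.toLowerCase().replace(/\.$/, "");
}

function evaluateVerification(input: VerifyInput): Verdict {
  const { txt, cname } = input;
  const fail = (failedReason: FailedReason): Verdict => ({ status: "failed", failedReason });

  // 1. Ownership: the TXT record must exist and carry the stored token.
  if (txt.kind === "timeout") return fail("dns_timeout");
  if (txt.kind === "error") return fail("dns_error");
  if (txt.kind === "not_found" || txt.records.length === 0) return fail("missing_txt");
  // Plain comparison for brevity; the real check is described as timing-safe.
  if (txt.records.indexOf(input.expectedToken) === -1) return fail("token_mismatch");

  // 2. Routing: only evaluated once the TXT record validates.
  if (cname.kind === "timeout") return fail("dns_timeout");
  if (cname.kind === "error") return fail("dns_error");
  if (cname.kind === "not_found" || cname.records.length === 0) return fail("cname_missing");
  const target = normalizeHost(input.expectedCnameTarget);
  if (!cname.records.some((r) => normalizeHost(r) === target)) return fail("cname_wrong_target");
  if (input.resolvesToCloudflareProxy) return fail("cname_proxied");
  if (input.hasConflictingA) return fail("conflicting_a");

  return { status: "verified" };
}
```

The NS lookup is deliberately absent from the input: provider detection is best-effort and never produces a failedReason.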
DNS Provider Detection
The verify command resolves NS records for the root domain (kfc.com from booking.kfc.com) in parallel with the TXT and CNAME lookups. The DnsProviderDetector service matches each nameserver hostname against a regex table and returns the first matching provider slug.
| Provider slug | Nameserver pattern |
|---|---|
| cloudflare | *.cloudflare.com |
| godaddy | *.domaincontrol.com |
| namecheap | *.registrar-servers.com |
| route53 | `*.awsdns-N.(com |
| digitalocean | *.digitalocean.com |
| hostgator | *.hostgator.com |
If no NS record matches, dnsProvider is null. The panel shows a generic tutorial in that case.
Provider detection is best-effort: if the NS lookup times out or errors, dnsProvider is null and verification continues normally. The NS timeout does not trigger dns_timeout as a failedReason.
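A first-match detector over a pattern table might look like the sketch below. The name `detectDnsProvider` and the exact regular expressions are assumptions; they are suffix matches derived from the wildcard patterns in the table. route53 is intentionally left out because its pattern is truncated in the table and is not guessed here.

```typescript
type DnsProvider = "cloudflare" | "godaddy" | "namecheap" | "digitalocean" | "hostgator";

// Ordered pattern table; the first provider with a matching nameserver wins.
const PROVIDER_PATTERNS: ReadonlyArray<[DnsProvider, RegExp]> = [
  ["cloudflare", /\.cloudflare\.com$/],
  ["godaddy", /\.domaincontrol\.com$/],
  ["namecheap", /\.registrar-servers\.com$/],
  ["digitalocean", /\.digitalocean\.com$/],
  ["hostgator", /\.hostgator\.com$/],
];

function detectDnsProvider(nameservers: string[]): DnsProvider | null {
  for (const raw of nameservers) {
    // Normalize like hostnames elsewhere: lowercase, no trailing dot.
    const ns = raw.toLowerCase().replace(/\.$/, "");
    for (const [slug, pattern] of PROVIDER_PATTERNS) {
      if (pattern.test(ns)) return slug;
    }
  }
  // No match, or an empty list after a failed NS lookup: the panel shows the generic tutorial.
  return null;
}
```

Passing an empty array when the NS lookup times out or errors gives the documented null result without special-casing.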
DTO now Field and Clock-Skew Mitigation
The GET and verify response DTOs include a now: ISO datetime field set to the server's current time at response generation. The panel uses now - updatedAt (both from the server) to determine how long the domain has been in its current state — for example, to show the troubleshooting checklist after 3 minutes in pending_dns.
This avoids client clock-skew and timezone bugs that would occur if the panel used Date.now().
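The elapsed-time calculation can be sketched as follows. The DTO shape is reduced to the three fields involved, and `msInCurrentState` and `shouldShowTroubleshooting` are hypothetical names; only the 3-minute threshold and the use of server-provided timestamps come from the description above.

```typescript
interface CustomDomainDto {
  status: "pending_dns" | "verified" | "failed" | "removed";
  updatedAt: string; // ISO datetime from the server
  now: string;       // ISO datetime from the server at response generation
}

const TROUBLESHOOTING_AFTER_MS = 3 * 60 * 1000; // 3 minutes in pending_dns

// Both timestamps come from the server, so the client clock never enters the math.
function msInCurrentState(dto: CustomDomainDto): number {
  return Date.parse(dto.now) - Date.parse(dto.updatedAt);
}

function shouldShowTroubleshooting(dto: CustomDomainDto): boolean {
  return dto.status === "pending_dns" && msInCurrentState(dto) >= TROUBLESHOOTING_AFTER_MS;
}
```

Because ISO datetimes carry their own offset, the result is also independent of the browser's timezone.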
Panel Live Polling
The panel hooks into TanStack Query with conditional polling:
```ts
refetchInterval: (query) => {
  const status = query.state.data?.status;
  if (!status || status === 'verified' || status === 'removed') return false;
  return 15_000; // poll every 15s while pending_dns or failed
}
```

| Condition | Behavior |
|---|---|
| status === 'verified' | Polling stops immediately |
| status === 'removed' | Polling stops immediately |
| status === 'pending_dns' or 'failed' | Polls every 15 s |
| Window focus event | Always refetches (any status) |
| After POST verify | Query invalidated → immediate refetch |
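GET /identity/agency/custom-domain is not rate-limited. The verification rate limits (5/hr per domain, 10/hr per agency) apply only to POST /identity/agency/custom-domain/:domainId/verify, so polling the GET endpoint, even 20 times per minute, will not return 429.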
Caching
Hostname → agencyId resolution is on the critical path of every page load on a custom domain. Redis caches the result to avoid a DB round-trip on each request.
| Key pattern | Value | TTL | Set when | Invalidated when |
|---|---|---|---|---|
| agency:host:<hostname> | { id, name } (JSON) | 300 s | Domain transitions to verified (or first cache miss from DB) | Domain transitions away from verified or is removed |
| agency:host:<hostname> (negative sentinel) | "__none__" (string) | 30 s | Cache miss + no verified record in DB | — (TTL expiry) |
The negative sentinel (30 s TTL) prevents DB hammering on unresolvable hostnames. A 404 response is served immediately from Redis until the sentinel expires.
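The two pure pieces of this cache-through scheme, interpreting a raw Redis value and choosing what to store after a DB lookup, can be sketched without a Redis client. `readCacheValue`, `cacheEntryFor`, and `cacheKey` are hypothetical names; the key pattern, sentinel string, and TTLs are taken from the table above.

```typescript
interface CachedAgency { id: string; name: string }

const NEGATIVE_SENTINEL = "__none__";
const POSITIVE_TTL_S = 300;
const NEGATIVE_TTL_S = 30;

type CacheRead =
  | { kind: "hit"; agency: CachedAgency }
  | { kind: "negative" } // known-unresolvable: answer 404 without touching the DB
  | { kind: "miss" };    // nothing cached: fall through to the DB

function cacheKey(hostname: string): string {
  return "agency:host:" + hostname.toLowerCase().replace(/\.$/, "");
}

// Interprets the raw string returned by a Redis GET (null when the key is absent).
function readCacheValue(raw: string | null): CacheRead {
  if (raw === null) return { kind: "miss" };
  if (raw === NEGATIVE_SENTINEL) return { kind: "negative" };
  return { kind: "hit", agency: JSON.parse(raw) as CachedAgency };
}

// Decides what to write back after a cache miss, given the DB result.
function cacheEntryFor(dbResult: CachedAgency | null): { value: string; ttlSeconds: number } {
  return dbResult === null
    ? { value: NEGATIVE_SENTINEL, ttlSeconds: NEGATIVE_TTL_S }
    : { value: JSON.stringify({ id: dbResult.id, name: dbResult.name }), ttlSeconds: POSITIVE_TTL_S };
}
```

A handler would read the key, branch on the `CacheRead` kind, and on a miss query the database and store the result of `cacheEntryFor` with its TTL.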
Security Model
| Concern | Mitigation |
|---|---|
| Domain hijacking | TXT token = 32 cryptographically random bytes (hex) — unguessable |
| Re-registration race | 48-hour cooldown after removal; pending/failed domains block re-registration by others |
| DNS verification abuse | Rate limit: 5 attempts/hr per domain, 10 attempts/hr per agency |
| Host header injection / SSRF | ?hostname= param is validated against the agency_custom_domains table (verified only); the Host HTTP header is never used for tenant resolution |
| Hostname normalization | Input is lowercased and trailing-dot stripped before any storage or comparison |
| Token comparison timing | VerificationToken.safeEquals wraps crypto.timingSafeEqual with a length guard |
| Apex domain fragility | Apex inputs are rejected at the API layer (see error catalog) |
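Token generation and the timing-safe comparison can be sketched with Node's built-in crypto module. This is a guess at what VerificationToken might look like, not its actual code: the free functions `generateVerificationToken`, `safeEquals`, and `txtRecordName` are illustrative names.

```typescript
import { randomBytes, timingSafeEqual } from "node:crypto";

const TOKEN_PREFIX = "dmx_";

// 32 random bytes rendered as 64 hex characters, prefixed with dmx_.
function generateVerificationToken(): string {
  return TOKEN_PREFIX + randomBytes(32).toString("hex");
}

// Constant-time comparison with a length guard.
function safeEquals(a: string, b: string): boolean {
  const bufA = Buffer.from(a, "utf8");
  const bufB = Buffer.from(b, "utf8");
  // timingSafeEqual throws when the buffers differ in length, hence the guard.
  if (bufA.length !== bufB.length) return false;
  return timingSafeEqual(bufA, bufB);
}

// Name of the TXT record the admin is asked to create for a given hostname.
function txtRecordName(hostname: string): string {
  return "_daramex-verify." + hostname;
}
```

The length guard leaks only the token length, which is fixed and public, so it does not weaken the comparison.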
v1 Limitations
One domain per agency
The application enforces a maximum of one active (non-removed) custom domain per agency. This is an intentional v1 simplification: the database schema supports multiple rows per agency_id, but the service layer rejects a second registration with AGENCY_ALREADY_HAS_CUSTOM_DOMAIN. The limit will be lifted in a future iteration when the UI and business rules are ready for multi-domain management.
Subdomains only — no apex domain support
Apex domains (e.g. kfc.com) are rejected at the API layer. A CNAME record cannot coexist with other records at the same name (RFC 1034 §3.6.2, restated in RFC 1912 §2.4), and the zone apex always carries SOA and NS records, so a CNAME cannot be placed there. The agency would instead need an A record pointing to the server IP, which is fragile (an IP change breaks every apex-configured agency) and requires provider-specific workarounds (ALIAS, ANAME, Cloudflare CNAME flattening). Apex support is deferred to v2 and is coupled to the TLS strategy decision (see Pending v2 — Apex Domain Support).
No wildcard custom domains
*.kfc.com subdomains require wildcard TLS provisioning per custom root domain and additional routing complexity. No use case was identified for v1.
Manual TLS provisioning by the platform operator
When an agency's domain reaches verified status, the platform operator must manually register the domain in Dokploy so Traefik can provision a Let's Encrypt certificate. This step does not scale beyond ~20 concurrent active domains. See Operator Runbook — Provision TLS for Custom Domain.
No background polling for auto-verification
Verification is on-demand only (the agency admin clicks a button). There is no background job that polls DNS and auto-transitions domains. This is intentional: no job infrastructure exists today, DNS TTLs can be up to 48 hours, and on-demand verification is predictable.
No HTTP file challenge
HTTP-based domain validation (.well-known/acme-challenge/) has a chicken-and-egg problem: the domain must already route to DaraMex before the challenge works, but routing requires verification first. TXT-record challenge has no such dependency.
Pending v2 Items
These decisions were explicitly deferred from v1. The application layer (data model, handlers, API contract) was designed to be infra-agnostic so these can be implemented without changing the core domain logic.
| Topic | Summary |
|---|---|
| TLS automation | Replace manual Dokploy registration with an automated option (Cloudflare proxy, Caddy on-demand TLS, or Dokploy API integration triggered on AgencyCustomDomainVerified event). Revisit when manual provisioning becomes a bottleneck (>10 active domains or delayed registrations). Engram topic: sdd/agency-custom-domains/pending/tls-cloudflare-alternative. |
| Apex domain support | Allow agencies to connect root domains (e.g. kfc.com). Requires relaxing Hostname validation, branching DNS instructions (A record vs CNAME), and is strongly coupled to the TLS automation decision above (Cloudflare flattening simplifies apex support significantly). Engram topic: sdd/agency-custom-domains/pending/apex-domain-support. |
| Multiple domains per agency | Lift the one-per-agency limit; requires UI redesign for multi-domain management. |
| Wildcard custom domains | Support *.kfc.com; requires wildcard TLS per custom root domain. |
| Background auto-verification | Background job (e.g. BullMQ) that polls DNS and auto-transitions pending_dns domains without requiring admin action. |