Multi-Tenant SaaS Domain Strategy: Custom Domains, SSL & Isolation
Architecting reliable custom domain onboarding: DNS validation, certificate automation, routing, and tenant isolation.
By Platform Engineering • 8/6/2025 • 2 min read
Tags: saas, multi-tenant, custom-domains
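Supporting customer custom domains raises perceived product quality, and it raises complexity too. A resilient onboarding pipeline has to cover DNS validation, certificate automation, routing, and tenant isolation.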
Onboarding Flow
- User adds the domain in app settings
- Show the required DNS records (CNAME, or A + TXT)
- Poll DNS until both the ownership proof and resolution succeed
- Request the certificate via ACME once DNS is ready
- Activate the routing entry and purge caches
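The flow above can be read as a small state machine that a background job advances on every poll. The following is a minimal sketch, assuming the status names from the data model later in this article; the `Checks` shape and the `nextStatus` function are hypothetical, and the mapping of each check to a status is one possible interpretation.

```typescript
// Status names follow the TenantDomain model in this article.
type Status = "pending_dns" | "validating" | "provisioning_cert" | "active" | "error";

// Hypothetical snapshot of what the latest poll observed for one domain.
interface Checks {
  ownershipProven: boolean; // TXT or CNAME proof was found
  resolvesToEdge: boolean;  // the domain points at the platform
  certIssued: boolean;      // the ACME order has completed
}

// Advance at most one step per poll so each stage can be retried on its own.
function nextStatus(current: Status, c: Checks): Status {
  switch (current) {
    case "pending_dns":
      return c.ownershipProven ? "validating" : "pending_dns";
    case "validating":
      return c.resolvesToEdge ? "provisioning_cert" : "validating";
    case "provisioning_cert":
      return c.certIssued ? "active" : "provisioning_cert";
    default:
      return current; // "active" and "error" are handled elsewhere
  }
}
```

Moving one step per poll keeps each transition easy to log, which helps when measuring where onboarding stalls.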
DNS Validation Patterns
Method | Record | Pros | Cons |
---|---|---|---|
TXT token | _acme-challenge.example.com | Secure, explicit | Propagation delay |
CNAME pointing | app.example.com -> cname.prod.edge | Simple UX | Some registrars rewrite records |
HTTP file | http://example.com/.well-known/ | Fast | Domain must already resolve to the edge |
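For the TXT token method, the check itself is a simple comparison once the records are fetched. The sketch below assumes the records come from Node's `dns.promises.resolveTxt`, which returns each TXT record as an array of string chunks; the function names `txtProofPresent` and `challengeHost` are hypothetical.

```typescript
// resolveTxt yields string[][]: one inner array of chunks per TXT record,
// so the chunks are joined before comparing against the expected token.
function txtProofPresent(records: string[][], expectedToken: string): boolean {
  return records.some((chunks) => chunks.join("") === expectedToken);
}

// Build the record name to query for a customer domain
// (lowercased, trailing dot removed).
function challengeHost(domain: string): string {
  return `_acme-challenge.${domain.trim().toLowerCase().replace(/\.$/, "")}`;
}
```

Keeping the comparison separate from the DNS lookup makes it straightforward to test without network access.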
Certificate Automation
- Use an ACME client that runs unattended in batch (e.g., Certbot or acme.sh)
- Store private keys encrypted (KMS) and rotate them
- Start renewal 30 days before expiry; add jitter to retries
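The renewal window and jittered retries can be expressed in a few lines. This is a sketch under assumptions: the 30-day lead comes from the list above, while the base delay, the cap, and the choice of "full jitter" backoff are illustrative defaults, not values from the original.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Renewal is due once the certificate is within `leadDays` of expiry.
function renewalDue(certExpiresAt: Date, now: Date, leadDays: number = 30): boolean {
  return certExpiresAt.getTime() - now.getTime() <= leadDays * DAY_MS;
}

// Full-jitter backoff: a random delay in [0, min(capMs, baseMs * 2^attempt)).
// `random` is injectable so the function can be tested deterministically.
function retryDelayMs(
  attempt: number,
  baseMs: number = 60000,
  capMs: number = 6 * 60 * 60 * 1000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}
```

Randomizing the delay spreads retries from many tenants over time, so a shared outage does not produce a synchronized burst of ACME requests.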
Routing Layer
- Edge (CDN or reverse proxy) maintains host -> tenant map
- Unknown hosts fall through to a catch-all 404 so no tenant's content leaks
- Rate limit bursts of requests for unmapped or misconfigured hosts
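A minimal sketch of the host lookup with a 404 fallback follows. The in-memory `Map`, the sample entry, and the `route` function are hypothetical; a real edge would load the mapping from a shared store.

```typescript
// Hypothetical routing table: normalized hostname -> tenant ID.
const hostToTenant = new Map<string, string>([
  ["app.example.com", "tenant_123"],
]);

// Normalize a Host header: lowercase, then strip the port and any trailing dot.
function normalizeHost(hostHeader: string): string {
  return hostHeader.trim().toLowerCase().replace(/:\d+$/, "").replace(/\.$/, "");
}

type RouteResult = { status: 200; tenantId: string } | { status: 404 };

// Unknown hosts get a plain 404 rather than a default tenant.
function route(hostHeader: string): RouteResult {
  const tenantId = hostToTenant.get(normalizeHost(hostHeader));
  return tenantId ? { status: 200, tenantId } : { status: 404 };
}
```

Returning 404 for anything not explicitly mapped means a mistyped or stale DNS record can never surface another tenant's content.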
Isolation & Security
- Per-tenant origin auth (signed headers)
- Header sanitization (Host, X-Forwarded-*)
- Enforce the HTTPS redirect only after the certificate is ready
- WAF ruleset opt-in for high-risk tenants
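Signed headers and header sanitization might look like the sketch below. It assumes an HMAC-SHA256 signature over the tenant ID and a timestamp with a per-tenant secret; the payload layout and the function names are hypothetical. It uses Node's `crypto` module.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Edge side: sign "<tenantId>.<timestamp>" with the tenant's secret.
function signOriginHeader(tenantId: string, timestamp: number, secret: string): string {
  return createHmac("sha256", secret).update(`${tenantId}.${timestamp}`).digest("hex");
}

// Origin side: recompute the signature and compare in constant time.
// timingSafeEqual throws on length mismatch, so lengths are checked first.
function verifyOriginHeader(tenantId: string, timestamp: number, signature: string, secret: string): boolean {
  const expected = Buffer.from(signOriginHeader(tenantId, timestamp, secret), "hex");
  const given = Buffer.from(signature, "hex");
  return given.length === expected.length && timingSafeEqual(given, expected);
}

// Drop client-supplied X-Forwarded-* headers before the edge sets its own.
function stripForwardedHeaders(headers: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    if (!name.toLowerCase().startsWith("x-forwarded-")) out[name] = headers[name];
  }
  return out;
}
```

A production check would also reject timestamps outside a short freshness window to limit replay.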
Observability
- Domain onboarding duration p95
- DNS error state counts
- Certificate issuance failures & retries
- Per-tenant 4xx/5xx anomaly alerts
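As an illustration of the first metric, onboarding duration can be derived from the `createdAt` and `activatedAt` fields and summarized with a nearest-rank percentile. This is a sketch; the function names are hypothetical, and a metrics backend would normally compute percentiles for you.

```typescript
// Duration from domain creation to activation, in milliseconds.
function onboardingDurationMs(createdAt: Date, activatedAt: Date): number {
  return activatedAt.getTime() - createdAt.getTime();
}

// Nearest-rank percentile: the smallest sample with at least p% of
// samples at or below it. Integer arithmetic avoids float rounding.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}
```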
Data Model Fields
TenantDomain {
id, tenantId, domain, status: [pending_dns, validating, provisioning_cert, active, error],
createdAt, activatedAt, lastCheckAt, failureReason,
validationMethod, dnsRecordsExpected: [{type, host, value}],
certExpiresAt
}
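The model above could be typed along these lines. This is a sketch with assumptions: the concrete field types (string IDs, nullable dates), the `validationMethod` values, and the `newTenantDomain` factory are illustrative choices that the original does not specify.

```typescript
type DomainStatus = "pending_dns" | "validating" | "provisioning_cert" | "active" | "error";

interface DnsRecord { type: "CNAME" | "A" | "TXT"; host: string; value: string }

interface TenantDomain {
  id: string;
  tenantId: string;
  domain: string;
  status: DomainStatus;
  createdAt: Date;
  activatedAt: Date | null;
  lastCheckAt: Date | null;
  failureReason: string | null;
  validationMethod: "txt" | "cname" | "http"; // assumed value set
  dnsRecordsExpected: DnsRecord[];
  certExpiresAt: Date | null;
}

// Hypothetical factory: new domains start in pending_dns with nothing verified.
function newTenantDomain(id: string, tenantId: string, domain: string, token: string, now: Date): TenantDomain {
  const normalized = domain.trim().toLowerCase();
  return {
    id, tenantId, domain: normalized, status: "pending_dns",
    createdAt: now, activatedAt: null, lastCheckAt: null, failureReason: null,
    validationMethod: "txt",
    dnsRecordsExpected: [{ type: "TXT", host: `_acme-challenge.${normalized}`, value: token }],
    certExpiresAt: null,
  };
}
```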
Failure Handling
Stage | Failure | Action |
---|---|---|
DNS proof | TXT missing | Email reminder + UI nudge |
Cert issue | ACME rate limit | Backoff + rotate solver region |
Routing | Host collision | Reject + surface conflict domain |
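The host collision row can be illustrated with a claim function that rejects a hostname already mapped to a different tenant and reports the conflicting domain. This is a minimal sketch; `claimHost` and its result shape are hypothetical, and a real system would enforce the same rule with a unique constraint in the database.

```typescript
type ClaimResult = { ok: true } | { ok: false; conflictDomain: string };

// Register host -> tenant unless another tenant already owns the host.
// Re-claiming by the same tenant is treated as a no-op success.
function claimHost(routes: Map<string, string>, host: string, tenantId: string): ClaimResult {
  const key = host.trim().toLowerCase();
  const owner = routes.get(key);
  if (owner !== undefined && owner !== tenantId) {
    return { ok: false, conflictDomain: key };
  }
  routes.set(key, tenantId);
  return { ok: true };
}
```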