ALB weighted routing for a zero-downtime API cutover
A fintech payments service upgraded from Node 16 to Node 20 behind an Application Load Balancer, moving from a 5% canary to full cutover with connection draining and no failed transactions.
By Simplileap · Published May 28, 2025 · 9 min read
A payments orchestration API, running as a Node 16 monolith on EC2 with PostgreSQL and Redis-backed rate limits, needed Node 20 and OpenSSL 3 compatibility without a maintenance window. Regulatory constraints required an auditable rollback within 15 minutes.
Prior big-bang deploys had caused brief 502 spikes during target registration. Leadership mandated canary validation against a fraction of real traffic.
Architecture: a second Auto Scaling group (green) behind the existing ALB; weighted target group routing starting at 5% green / 95% blue; health checks on /healthz with dependency probes (DB, Redis); deregistration delay set to 120s for connection draining.
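As a rough illustration of the weighted routing described above, the sketch below builds the forward action for a two-target-group split. The object shape follows the `ForwardConfig` structure accepted by the ELBv2 `ModifyListener` API. The function name `buildForwardAction` and the placeholder ARNs are assumptions made for this example, not the team's actual tooling.

```typescript
// Shape mirrors the "forward" action of the ELBv2 ModifyListener API.
interface WeightedTargetGroup {
  TargetGroupArn: string;
  Weight: number;
}

interface ForwardAction {
  Type: "forward";
  ForwardConfig: { TargetGroups: WeightedTargetGroup[] };
}

// Hypothetical helper: send `greenPercent` of requests to green and the
// remainder to blue. ARNs are supplied by the caller.
function buildForwardAction(
  blueArn: string,
  greenArn: string,
  greenPercent: number
): ForwardAction {
  if (!Number.isInteger(greenPercent) || greenPercent < 0 || greenPercent > 100) {
    throw new RangeError("greenPercent must be an integer from 0 to 100");
  }
  return {
    Type: "forward",
    ForwardConfig: {
      TargetGroups: [
        { TargetGroupArn: blueArn, Weight: 100 - greenPercent },
        { TargetGroupArn: greenArn, Weight: greenPercent },
      ],
    },
  };
}

// Initial canary split: 5% green, 95% blue (placeholder ARNs, not real ones).
const canary = buildForwardAction("arn:blue-placeholder", "arn:green-placeholder", 5);
console.log(JSON.stringify(canary.ForwardConfig.TargetGroups));
```

The resulting object could be passed as a listener's default action. Keeping weight construction in one small function also makes each promotion step easy to log for audit purposes.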
Problems encountered: sticky sessions on an older mobile client pinned some users to blue for days, which we mitigated by reducing the TTL and forcing a rebalance after 24h; green instances failed health checks until we increased the grace period to allow for JVM-adjacent native module warmup; idempotency keys on POST /transfer absorbed duplicate retries during a 30-second routing flap, which we validated via a reconciliation job.
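To show why idempotency keys make duplicate retries harmless, here is a minimal sketch of key-based deduplication for a transfer handler. Everything in it is assumed for illustration: the names `handleTransfer`, `executeTransfer`, and `ledger`, the request fields, and the in-memory `Map`. A production service would need a shared, durable store so that blue and green instances see the same keys.

```typescript
interface TransferRequest {
  fromAccount: string;
  toAccount: string;
  amountMinor: number; // amount in minor units, e.g. cents
}

interface TransferResult {
  transferId: string;
  status: "accepted";
}

// Illustration only: in-memory state standing in for a shared durable store.
const processed = new Map<string, TransferResult>();
const ledger: TransferRequest[] = [];

// Hypothetical side effect: record the transfer exactly once.
function executeTransfer(req: TransferRequest): TransferResult {
  ledger.push(req);
  return { transferId: `tr_${ledger.length}`, status: "accepted" };
}

// Return the stored result for a repeated key instead of executing again.
function handleTransfer(idempotencyKey: string, req: TransferRequest): TransferResult {
  const existing = processed.get(idempotencyKey);
  if (existing !== undefined) {
    return existing;
  }
  const result = executeTransfer(req);
  processed.set(idempotencyKey, result);
  return result;
}

// A client retry with the same key yields the same transfer and one ledger entry.
const req: TransferRequest = { fromAccount: "acc-1", toAccount: "acc-2", amountMinor: 5000 };
const first = handleTransfer("key-123", req);
const retry = handleTransfer("key-123", req);
console.log(first.transferId === retry.transferId, ledger.length);
```

This synchronous sketch sidesteps concurrency. With a shared store, the check and the write would need to be atomic so that two simultaneous retries cannot both execute.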
Runbook: promote the green weight 5 → 25 → 50 → 100 (percent of traffic) over four hours, with automatic rollback if the 5xx rate exceeds 0.1% for five minutes; CloudWatch alarms on p99 latency and error budget.
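The promotion and rollback rule in the runbook can be sketched as a small decision function. This is an interpretation, not the team's automation: it assumes the 5xx rate arrives as one sample per minute and treats "exceeds 0.1% for five minutes" as five consecutive samples above the threshold. The names `sustainedBreach` and `nextGreenWeight` are hypothetical, and the pacing of steps across the four hours is left to whatever scheduler calls the function.

```typescript
const WEIGHT_STEPS: number[] = [5, 25, 50, 100]; // green traffic percentages from the runbook
const ERROR_RATE_THRESHOLD = 0.001; // 0.1% 5xx rate
const BREACH_MINUTES = 5; // assumed: one sample per minute

// True if the most recent `minutes` samples all exceed the threshold.
function sustainedBreach(
  errorRates: number[],
  threshold: number = ERROR_RATE_THRESHOLD,
  minutes: number = BREACH_MINUTES
): boolean {
  if (errorRates.length < minutes) return false;
  return errorRates.slice(-minutes).every((rate) => rate > threshold);
}

// Decide the next green weight: 0 on rollback, otherwise the next step,
// or the current weight if already at the final step.
function nextGreenWeight(currentWeight: number, errorRates: number[]): number {
  if (sustainedBreach(errorRates)) return 0;
  const index = WEIGHT_STEPS.indexOf(currentWeight);
  if (index === -1) throw new Error(`unknown weight step: ${currentWeight}`);
  const next = WEIGHT_STEPS[index + 1];
  return next === undefined ? currentWeight : next;
}

// Healthy samples at the 5% step promote to the next step.
console.log(nextGreenWeight(5, [0, 0.0002, 0, 0.0001, 0]));
```

Returning 0 on a sustained breach sends all traffic back to blue in a single decision, which keeps the rollback path simple enough to audit.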
Outcome: full cutover with zero failed transactions in settlement reconciliation; p99 latency improved 18% on green due to Node 20 event loop gains. The client is a regulated fintech API partner; name withheld.
