Uptime and Reliability
RevKeen's infrastructure, redundancy, and uptime commitments
RevKeen is built on infrastructure designed for high availability. This page covers how we keep the platform running, what happens when things go wrong, and what uptime you can expect.
Infrastructure Overview
RevKeen's core services run on AWS in the eu-west-1 (Ireland) region, with the primary database hosted on Supabase in Frankfurt (EU).
| Component | Technology | Availability Design |
|---|---|---|
| API and application server | AWS Fargate (ECS) | Multiple containers across availability zones |
| Database | Supabase managed PostgreSQL | Automated backups, point-in-time recovery |
| Background jobs | Trigger.dev Cloud | Managed execution with automatic retries |
| Edge and CDN | Cloudflare / Vercel | Global edge network with automatic failover |
| Real-time messaging | Upstash Redis | Managed Redis with replication |
| DNS | Cloudflare | Anycast DNS with DDoS protection |
Multi-AZ Deployment
RevKeen's Fargate services run across multiple AWS Availability Zones. If one AZ experiences an outage, traffic is automatically routed to healthy containers in other zones. There is no single point of failure at the application tier.
Auto-scaling
Application containers scale automatically based on CPU and memory utilization. During traffic spikes -- such as a large batch of invoices being sent -- additional containers are launched to handle the load without degrading response times.
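As a rough illustration of how utilization-based scaling behaves, the sketch below computes a desired container count from observed utilization with a simple proportional rule. The 60% target, the minimum and maximum counts, and the function name are assumptions made for this example; it is not RevKeen's scaling policy, nor the exact algorithm AWS applies.

```python
import math


def desired_container_count(
    current_count: int,
    observed_utilization: float,
    target_utilization: float = 60.0,  # assumed target, in percent
    min_count: int = 2,                # assumed floor
    max_count: int = 20,               # assumed ceiling
) -> int:
    """Size the fleet so that average utilization moves toward the target."""
    if current_count < 1 or target_utilization <= 0:
        raise ValueError("current_count must be >= 1 and target_utilization > 0")
    # Proportional rule: more load per container means more containers.
    raw = math.ceil(current_count * observed_utilization / target_utilization)
    # Clamp to the configured bounds so the fleet never shrinks or grows unbounded.
    return max(min_count, min(max_count, raw))
```

With these assumed values, four containers running at 90% utilization would scale out to six, while four containers at 10% would scale in only as far as the floor of two.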
Database Reliability
RevKeen's primary database is hosted on Supabase's managed PostgreSQL platform:
- Automated daily backups with configurable retention.
- Point-in-time recovery (PITR) allows restoring the database to any second within the retention window.
- Connection pooling via Supabase's built-in PgBouncer ensures stable performance under high connection counts.
- Read replicas can be provisioned for read-heavy workloads without impacting write performance.
Database maintenance (such as PostgreSQL version upgrades) is managed by Supabase with minimal downtime, typically during low-traffic windows.
Status Page
RevKeen publishes real-time platform status and historical uptime on its status page.
The status page shows:
- Current operational status for all services (API, dashboard, checkout, webhooks).
- Active and resolved incidents with timestamps and impact descriptions.
- Scheduled maintenance windows.
- Historical uptime metrics.
You can subscribe to status updates via email or RSS to receive notifications when incidents are reported or resolved.
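If you prefer to consume the RSS feed programmatically, a minimal sketch might look like the following. It assumes a standard RSS 2.0 layout (`item`, `title`, `pubDate`); the sample feed is made up for illustration and the actual feed contents may differ.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from email.utils import parsedate_to_datetime

# Made-up sample in RSS 2.0 format, used here instead of a real feed URL.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example status feed</title>
    <item>
      <title>Investigating: webhook delivery delays</title>
      <pubDate>Mon, 01 Jan 2024 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Resolved: webhook delivery delays</title>
      <pubDate>Mon, 01 Jan 2024 10:30:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def parse_status_feed(feed_xml: str) -> list[tuple[datetime, str]]:
    """Return (published, title) pairs from an RSS 2.0 document, newest first."""
    root = ET.fromstring(feed_xml)
    entries = []
    for item in root.iter("item"):
        published = item.findtext("pubDate")
        if published is None:
            continue  # skip entries without a timestamp
        title = item.findtext("title", default="").strip()
        entries.append((parsedate_to_datetime(published), title))
    return sorted(entries, key=lambda entry: entry[0], reverse=True)


for published, title in parse_status_feed(SAMPLE_FEED):
    print(published.isoformat(), title)
```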
Incident Response
When an issue is detected, RevKeen follows a structured incident response process:
Detection
Incidents are detected through multiple channels:
- Automated monitoring -- Grafana alerts on error rates, latency spikes, and infrastructure anomalies.
- Synthetic checks -- Periodic health checks against critical endpoints (API, checkout, webhooks).
- Customer reports -- Issues reported through support channels.
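To make the idea of a synthetic check concrete, here is a minimal sketch of a probe that requests a set of endpoints and reports which ones respond successfully. The endpoint URLs and function names are hypothetical, and this is not RevKeen's monitoring implementation; merchants could run something similar against their own integration.

```python
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for a GET request, or 0 if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # server answered, but with an error status
    except OSError:
        return 0  # DNS failure, refused connection, timeout, and similar


def run_checks(
    endpoints: dict[str, str],
    probe: Callable[[str], int] = http_status,
) -> dict[str, bool]:
    """Map each endpoint name to True if its probe returned a 2xx status."""
    return {name: 200 <= probe(url) < 300 for name, url in endpoints.items()}


# Hypothetical endpoints, for illustration only.
ENDPOINTS = {
    "api": "https://api.example.com/health",
    "checkout": "https://checkout.example.com/health",
}
```

Passing the probe as a parameter keeps the check logic testable without network access: a test can supply a stub that returns fixed status codes.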
Response Timeline
| Severity | Definition | Response Target | Update Frequency |
|---|---|---|---|
| Critical | Payment processing or checkout is unavailable | 15 minutes | Every 30 minutes |
| High | Major feature degraded (dashboard, webhooks) | 30 minutes | Every hour |
| Medium | Non-critical feature impacted | 2 hours | As progress is made |
| Low | Minor issue, no customer impact | Next business day | On resolution |
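The response targets in the table can be expressed as data, which is handy if you track RevKeen incidents in your own tooling. The sketch below encodes the fixed targets and computes a response deadline; the names are illustrative, and Low severity is left out because "next business day" depends on a business calendar that is not defined here.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum


class Severity(Enum):
    CRITICAL = "critical"
    HIGH = "high"
    MEDIUM = "medium"


# Response targets taken from the table above.
RESPONSE_TARGET = {
    Severity.CRITICAL: timedelta(minutes=15),
    Severity.HIGH: timedelta(minutes=30),
    Severity.MEDIUM: timedelta(hours=2),
}

# Only Critical and High have a fixed status-update interval.
UPDATE_INTERVAL = {
    Severity.CRITICAL: timedelta(minutes=30),
    Severity.HIGH: timedelta(hours=1),
}


def response_deadline(severity: Severity, detected_at: datetime) -> datetime:
    """Latest time by which the incident should have been acknowledged."""
    return detected_at + RESPONSE_TARGET[severity]


def is_response_overdue(severity: Severity, detected_at: datetime, now: datetime) -> bool:
    """True if the response target has already passed."""
    return now > response_deadline(severity, detected_at)


detected = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(response_deadline(Severity.CRITICAL, detected).isoformat())
```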
Process
1. Acknowledge -- The on-call engineer acknowledges the alert and begins investigation.
2. Communicate -- A status page update is posted describing the issue and estimated impact.
3. Mitigate -- The immediate priority is restoring service, even if the root cause is not yet identified.
4. Resolve -- The underlying issue is fixed and verified.
5. Review -- A post-incident review identifies root cause, contributing factors, and preventive measures.
Post-incident reviews for Critical and High severity incidents are shared with affected merchants upon request.
Planned Maintenance
RevKeen schedules maintenance windows to minimize disruption:
- Routine maintenance is performed during low-traffic periods, typically weekday mornings (UTC).
- Advance notice is provided at least 48 hours before any maintenance that may cause downtime.
- Zero-downtime deployments are the default for application updates. New containers are started and verified before old containers are drained.
- Database maintenance follows Supabase's managed upgrade process, which typically involves seconds of downtime rather than minutes.
Scheduled maintenance is announced on the status page and via email to account administrators.
SLA Overview
RevKeen targets the following service levels:
| Metric | Target |
|---|---|
| API availability (monthly) | 99.9% |
| Checkout availability (monthly) | 99.9% |
| Webhook delivery (first attempt) | Within 30 seconds of event |
| Webhook delivery (with retries) | Within 24 hours, with exponential backoff |
| Dashboard availability | 99.5% |
| Planned maintenance downtime | Less than 1 hour per month |
Availability is measured as the percentage of time the service responds to valid requests with non-error responses, excluding scheduled maintenance windows.
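To put the percentages in concrete terms, the following sketch converts an availability target into a monthly downtime allowance, with scheduled maintenance excluded from the measured period as described above. The 30-day month and the function names are assumptions for illustration.

```python
def allowed_downtime_minutes(
    availability_percent: float,
    days_in_month: int = 30,           # assumed month length
    maintenance_minutes: float = 0.0,  # scheduled maintenance is excluded
) -> float:
    """Unplanned downtime (in minutes) that still meets the target."""
    measured_minutes = days_in_month * 24 * 60 - maintenance_minutes
    return measured_minutes * (1 - availability_percent / 100)


def availability_percent(measured_minutes: float, downtime_minutes: float) -> float:
    """Availability achieved over a measured period, as a percentage."""
    return (measured_minutes - downtime_minutes) / measured_minutes * 100


print(round(allowed_downtime_minutes(99.9), 1))  # API and checkout target
print(round(allowed_downtime_minutes(99.5), 1))  # dashboard target
```

Over a 30-day month, a 99.9% target leaves roughly 43 minutes of unplanned downtime, and a 99.5% target leaves roughly 216 minutes (3.6 hours).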
For merchants on enterprise plans, custom SLAs with financial commitments are available. Contact sales@revkeen.com for details.
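The webhook retry target in the table above (delivery within 24 hours, with exponential backoff) can be pictured with a small sketch that generates a retry schedule fitting inside that window. The 30-second base delay and the doubling factor are assumptions chosen for illustration; RevKeen's actual retry intervals are not specified on this page.

```python
from datetime import timedelta


def backoff_schedule(
    base: timedelta = timedelta(seconds=30),  # assumed first retry delay
    factor: float = 2.0,                      # assumed growth factor
    window: timedelta = timedelta(hours=24),  # retry window from the table
) -> list[timedelta]:
    """Delays between attempts, growing geometrically until the window is used up."""
    delays: list[timedelta] = []
    elapsed = timedelta(0)
    delay = base
    # Keep adding retries while the next one would still land inside the window.
    while elapsed + delay <= window:
        delays.append(delay)
        elapsed += delay
        delay = delay * factor
    return delays


schedule = backoff_schedule()
print(len(schedule), "retries, last delay", schedule[-1])
```

Under these assumed parameters the schedule contains 11 retries, which shows why a receiving endpoint that recovers within a few hours will typically still get every event.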
What Happens During an Outage
If RevKeen experiences downtime, here is what you can expect:
- Checkout -- If the checkout service is unavailable, customers will see an error page. No partial charges will be created.
- Webhooks -- Events that occur during an outage are queued and delivered with retries once service is restored. You will not miss events.
- Subscriptions -- Renewal attempts that fail due to a RevKeen outage are automatically retried. No subscriptions are cancelled due to platform downtime.
- Dashboard -- The dashboard may be temporarily unavailable, but no data is lost. All transactions continue to be recorded and will appear once the dashboard is restored.
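Because queued events are redelivered with retries, a receiving endpoint may occasionally see the same event more than once. A minimal sketch of deduplicating on an event identifier follows. The `id` field name and the in-memory set are assumptions for illustration; consult the webhook reference for the actual payload shape, and use a durable store in production.

```python
from typing import Any, Callable


class WebhookDeduplicator:
    """Processes each event at most once, keyed on its identifier."""

    def __init__(self) -> None:
        self._seen: set[str] = set()  # in-memory only; assumed for the sketch

    def handle(self, event: dict[str, Any], process: Callable[[dict[str, Any]], None]) -> bool:
        """Run `process` for new events; return False for duplicates."""
        event_id = event["id"]  # hypothetical field name
        if event_id in self._seen:
            return False
        process(event)
        # Record the id only after processing succeeds, so a failed
        # attempt can be retried on the next delivery.
        self._seen.add(event_id)
        return True


received: list[str] = []
dedup = WebhookDeduplicator()
for payload in ({"id": "evt_1"}, {"id": "evt_1"}, {"id": "evt_2"}):
    dedup.handle(payload, lambda e: received.append(e["id"]))
print(received)
```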