Multi-Region AI Gateway Deployment
Global organizations need AI gateways close to their users and compliant with regional data residency laws. This guide covers topology options, geo-routing, failover strategies, and data residency enforcement.
Use this page when
- You are deploying gateways across multiple geographic regions for latency or compliance reasons
- You need to choose between single-region hosted, distributed, or fully distributed topologies
- You want to implement geo-routing (DNS-based or load-balancer), data residency enforcement, or regional failover
Primary audience
- Primary: Technical Engineers
- Secondary: AI Agents, Technical Leaders
Topology Options
Single-Region Hosted Topology
A single hosted gateway cluster and API serve all regions. This is the simplest topology to operate, but it adds latency for distant users.
All Regions
  └─→ Hosted Gateway Cluster (us-east-1)
        └─→ Keeptrusts API + Postgres
When to use: Single-region organizations, proof-of-concept, or when sub-100ms gateway latency is not required.
Configure using the scale profile in the root compose file:
docker compose --profile scale up -d --scale keeptrusts-gateway-scalable=3
This deploys the hosted gateway behind an nginx ingress for horizontal scaling within a single region.
Distributed Topology
Gateways are deployed per region, and the Keeptrusts API handles control-plane operations such as event ingestion and config sync.
US Users ──→ US Gateway Cluster ──→ US LLM Providers
EU Users ──→ EU Gateway Cluster ──→ EU LLM Providers
AP Users ──→ AP Gateway Cluster ──→ AP LLM Providers
│
└──→ Keeptrusts API (event ingestion, config sync)
When to use: Multi-region compliance, latency-sensitive workloads, or data residency requirements.
Fully Distributed
Gateways and API replicas in each region with cross-region database replication.
US: Gateway + API + Postgres (primary)
EU: Gateway + API + Postgres (read replica)
AP: Gateway + API + Postgres (read replica)
When to use: Maximum availability requirements, strict data sovereignty laws.
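The "When to use" guidance for the three topologies can be condensed into a small decision helper. This is only a sketch of the criteria listed above; the function name and its boolean inputs are illustrative and not part of any Keeptrusts tooling.

```python
def choose_topology(
    multi_region_compliance: bool = False,
    latency_sensitive: bool = False,
    data_residency: bool = False,
    max_availability: bool = False,
    strict_sovereignty: bool = False,
) -> str:
    """Map the 'When to use' criteria to a topology name (illustrative only)."""
    # The strictest requirements call for gateways, API replicas, and
    # database replication in every region.
    if max_availability or strict_sovereignty:
        return "fully-distributed"
    # Regional gateways with a shared Keeptrusts API.
    if multi_region_compliance or latency_sensitive or data_residency:
        return "distributed"
    # Default: one hosted gateway cluster serving all regions.
    return "single-region-hosted"


print(choose_topology(latency_sensitive=True))
```

The ordering encodes one design choice: the stricter requirement wins when several apply.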
Geo-Routing
DNS-Based Routing
Use Route 53 or Cloudflare for latency-based DNS routing:
# Route 53 latency-based routing
aws route53 change-resource-record-sets \
  --hosted-zone-id $ZONE_ID \
  --change-batch '{
    "Changes": [{
      "Action": "CREATE",
      "ResourceRecordSet": {
        "Name": "gateway.example.com",
        "Type": "A",
        "SetIdentifier": "us-east-1",
        "Region": "us-east-1",
        "TTL": 60,
        "ResourceRecords": [{"Value": "10.0.1.100"}]
      }
    }, {
      "Action": "CREATE",
      "ResourceRecordSet": {
        "Name": "gateway.example.com",
        "Type": "A",
        "SetIdentifier": "eu-west-1",
        "Region": "eu-west-1",
        "TTL": 60,
        "ResourceRecords": [{"Value": "10.1.1.100"}]
      }
    }]
  }'
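Writing the change batch by hand gets error-prone as regions are added. The following Python sketch generates the same JSON structure from a region-to-address mapping. The helper name `build_change_batch` is hypothetical, and the addresses are the placeholder values from the command above.

```python
import json


def build_change_batch(name: str, endpoints: dict[str, str], ttl: int = 60) -> dict:
    """Build a Route 53 change batch with one latency-based A record per region.

    `endpoints` maps an AWS region name to the gateway address for that region.
    The region name doubles as the record's SetIdentifier.
    """
    changes = []
    for region, address in sorted(endpoints.items()):
        changes.append({
            "Action": "CREATE",
            "ResourceRecordSet": {
                "Name": name,
                "Type": "A",
                "SetIdentifier": region,
                "Region": region,
                "TTL": ttl,
                "ResourceRecords": [{"Value": address}],
            },
        })
    return {"Changes": changes}


batch = build_change_batch(
    "gateway.example.com",
    {"us-east-1": "10.0.1.100", "eu-west-1": "10.1.1.100"},
)
print(json.dumps(batch, indent=2))
```

The printed JSON can be saved to a file and passed to the CLI as `--change-batch file://batch.json`.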
Kubernetes Multi-Cluster
Deploy gateways across Kubernetes clusters with a service mesh:
# Gateway deployment per region
apiVersion: apps/v1
kind: Deployment
metadata:
  name: keeptrusts-gateway
  labels:
    app: keeptrusts-gateway
    region: eu-west-1
spec:
  replicas: 3
  # apps/v1 requires a selector that matches the pod template labels
  selector:
    matchLabels:
      app: keeptrusts-gateway
  template:
    metadata:
      labels:
        app: keeptrusts-gateway
        region: eu-west-1
    spec:
      containers:
        - name: gateway
          image: keeptrusts/gateway:1.5.0
          env:
            - name: KEEPTRUSTS_REGION
              value: eu-west-1
            - name: KEEPTRUSTS_API_URL
              value: https://api.example.com
            - name: KEEPTRUSTS_GATEWAY_TOKEN
              valueFrom:
                secretKeyRef:
                  name: keeptrusts-gateway
                  key: gateway-key
Latency-Based Failover
Health-Aware Routing
Configure health checks that trigger failover when a regional gateway is degraded:
# Cloudflare load balancer
load_balancer:
  name: gateway.example.com
  fallback_pool: us-east-1
  default_pools:
    - us-east-1
    - eu-west-1
    - ap-southeast-1
  region_pools:
    WNAM: [us-east-1]
    ENAM: [us-east-1]
    WEU: [eu-west-1]
    EEU: [eu-west-1]
    OC: [ap-southeast-1]
    SEAS: [ap-southeast-1]
  monitors:
    - path: /readyz
      interval: 30
      timeout: 10
      expected_codes: "200"
      consecutive_up: 2
      consecutive_down: 3
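To make the effect of `consecutive_up` and `consecutive_down` concrete, here is a simplified Python model of the monitor and pool selection. It is an approximation for reasoning about the thresholds, not Cloudflare's implementation: the class and function names are hypothetical, and the selection order (region pools, then default pools, then the fallback pool) is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class PoolHealth:
    """Flip a pool's health only after enough consecutive contrary checks."""
    consecutive_up: int = 2
    consecutive_down: int = 3
    healthy: bool = True
    streak: int = 0  # consecutive results that disagree with `healthy`

    def observe(self, status_code: int) -> bool:
        ok = status_code == 200  # mirrors expected_codes: "200"
        if ok == self.healthy:
            self.streak = 0
            return self.healthy
        self.streak += 1
        needed = self.consecutive_up if ok else self.consecutive_down
        if self.streak >= needed:
            self.healthy = ok
            self.streak = 0
        return self.healthy


def pick_pool(region_code, region_pools, default_pools, fallback_pool, health):
    """Return the first healthy pool for a region, else the fallback pool."""
    for pool in list(region_pools.get(region_code, [])) + list(default_pools):
        if health[pool].healthy:
            return pool
    return fallback_pool


health = {p: PoolHealth() for p in ("us-east-1", "eu-west-1", "ap-southeast-1")}
for code in (503, 503, 503):  # three failed checks mark eu-west-1 as down
    health["eu-west-1"].observe(code)

region_pools = {"WEU": ["eu-west-1"], "WNAM": ["us-east-1"]}
defaults = ["us-east-1", "eu-west-1", "ap-southeast-1"]
print(pick_pool("WEU", region_pools, defaults, "us-east-1", health))
```

With a 30-second interval, three consecutive failures mean a degraded region keeps receiving traffic for roughly 90 seconds before failover, which is worth weighing against the DNS TTL.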
Cross-Region Event Delivery
When a regional gateway fails over, events must still reach the Keeptrusts API:
# Gateway config with event delivery fallback
event_delivery:
  primary: https://api.us-east-1.example.com/v1/events
  fallback: https://api.eu-west-1.example.com/v1/events
  retry:
    max_attempts: 5
    backoff: exponential
    max_delay: 60s
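The retry settings above imply a delay schedule that is easy to compute by hand. The sketch below shows one plausible reading, with several assumptions that the config does not state: a 1-second base delay that doubles on each retry, `max_attempts` counting the initial attempt, and each attempt trying the primary endpoint before the fallback. The `send` callable is a stand-in for the real HTTP call.

```python
import time
from typing import Callable, Optional, Sequence


def backoff_delays(max_attempts: int = 5, base_delay: float = 1.0,
                   max_delay: float = 60.0) -> list[float]:
    """Delays in seconds between attempts, doubling and capped at max_delay."""
    return [min(base_delay * 2 ** i, max_delay) for i in range(max_attempts - 1)]


def deliver(event: dict,
            send: Callable[[str, dict], bool],
            endpoints: Sequence[str],
            max_attempts: int = 5,
            sleep: Callable[[float], None] = time.sleep) -> Optional[str]:
    """Try each endpoint in order per attempt; return the URL that accepted."""
    delays = backoff_delays(max_attempts)
    for attempt in range(max_attempts):
        for url in endpoints:
            if send(url, event):
                return url
        if attempt < len(delays):
            sleep(delays[attempt])  # back off before the next round
    return None  # caller decides whether to buffer or drop the event


print(backoff_delays())
```

Under these assumptions, five attempts wait 1, 2, 4, and 8 seconds between rounds, so the 60-second cap only takes effect from the eighth attempt onward.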
Data Residency Compliance
Regional Policy Configurations
Enforce data residency through per-region policy configurations:
policies:
  - name: eu-data-residency
    type: data_residency
    config:
      allowed_regions:
        - eu-west-1
        - eu-central-1
      blocked_providers:
        - name: openai
          reason: US-hosted, not GDPR compliant for this data class
      required_providers:
        - name: azure-openai-eu
          region: westeurope
providers:
  targets:
    - id: azure-openai-eu
      provider:
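The policy above combines three checks: the region must be allowed, the provider must not be blocked, and the provider must be one of the required ones. The Python sketch below models that decision for illustration. It is not the gateway's evaluation logic; in particular it assumes `allowed_regions` is matched against the gateway's own region, and the class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResidencyPolicy:
    allowed_regions: frozenset
    blocked_providers: frozenset
    required_providers: frozenset

    def evaluate(self, gateway_region: str, provider: str) -> tuple[bool, str]:
        """Return (allowed, reason) for a request leaving `gateway_region`."""
        if gateway_region not in self.allowed_regions:
            return False, f"gateway region {gateway_region} is not allowed"
        if provider in self.blocked_providers:
            return False, f"provider {provider} is blocked"
        if self.required_providers and provider not in self.required_providers:
            return False, f"provider {provider} is not a required provider"
        return True, "ok"


eu_policy = ResidencyPolicy(
    allowed_regions=frozenset({"eu-west-1", "eu-central-1"}),
    blocked_providers=frozenset({"openai"}),
    required_providers=frozenset({"azure-openai-eu"}),
)

print(eu_policy.evaluate("eu-west-1", "azure-openai-eu"))
print(eu_policy.evaluate("eu-west-1", "openai"))
```

Returning a reason alongside the decision keeps denied requests explainable in audit events.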
Provider Routing by Region
The hosted gateway resolves provider targets based on the normalized alias of target.provider. Configure region-specific provider mappings:
pack:
  name: multi-region-providers-5
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: openai-us
      provider:
    - id: openai-eu
      provider:
policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
PII Handling by Jurisdiction
Apply region-specific PII redaction policies:
policies:
  - name: gdpr-pii-redaction
    type: pii_redaction
    config:
      regions: [eu-west-1, eu-central-1]
      redact:
        - email
        - phone
        - national_id
        - ip_address
      action: redact_before_send
  - name: ccpa-pii-redaction
    type: pii_redaction
    config:
      regions: [us-west-2]
      redact:
        - social_security_number
        - financial_account
      action: redact_before_send
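As a rough illustration of what `redact_before_send` means per region, the Python sketch below applies region-scoped redaction to a prompt before it leaves the gateway. Everything in it is an assumption for demonstration: the regular expressions are deliberately simplistic, only three of the listed PII types are covered, and the `[REDACTED:...]` placeholder format is invented here.

```python
import re

# Deliberately simple patterns; production detectors need far more care.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ip_address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "social_security_number": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

POLICIES = [
    {"name": "gdpr-pii-redaction",
     "regions": {"eu-west-1", "eu-central-1"},
     "redact": ["email", "ip_address"]},
    {"name": "ccpa-pii-redaction",
     "regions": {"us-west-2"},
     "redact": ["social_security_number"]},
]


def redact_for_region(text: str, region: str) -> str:
    """Apply every redaction policy whose region list includes `region`."""
    for policy in POLICIES:
        if region not in policy["regions"]:
            continue
        for kind in policy["redact"]:
            pattern = PATTERNS.get(kind)
            if pattern is not None:
                text = pattern.sub(f"[REDACTED:{kind}]", text)
    return text


print(redact_for_region("mail bob@example.com from 192.168.1.5", "eu-west-1"))
```

A region that matches no policy passes text through unchanged, so it is worth confirming that every deployed gateway region appears in at least one policy.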
Git-Based Regional Config Sync
Use a Git-linked repository with one directory per region on a shared branch:
keeptrusts-config/
├── main/ # Shared base config
│ └── base-policy.yaml
├── regions/
│ ├── us-east-1/
│ │ └── policy-config.yaml
│ ├── eu-west-1/
│ │ └── policy-config.yaml
│ └── ap-southeast-1/
│ └── policy-config.yaml
Link each regional gateway to its config path:
curl -X POST https://api.example.com/v1/git-repos \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://github.com/org/keeptrusts-config.git",
    "path": "regions/eu-west-1/policy-config.yaml",
    "branch": "main",
    "auto_create_configuration": true,
    "poll_interval_seconds": 300
  }'
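Since the request body differs between regions only in its `path`, the payloads can be generated rather than copied. A minimal Python sketch follows; the helper name is hypothetical, and the field names and values are taken from the request above. It only builds and prints the JSON bodies and does not call the API.

```python
import json

REGIONS = ["us-east-1", "eu-west-1", "ap-southeast-1"]


def git_repo_payload(region: str,
                     repo_url: str = "https://github.com/org/keeptrusts-config.git",
                     branch: str = "main",
                     poll_interval_seconds: int = 300) -> dict:
    """Build the request body that links one region's config path."""
    return {
        "url": repo_url,
        "path": f"regions/{region}/policy-config.yaml",
        "branch": branch,
        "auto_create_configuration": True,
        "poll_interval_seconds": poll_interval_seconds,
    }


for region in REGIONS:
    print(json.dumps(git_repo_payload(region)))
```

Each printed line can be used as the `-d` body of the `curl` call for the matching region.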
Monitoring Multi-Region Deployments
Add region labels to all metrics:
# Per-region request rate
sum(rate(keeptrusts_gateway_requests_total[5m])) by (region)
# Cross-region latency comparison
histogram_quantile(0.99,
sum(rate(keeptrusts_gateway_request_duration_seconds_bucket[5m])) by (le, region)
)
# Regional failover events
sum by (source_region, target_region) (increase(keeptrusts_gateway_failover_total[1h]))
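For readers unfamiliar with how a p99 is derived from histogram buckets, the Python sketch below approximates the linear interpolation that `histogram_quantile` performs within a bucket. It is a simplified model, not Prometheus's exact implementation, and the bucket counts are made-up sample data.

```python
import math


def histogram_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate quantile `q` from cumulative (upper_bound, count) buckets.

    Buckets must be sorted by upper bound and end with a +Inf bucket.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # The quantile lies beyond the last finite bucket.
                return lower_bound
            if count == lower_count:
                return upper_bound
            # Interpolate linearly inside the bucket that contains the rank.
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return math.nan


# Hypothetical cumulative bucket counts for one region (seconds, count).
eu_buckets = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
print(round(histogram_quantile(0.99, eu_buckets), 3))
```

Because the estimate is interpolated, its accuracy depends on bucket boundaries: a p99 that lands in a wide bucket can be off by a large fraction of that bucket's width.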
Next steps
- Configure Compliance Infrastructure for regional audit requirements
- Set up Disaster Recovery for cross-region failover procedures
- Review Upgrade Procedures for staged regional rollouts
For AI systems
- Canonical terms: multi-region, geo-routing, latency-based failover, data residency, single-region hosted topology, distributed topology, fully distributed, read replica
- Topology options: single-region hosted (single hosted gateway cluster + API), distributed (regional gateways + Keeptrusts API), fully distributed (regional gateways + API replicas + DB replication)
- Key config: docker compose --profile scale up -d --scale keeptrusts-gateway-scalable=3, Route 53 latency-based routing, Cloudflare geo-routing
- Metrics: keeptrusts_gateway_requests_total by region, keeptrusts_gateway_failover_total
- Related pages: Compliance Infrastructure, Disaster Recovery, Upgrade Procedures
For engineers
- Start with a single-region hosted topology for simplicity; move to distributed when latency or data residency requires regional gateways
- In distributed topology, gateways in each region forward events to the Keeptrusts API for unified audit
- Use Route 53 latency-based routing or Cloudflare DNS for automatic geo-routing to the nearest gateway
- For data residency, configure per-region gateways to use only local LLM provider endpoints
- Fully distributed requires cross-region PostgreSQL replication (read replicas in secondary regions)
- Validate: measure p99 latency per region and confirm events appear in the Keeptrusts API from all regions
For leaders
- Multi-region deployment helps satisfy data residency laws (GDPR, regional sovereignty) by keeping traffic and provider calls within the required borders
- Distributed topology trades operational complexity for lower user-facing latency and compliance coverage
- Regional failover provides continuity if a single region experiences an outage
- Staged regional rollouts reduce blast radius during upgrades (see Upgrade Procedures)
- Single-region hosted vs. distributed is a cost/compliance trade-off — start simple and distribute only when requirements demand it