Cloud Architect Guide: Multi-Cloud AI Governance
As a Cloud Architect designing AI infrastructure, you need to abstract provider dependencies, enforce data residency requirements, design failover architectures, and optimize costs across multiple clouds and LLM providers. Keeptrusts serves as the governance layer that sits between your applications and LLM providers, providing a single control point regardless of deployment topology.
Use this page when
- You are designing multi-cloud or multi-region AI gateway deployments
- You need to abstract LLM provider dependencies behind a unified control point
- You are enforcing data residency requirements across jurisdictions
- You are planning disaster recovery and failover for AI infrastructure
- You need to optimize AI costs across multiple providers and regions
Primary audience
- Primary: Technical Engineers (Cloud Architects, Infrastructure Architects)
- Secondary: DevOps Engineers, Platform Engineers, CTOs
Provider Abstraction Layer
Unified Gateway Architecture
Keeptrusts gateways abstract the underlying LLM provider, giving applications a single endpoint regardless of which models or providers are in use:
```
Application → Keeptrusts Gateway → Provider A (primary)
                                 → Provider B (failover)
                                 → Provider C (cost-optimized)
```
Multi-Provider Configuration
```yaml
providers:
  targets:
    - id: openai
      provider:
        secret_key_ref:
          env: OPENAI_API_KEY
    - id: anthropic
      provider:
        secret_key_ref:
          env: ANTHROPIC_API_KEY
    - id: azure-openai
      provider:
        base_url: https://your-instance.openai.azure.com
        secret_key_ref:
          env: AZURE_OPENAI_API_KEY
policies:
  - name: provider-governance
    type: content-filter
    categories:
      - harmful
      - biased
    action: block
    enabled: true
  - name: cost-control
    type: cost_limit
    monthly_limit: 10000
    action: block
    enabled: true
```
Applications call the gateway at a single endpoint. The gateway handles provider selection, policy enforcement, and event logging transparently.
```shell
# Deploy the multi-provider gateway
kt gateway run \
  --config multi-provider-policy.yaml \
  --port 41002

# Verify provider connectivity
kt doctor
```
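Once the gateway is running, an application-side call looks the same as a direct provider call, just pointed at the gateway. The sketch below assumes an OpenAI-compatible chat endpoint path and a gateway-key bearer header; the hostname, path, and KT_GATEWAY_KEY variable are illustrative, not confirmed API surface:

```shell
# Build a request payload (model and message are placeholders)
cat > /tmp/gateway-request.json <<'EOF'
{
  "model": "gpt-4o",
  "messages": [{"role": "user", "content": "Summarize our incident report."}]
}
EOF

# Send it to the Keeptrusts gateway instead of the provider directly.
# Endpoint path and auth header are assumptions for illustration.
curl -s http://localhost:41002/v1/chat/completions \
  -H "Authorization: Bearer $KT_GATEWAY_KEY" \
  -H "Content-Type: application/json" \
  -d @/tmp/gateway-request.json \
  || echo "gateway unreachable (expected outside a deployed environment)"
```

Because only the base URL and key change, swapping or failing over providers requires no application code changes.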
Deployment Topologies
Topology 1: Centralized Gateway
Best for organizations with a single cloud region and centralized governance:
```
┌─────────────────────────────────────────────┐
│ Cloud Region (e.g., eu-west-1)              │
│                                             │
│ App A ──┐                                   │
│ App B ──┼── Keeptrusts Gateway ── LLM APIs  │
│ App C ──┘        │                          │
│                  └── Control-Plane API      │
└─────────────────────────────────────────────┘
```
Topology 2: Distributed Edge Gateways
Best for multi-region or latency-sensitive deployments:
```
┌──────────────────┐      ┌──────────────────┐
│ Region: US-East  │      │ Region: EU-West  │
│ App A ── GW A ──┤       │ App C ── GW C ──┤
│ App B ── GW B ──┤       │ App D ── GW D ──┤
└──────┬───────────┘      └──────┬──────────┘
       │                         │
       └──── Control-Plane API ──┘
             (centralized)
```
Each region runs its own gateway instance (kt gateway run) with region-specific configuration, but all gateways report events to the central API. Manage all gateways from the Console Dashboard.
Topology 3: Kubernetes-Native
Deploy gateways as Kubernetes services alongside your applications:
```shell
# Validate the gateway configuration
kt policy lint --file k8s-gateway-policy.yaml

# Verify connectivity from within the cluster
kt doctor
```
The gateway runs as a sidecar or dedicated service, with policy configurations managed through ConfigMaps or Git-linked configurations synced via the Keeptrusts API.
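A dedicated-service deployment can be sketched as a standard Kubernetes Deployment with the policy file mounted from a ConfigMap. The image name, entrypoint arguments, and mount paths below are assumptions; substitute your actual registry and install layout:

```yaml
# Sketch only: image name and paths are placeholders, not a published artifact.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kt-gateway
spec:
  replicas: 2
  selector:
    matchLabels:
      app: kt-gateway
  template:
    metadata:
      labels:
        app: kt-gateway
    spec:
      containers:
        - name: gateway
          image: your-registry/keeptrusts-gateway:latest  # assumed image
          args: ["gateway", "run", "--config", "/etc/kt/policy.yaml", "--port", "41002"]
          ports:
            - containerPort: 41002
          volumeMounts:
            - name: policy-config
              mountPath: /etc/kt
      volumes:
        - name: policy-config
          configMap:
            name: kt-gateway-policy
```

Updating the ConfigMap (or the Git-linked source behind it) rolls policy changes out without touching application workloads.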
Data Residency Controls
Enforcing Regional Data Boundaries
For organizations with data sovereignty requirements, deploy region-specific gateways with provider configurations that ensure data stays within jurisdiction:
```yaml
providers:
  targets:
    - id: azure-openai-eu
      provider:
        base_url: https://eu-instance.openai.azure.com
        secret_key_ref:
          env: AZURE_OPENAI_EU_KEY
policies:
  - name: eu-pii-protection
    type: pii-detector
    action: redact
    entity_types:
      - name
      - email
      - phone
      - national_id
    enabled: true
  - name: eu-dlp-controls
    type: dlp-filter
    patterns:
      - name: eu-personal-data
        regex: '(IBAN|passport\s+number)'
    action: block
    enabled: true
```
Regional Gateway Mapping
| Region | Gateway | Provider | Data Residency |
|---|---|---|---|
| EU (Frankfurt) | gw-eu-west | Azure OpenAI EU | EU only |
| US (Virginia) | gw-us-east | OpenAI, Anthropic | US only |
| APAC (Singapore) | gw-apac | Azure OpenAI APAC | APAC only |
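Client-side routing against this mapping can be as simple as a lookup from deployment region to gateway endpoint. The hostnames below are hypothetical placeholders for the mapping table above:

```shell
# Illustrative region-to-gateway lookup; hostnames are placeholders.
gateway_for_region() {
  case "$1" in
    eu-west-*)      echo "https://gw-eu-west.internal:41002" ;;
    us-east-*)      echo "https://gw-us-east.internal:41002" ;;
    ap-southeast-*) echo "https://gw-apac.internal:41002" ;;
    *)              echo "unknown region: $1" >&2; return 1 ;;
  esac
}

gateway_for_region "eu-west-1"
```

Keeping this mapping in deployment configuration (rather than application code) means residency boundaries are enforced by where the workload runs, not by per-request logic.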
Network Topology
Gateway Network Position
The Keeptrusts gateway should be positioned in the network path between applications and external LLM APIs:
```
Internal Network                   │  External
                                   │
App → Internal LB → KT Gateway ────┼──→ LLM Provider APIs
         │                         │
         └── Control-Plane API     │
```
Network Requirements
| Component | Ports | Protocol | Direction |
|---|---|---|---|
| Gateway (inbound) | 41002 | HTTPS | Apps → Gateway |
| Gateway (outbound) | 443 | HTTPS | Gateway → LLM APIs |
| Control-Plane API | 8080 | HTTPS | Gateway → API |
| Console | 3000 | HTTPS | Browser → Console |
Security Considerations
- Gateway keys (kt_gk_...) authenticate application traffic to the gateway
- Bearer tokens authenticate gateway-to-API communication
- All external traffic should use TLS
- Network policies should restrict gateway egress to approved LLM provider endpoints only
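In Kubernetes, the egress restriction can be expressed as a NetworkPolicy. The sketch below assumes the gateway pods carry an `app: kt-gateway` label; the wide-open CIDR is a placeholder you would narrow to approved provider IP ranges (or replace with a DNS-aware egress controller, since provider IPs change):

```yaml
# Sketch only: label selector and CIDR are illustrative.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: kt-gateway-egress
spec:
  podSelector:
    matchLabels:
      app: kt-gateway
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0   # replace with approved provider ranges
      ports:
        - protocol: TCP
          port: 443
```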
Cost Optimization
Multi-Provider Cost Strategy
Route requests to cost-effective providers based on use case:
```yaml
policies:
  - name: cost-optimization
    type: cost_limit
    monthly_limit: 15000
    action: block
    enabled: true
```
Monitoring Spend Across Providers
Use the Console Cost Center to track spend across all providers and teams:
```shell
# Pull cost breakdown by provider
curl -H "Authorization: Bearer $API_TOKEN" \
  "https://api.keeptrusts.com/v1/events?since=30d&group_by=provider"

# Export cost data for FinOps analysis
kt export create \
  --type events \
  --format csv \
  --since 30d \
  --description "Monthly cloud cost analysis"
```
Cost Optimization Checklist
- Per-team budget caps configured in policy
- Cost Center dashboards reviewed weekly
- Model usage patterns analyzed for right-sizing
- Unused gateway keys identified and rotated
- Provider pricing changes tracked and configurations updated
Disaster Recovery
Gateway High Availability
Deploy multiple gateway instances behind a load balancer:
```
LB → Gateway Instance 1 (active)
   → Gateway Instance 2 (active)
   → Gateway Instance 3 (standby)
```
Each instance runs the same configuration. If one fails, the load balancer routes traffic to healthy instances.
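One way to realize this topology is an NGINX load balancer with passive health checks and a standby upstream. The hostnames are placeholders, and TLS certificate directives are omitted for brevity; this is a sketch, not a hardened configuration:

```nginx
# Sketch: hostnames are placeholders; ssl_certificate directives omitted.
upstream kt_gateway {
    server gw-1.internal:41002 max_fails=2 fail_timeout=10s;
    server gw-2.internal:41002 max_fails=2 fail_timeout=10s;
    server gw-3.internal:41002 backup;   # standby, used only when actives fail
}

server {
    listen 443 ssl;
    location / {
        proxy_pass http://kt_gateway;
    }
}
```

With `max_fails`/`fail_timeout`, a failing instance is taken out of rotation automatically and re-tried after the timeout, matching the auto-recovery behavior in the failover table.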
Failover Strategy
| Failure Mode | Impact | Recovery |
|---|---|---|
| Single gateway instance down | Minimal — LB routes to healthy instances | Auto-recovery via health checks |
| Primary LLM provider outage | Service degradation | Failover to secondary provider |
| Control-plane API unavailable | No new config changes; gateways continue with cached config | API redundancy or manual config |
| Regional outage | Full region loss | Cross-region gateway failover |
DR Testing
# Verify gateway health
kt doctor
# Test provider failover by validating backup config
kt policy lint --file dr-failover-policy.yaml
Infrastructure as Code
Git-Linked Configuration
Store gateway configurations in Git and sync automatically through the Keeptrusts API:
- Store policy YAML in your infrastructure repository
- Link the repository in Console Settings
- Changes merged to the main branch are automatically synced to gateways
```shell
# Validate configuration before committing
kt policy lint --file policy-config.yaml
```
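The same lint step can gate merges in CI so invalid configurations never reach the synced branch. The sketch below assumes GitHub Actions, a `policies/` directory layout, and a local install script for the kt CLI; all three are placeholders for your own setup:

```yaml
# Sketch only: paths and the install step are placeholders.
name: policy-lint
on:
  pull_request:
    paths:
      - "policies/**.yaml"
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install kt CLI
        run: ./scripts/install-kt.sh   # placeholder for your install method
      - name: Lint policy configs
        run: |
          for f in policies/*.yaml; do
            kt policy lint --file "$f"
          done
```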
Cloud Architect Workflow with Keeptrusts
| Task | Frequency | Tool |
|---|---|---|
| Review gateway topology and health | Weekly | Console Dashboard + kt doctor |
| Analyze cost distribution | Weekly | Console Cost Center |
| Validate data residency controls | Monthly | Regional gateway audit |
| DR failover testing | Quarterly | Failover simulation |
| Provider configuration review | Quarterly | kt policy lint |
| Capacity planning | Quarterly | Event volume trends |
Success Metrics for Cloud Architecture
| Metric | Target | Source |
|---|---|---|
| Gateway availability | 99.9% uptime | Health check monitoring |
| Provider failover time | < 30 seconds | Failover event logs |
| Data residency compliance | 100% of requests in-region | Event logs by gateway region |
| Cost efficiency | Within 5% of budget | Console Usage |
| Configuration drift | Zero unmanaged gateways | Configuration audit |
For AI systems
- Canonical terms: Keeptrusts, multi-cloud AI governance, provider abstraction, data residency, gateway topology, disaster recovery
- Key surfaces: Console Dashboard, Console Settings (Git-linked repos), Console Usage, Events API
- Commands: kt gateway run, kt policy lint, kt doctor, kt export create
- Config concepts: multi-provider providers block with secret_key_ref, priority, base_url; regional gateway mapping; cost_limit policy; Kubernetes deployments; Git-linked configuration sync
- Topologies: Centralized Gateway, Distributed Edge Gateways, Kubernetes-Native
- Best next pages: DevOps Guide, Platform Engineer Guide, Architecture Overview, Gateway Configuration
For engineers
- Deploy multi-provider gateway: kt gateway run --listen 0.0.0.0:41002 --policy-config multi-provider-policy.yaml
- Validate configs per region: kt policy lint --file eu-gateway-policy.yaml
- Verify connectivity: kt doctor
- Track per-provider spend: GET /v1/events?since=30d&group_by=provider
- Use Git-linked configurations in Console Settings for infrastructure-as-code policy management
- Deploy multiple gateway instances behind a load balancer for HA; each instance runs the same config
For leaders
- Provider abstraction through the gateway prevents vendor lock-in and enables competitive pricing negotiations across OpenAI, Anthropic, and Azure OpenAI
- Data residency is enforced architecturally by deploying region-specific gateways with provider configurations that constrain data to specific jurisdictions
- Multi-region gateway deployments provide disaster recovery with automatic failover to healthy instances
- Console Usage gives FinOps visibility across all providers and regions for informed budget allocation
Next steps
- Deploy gateway infrastructure: DevOps Guide
- Build multi-tenant platform: Platform Engineer Guide
- Review architecture patterns: Architecture Overview
- Configure providers: Gateway Configuration