Architecture Patterns for AI-Governed Systems
Choosing the right integration pattern determines how much governance coverage you get, how latency behaves, and how independently teams can operate. This guide covers the three primary deployment patterns for the Keeptrusts gateway and when to use each.
Use this page when
- You are choosing between sidecar, reverse-proxy, or SDK integration for the Keeptrusts gateway
- You need to decide between centralized and distributed gateway topologies
- You are designing a multi-region AI governance deployment
- You want to understand the trade-offs of each pattern for latency, bypass risk, and operational complexity
Primary audience
- Primary: Technical Engineers
- Secondary: AI Agents, Technical Leaders
Integration Patterns
Sidecar Pattern
The gateway runs as a co-located process alongside each application instance. Every outbound LLM call routes through localhost:
Configuration:
```yaml
gateway:
  listen_port: 41002
  mode: local
providers:
  targets:
    - id: openai
      provider:
        base_url: https://api.openai.com/v1
      secret_key_ref:
        env: OPENAI_API_KEY
policies:
  - name: pii-redaction
    type: output_filter
    action: redact
    patterns:
      - email
      - phone
      - ssn
```
When to use:
- Microservice architectures with per-pod governance
- Teams that own their own policy configs
- Low-latency requirements (no network hop)
- Kubernetes deployments with sidecar injection
Trade-offs:
- Each pod runs its own gateway process (memory overhead ~30 MB)
- Policy updates require pod restarts or config-reload signals
- Event fan-out increases load on the control-plane API
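On Kubernetes, the sidecar variant is usually expressed as a second container in the pod spec. A minimal sketch, assuming a hypothetical gateway image name, config mount path, and ConfigMap (not an official manifest):

```yaml
# Hypothetical pod spec excerpt: app container plus a Keeptrusts gateway sidecar.
# Image names, the config mount path, and the ConfigMap name are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: orders-service
spec:
  containers:
    - name: app
      image: example.com/orders-service:1.4.0
      env:
        # The app sends all LLM traffic to the co-located gateway.
        - name: LLM_BASE_URL
          value: http://127.0.0.1:41002
    - name: keeptrusts-gateway
      image: example.com/keeptrusts-gateway:latest   # assumed image name
      ports:
        - containerPort: 41002
      volumeMounts:
        - name: gateway-config
          mountPath: /etc/keeptrusts
  volumes:
    - name: gateway-config
      configMap:
        name: orders-gateway-config
```

With sidecar injection, this second container is typically added automatically at admission time rather than written by hand in every pod spec.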
Reverse-Proxy Pattern
A shared gateway fleet sits between all applications and upstream LLM providers:
Configuration:
```yaml
gateway:
  listen_port: 41002
providers:
  targets:
    - id: openai
      provider:
        base_url: https://api.openai.com/v1
      secret_key_ref:
        store: OPENAI_API_KEY
    - id: anthropic
      provider:
        base_url: https://api.anthropic.com/v1
      secret_key_ref:
        store: ANTHROPIC_API_KEY
policies:
  - name: global-safety
    type: content_filter
    action: block
    categories:
      - hate
      - violence
      - self_harm
```
When to use:
- Centralized governance across multiple teams
- Shared provider credentials managed via config variables
- Uniform policy enforcement regardless of calling service
- Simplified key rotation (one place to update)
Trade-offs:
- Added network hop (typically 2–5 ms within the same region)
- Single point requiring high availability
- All traffic funnels through the gateway fleet
SDK Wrapper Pattern
Applications use the Keeptrusts SDK, which wraps the standard LLM client and applies policies client-side before forwarding to the gateway or directly to providers:
When to use:
- Lightweight integration for prototyping
- Client-side input validation before network calls
- Applications that need synchronous policy feedback in-process
Trade-offs:
- Policies are enforced at the application layer, not the network layer
- Bypassing the SDK bypasses governance
- Requires SDK updates when policies change
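The wrapper idea can be sketched in a few lines: a policy runs in-process on the prompt before anything leaves the application. The `redact_pii` policy and the `GovernedClient` API below are hypothetical illustrations, not the real Keeptrusts SDK interface.

```python
import re

# Minimal sketch of the SDK-wrapper pattern: apply a client-side policy to
# the prompt before it is forwarded to the gateway or provider. The policy
# and wrapper API are assumptions for illustration only.

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def redact_pii(text: str) -> str:
    """Client-side filter: mask email addresses before forwarding."""
    return EMAIL.sub("[REDACTED_EMAIL]", text)

class GovernedClient:
    def __init__(self, send, policies=(redact_pii,)):
        self._send = send          # underlying LLM client call
        self._policies = policies  # applied in order, in-process

    def complete(self, prompt: str) -> str:
        for policy in self._policies:
            prompt = policy(prompt)  # synchronous policy feedback
        return self._send(prompt)

# Usage: wrap any callable that talks to the provider.
client = GovernedClient(send=lambda p: f"echo: {p}")
print(client.complete("Contact alice@example.com for access"))
```

Note the bypass risk is visible in the sketch itself: nothing stops code from calling `send` directly, which is why this pattern scores "High" on bypass risk below.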
Centralized vs Distributed Topologies
Centralized Gateway
A single fleet in one region handles all traffic. The control-plane API manages configuration, and gateways fetch config on startup or reload.
```shell
# Start a hosted gateway
export KEEPTRUSTS_API_URL="https://api.keeptrusts.com"
export KEEPTRUSTS_GATEWAY_TOKEN="$KT_GATEWAY_KEY"
kt gateway run \
  --listen 0.0.0.0:41002
```
Best for: Single-region deployments, smaller organizations, simpler operational footprint.
Distributed Gateway
Multiple gateway fleets in different regions, all reporting to the same control-plane API. Policies are synchronized via config reload.
Best for: Multi-region deployments, data residency requirements, latency-sensitive global applications.
Decision Matrix
| Factor | Sidecar | Reverse Proxy | SDK |
|---|---|---|---|
| Network hop | None (localhost) | 1 extra hop | None |
| Bypass risk | Low (pod-level) | Low (network-level) | High |
| Memory overhead | Per-pod | Shared fleet | In-process |
| Policy consistency | Per-pod config | Fleet-wide | Per-app |
| Key management | Per-pod env | Centralized config vars | Per-app |
| Operational complexity | Medium | Low | Low |
Hybrid Approaches
Most production deployments combine patterns:
- Internal services route through the centralized gateway for uniform policy and shared credentials
- Edge or latency-critical services run sidecar gateways for local enforcement
- All gateways report events to the same control-plane API for unified observability
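The hybrid routing decision reduces to a small lookup: latency-critical services resolve to the local sidecar, everything else to the shared fleet. A sketch, where the hostnames and the set of latency-critical services are assumptions:

```python
# Hybrid routing sketch: latency-critical services use the local sidecar,
# all other services use the shared reverse-proxy fleet. The fleet hostname
# and service classification below are illustrative assumptions.

SIDECAR_URL = "http://127.0.0.1:41002"
CENTRAL_URL = "http://gateway.internal:41002"  # hypothetical fleet address

LATENCY_CRITICAL = {"edge-ranker", "realtime-chat"}

def gateway_base_url(service: str) -> str:
    """Return the gateway endpoint a service should send LLM traffic to."""
    if service in LATENCY_CRITICAL:
        return SIDECAR_URL   # local enforcement, no network hop
    return CENTRAL_URL       # fleet-wide policy and shared credentials

print(gateway_base_url("edge-ranker"))  # sidecar
print(gateway_base_url("billing"))      # central fleet
```

In practice this lookup usually lives in deployment configuration rather than application code, so a service can be reclassified without a redeploy.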
Configuration Patterns
Environment-Based Provider Selection
```yaml
pack:
  name: architecture-patterns-providers-3
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: openai
      provider:
        base_url: "${OPENAI_BASE_URL:-https://api.openai.com/v1}"
      secret_key_ref:
        env: OPENAI_API_KEY
    - id: azure-openai
      provider:
        base_url: "${AZURE_OPENAI_ENDPOINT}/openai/deployments"
      secret_key_ref:
        env: AZURE_OPENAI_KEY
policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
```
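The `${VAR}` and `${VAR:-default}` placeholders above follow shell-style expansion: use the variable when set and non-empty, otherwise fall back to the default after `:-`. A minimal resolver for that syntax looks like this (that the gateway's interpolation matches these exact semantics is an assumption):

```python
import os
import re

# Minimal resolver for shell-style "${VAR}" and "${VAR:-default}" placeholders,
# mirroring the interpolation used in the provider config above.

_PLACEHOLDER = re.compile(r"\$\{(\w+)(?::-([^}]*))?\}")

def interpolate(value: str, env=os.environ) -> str:
    def repl(match):
        name, default = match.group(1), match.group(2)
        if name in env and env[name]:
            return env[name]            # variable set and non-empty
        if default is not None:
            return default              # fall back to the ":-" default
        raise KeyError(f"required variable {name} is not set")
    return _PLACEHOLDER.sub(repl, value)

# With OPENAI_BASE_URL unset, the default after ':-' is used.
print(interpolate("${OPENAI_BASE_URL:-https://api.openai.com/v1}", env={}))
```

Note that `${AZURE_OPENAI_ENDPOINT}` has no default, so under these semantics an unset variable is a startup error rather than a silently empty URL.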
Git-Synced Configuration
```yaml
# Managed via console → Configurations
# The API polls the linked repo and pushes reloads to gateways
gateway:
  config_reload: true
  config_poll_interval: 60s
```
Next steps
- System Design: Integrating the AI Gateway — request flow and latency budgets
- Resilience Engineering for AI Services — failover and circuit breakers
- Security Engineering for AI Pipelines — TLS, mTLS, and RBAC
For AI systems
- Canonical terms: sidecar pattern, reverse-proxy pattern, SDK wrapper, hosted gateway, distributed gateway, config-reload, git-linked repos, policy-config.yaml
- Key configuration: `providers[].secret_key_ref`, `gateway.config_reload`, `config_poll_interval`
- Best next pages: System Design: Integrating the AI Gateway, Microservices Architecture with AI Gateway, Resilience Engineering
For engineers
- Sidecar: ~30 MB memory per pod, localhost access (no network hop), requires pod restart or config-reload for policy updates
- Reverse proxy: 2–5 ms added network hop within same region, fleet-wide policy consistency, centralized key management via config variables
- SDK: No network overhead but high bypass risk — policies enforced at application layer only
- Start with: `export KEEPTRUSTS_API_URL=<API_URL>` and `export KEEPTRUSTS_GATEWAY_TOKEN=$KT_GATEWAY_KEY`, then run `kt gateway run --listen 0.0.0.0:41002`
For leaders
- Sidecar pattern gives teams autonomy over their own policies but increases operational surface — best for mature platform teams
- Centralized reverse-proxy reduces operational footprint and ensures uniform enforcement — best for organizations starting their governance journey
- Hybrid approaches (centralized for internal services, sidecar for latency-critical edge) are the most common production pattern