Architecture Patterns for AI-Governed Systems

Choosing the right integration pattern determines how much governance coverage you get, how latency behaves, and how independently teams can operate. This guide covers the three primary deployment patterns for the Keeptrusts gateway and when to use each.

Use this page when

  • You are choosing between sidecar, reverse-proxy, or SDK integration for the Keeptrusts gateway
  • You need to decide between centralized and distributed gateway topologies
  • You are designing a multi-region AI governance deployment
  • You want to understand the trade-offs of each pattern for latency, bypass risk, and operational complexity

Primary audience

  • Primary: Technical Engineers
  • Secondary: AI Agents, Technical Leaders

Integration Patterns

Sidecar Pattern

The gateway runs as a co-located process alongside each application instance. Every outbound LLM call routes through localhost:

Configuration:

gateway:
  listen_port: 41002
  mode: local
providers:
  targets:
    - id: openai
      provider:
        base_url: https://api.openai.com/v1
        secret_key_ref:
          env: OPENAI_API_KEY
policies:
  - name: pii-redaction
    type: output_filter
    action: redact
    patterns:
      - email
      - phone
      - ssn

When to use:

  • Microservice architectures with per-pod governance
  • Teams that own their own policy configs
  • Low-latency requirements (no network hop)
  • Kubernetes deployments with sidecar injection

Trade-offs:

  • Each pod runs its own gateway process (memory overhead ~30 MB)
  • Policy updates require pod restarts or config-reload signals
  • Event fan-out increases load on the control-plane API
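In Kubernetes, the pattern above maps to a gateway container running next to the application container in the same pod. A minimal sketch, assuming an OpenAI-compatible client that honors `OPENAI_BASE_URL`; the image names, container names, and mount path here are illustrative, not official artifacts:

```yaml
# Illustrative sidecar layout; image names and the config mount path are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-gateway
spec:
  containers:
    - name: app
      image: my-app:latest                 # your application image
      env:
        - name: OPENAI_BASE_URL            # point the LLM client at the sidecar
          value: "http://localhost:41002/v1"
    - name: keeptrusts-gateway
      image: keeptrusts/gateway:latest     # assumed image name
      ports:
        - containerPort: 41002
      volumeMounts:
        - name: gateway-config             # the gateway config shown above
          mountPath: /etc/keeptrusts
  volumes:
    - name: gateway-config
      configMap:
        name: gateway-config
```

Because the application talks to `localhost`, no cluster networking sits between the client and the policy enforcement point.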

Reverse-Proxy Pattern

A shared gateway fleet sits between all applications and upstream LLM providers:

Configuration:

gateway:
  listen_port: 41002
providers:
  targets:
    - id: openai
      provider:
        base_url: https://api.openai.com/v1
        secret_key_ref:
          store: OPENAI_API_KEY
    - id: anthropic
      provider:
        base_url: https://api.anthropic.com/v1
        secret_key_ref:
          store: ANTHROPIC_API_KEY
policies:
  - name: global-safety
    type: content_filter
    action: block
    categories:
      - hate
      - violence
      - self_harm

When to use:

  • Centralized governance across multiple teams
  • Shared provider credentials managed via config variables
  • Uniform policy enforcement regardless of calling service
  • Simplified key rotation (one place to update)

Trade-offs:

  • Added network hop (typically 2–5 ms within the same region)
  • The fleet is a single point of failure and must be deployed for high availability
  • All traffic funnels through the gateway fleet

SDK Wrapper Pattern

Applications use the Keeptrusts SDK, which wraps the standard LLM client and applies policies client-side before forwarding to the gateway or directly to providers.

When to use:

  • Lightweight integration for prototyping
  • Client-side input validation before network calls
  • Applications that need synchronous policy feedback in-process

Trade-offs:

  • Policies are enforced at the application layer, not the network layer
  • Bypassing the SDK bypasses governance
  • Requires SDK updates when policies change

Centralized vs Distributed Topologies

Centralized Gateway

A single fleet in one region handles all traffic. The control-plane API manages configuration, and gateways fetch config on startup or reload.

# Start a hosted gateway
export KEEPTRUSTS_API_URL="https://api.keeptrusts.com"
export KEEPTRUSTS_GATEWAY_TOKEN="$KT_GATEWAY_KEY"

kt gateway run \
  --listen 0.0.0.0:41002

Best for: Single-region deployments, smaller organizations, simpler operational footprint.

Distributed Gateway

Multiple gateway fleets in different regions, all reporting to the same control-plane API. Policies are synchronized via config reload.

Best for: Multi-region deployments, data residency requirements, latency-sensitive global applications.
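Each regional fleet reuses the same reload settings documented below under Git-Synced Configuration; a sketch of a per-region gateway config, where the `region` and `data_residency` keys are hypothetical and shown only to illustrate per-fleet overrides under a shared control plane:

```yaml
# Sketch only: region and data_residency are assumed keys, not documented ones.
gateway:
  listen_port: 41002
  region: eu-west-1            # this fleet serves EU traffic
  data_residency: eu           # keep request payloads in-region
  config_reload: true          # pick up policy syncs from the control plane
  config_poll_interval: 60s
```

All fleets report events to the same control-plane API, so observability stays unified even though enforcement is regional.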

Decision Matrix

| Factor | Sidecar | Reverse Proxy | SDK |
| --- | --- | --- | --- |
| Network hop | None (localhost) | 1 extra hop | None |
| Bypass risk | Low (pod-level) | Low (network-level) | High |
| Memory overhead | Per-pod | Shared fleet | In-process |
| Policy consistency | Per-pod config | Fleet-wide | Per-app |
| Key management | Per-pod env | Centralized config vars | Per-app |
| Operational complexity | Medium | Low | Low |

Hybrid Approaches

Most production deployments combine patterns:

  • Internal services route through the centralized gateway for uniform policy and shared credentials
  • Edge or latency-critical services run sidecar gateways for local enforcement
  • All gateways report events to the same control-plane API for unified observability
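Because both enforcement points expose the same listener, an application can select its route with the `OPENAI_BASE_URL` variable used in the provider configs below; the internal hostname here is an example, not a real endpoint:

```shell
# Edge / latency-critical service: route through the local sidecar (no network hop)
export OPENAI_BASE_URL="http://localhost:41002/v1"

# Internal service: route through the shared gateway fleet
# (hostname and the /v1 path prefix are illustrative assumptions)
export OPENAI_BASE_URL="https://gateway.internal.example.com:41002/v1"
```

The application code is identical in both cases; only the deployment environment decides where governance is enforced.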

Configuration Patterns

Environment-Based Provider Selection

pack:
  name: architecture-patterns-providers-3
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: openai
      provider:
        base_url: "${OPENAI_BASE_URL:-https://api.openai.com/v1}"
        secret_key_ref:
          env: OPENAI_API_KEY
    - id: azure-openai
      provider:
        base_url: "${AZURE_OPENAI_ENDPOINT}/openai/deployments"
        secret_key_ref:
          env: AZURE_OPENAI_KEY
policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true

Git-Synced Configuration

# Managed via console → Configurations
# The API polls the linked repo and pushes reloads to gateways
gateway:
  config_reload: true
  config_poll_interval: 60s

Next steps

For engineers

  • Sidecar: ~30 MB memory per pod, localhost access (no network hop), requires pod restart or config-reload for policy updates
  • Reverse proxy: 2–5 ms added network hop within same region, fleet-wide policy consistency, centralized key management via config variables
  • SDK: No network overhead but high bypass risk — policies enforced at application layer only
  • Start with: export KEEPTRUSTS_API_URL=<API_URL>, export KEEPTRUSTS_GATEWAY_TOKEN=$KT_GATEWAY_KEY, then run kt gateway run --listen 0.0.0.0:41002

For leaders

  • Sidecar pattern gives teams autonomy over their own policies but increases operational surface — best for mature platform teams
  • Centralized reverse-proxy reduces operational footprint and ensures uniform enforcement — best for organizations starting their governance journey
  • Hybrid approaches (centralized for internal services, sidecar for latency-critical edge) are the most common production pattern