Architecture Patterns for AI-Governed Systems

Choosing the right integration pattern determines how much governance coverage you get, how latency behaves, and how independently teams can operate. This guide covers the three primary deployment patterns for the Keeptrusts gateway and when to use each.

Use this page when

  • You are choosing between sidecar, reverse-proxy, or SDK integration for the Keeptrusts gateway
  • You need to decide between centralized and distributed gateway topologies
  • You are designing a multi-region AI governance deployment
  • You want to understand the trade-offs of each pattern for latency, bypass risk, and operational complexity

Primary audience

  • Primary: Technical Engineers
  • Secondary: AI Agents, Technical Leaders

Integration Patterns

Sidecar Pattern

The gateway runs as a co-located process alongside each application instance. Every outbound LLM call routes through localhost:

Configuration:

gateway:
  listen_port: 41002
  mode: local
providers:
  targets:
    - id: openai
      provider:
        base_url: https://api.openai.com/v1
        secret_key_ref:
          env: OPENAI_API_KEY
policies:
  - name: pii-redaction
    type: output_filter
    action: redact
    patterns:
      - email
      - phone
      - ssn

When to use:

  • Microservice architectures with per-pod governance
  • Teams that own their own policy configs
  • Low-latency requirements (no network hop)
  • Kubernetes deployments with sidecar injection

Trade-offs:

  • Each pod runs its own gateway process (memory overhead ~30 MB)
  • Policy updates require pod restarts or config-reload signals
  • Event fan-out increases load on the control-plane API
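In Kubernetes, the pattern above maps to a gateway container running next to the application container in the same pod. A minimal sketch, assuming an OpenAI-compatible client that honors `OPENAI_BASE_URL`; the image names, container names, and mount path here are illustrative, not official artifacts:

```yaml
# Illustrative sidecar layout; image names and the config mount path are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-gateway
spec:
  containers:
    - name: app
      image: my-app:latest                 # your application image
      env:
        - name: OPENAI_BASE_URL            # point the LLM client at the sidecar
          value: "http://localhost:41002/v1"
    - name: keeptrusts-gateway
      image: keeptrusts/gateway:latest     # assumed image name
      ports:
        - containerPort: 41002
      volumeMounts:
        - name: gateway-config             # the gateway config shown above
          mountPath: /etc/keeptrusts
  volumes:
    - name: gateway-config
      configMap:
        name: gateway-config
```

Because the application talks to `localhost`, no cluster networking sits between the client and the policy enforcement point.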

Reverse-Proxy Pattern

A shared gateway fleet sits between all applications and upstream LLM providers:

Configuration:

gateway:
  listen_port: 41002
providers:
  targets:
    - id: openai
      provider:
        base_url: https://api.openai.com/v1
        secret_key_ref:
          store: OPENAI_API_KEY
    - id: anthropic
      provider:
        base_url: https://api.anthropic.com/v1
        secret_key_ref:
          store: ANTHROPIC_API_KEY
policies:
  - name: global-safety
    type: content_filter
    action: block
    categories:
      - hate
      - violence
      - self_harm

When to use:

  • Centralized governance across multiple teams
  • Shared provider credentials managed via config variables
  • Uniform policy enforcement regardless of calling service
  • Simplified key rotation (one place to update)

Trade-offs:

  • Added network hop (typically 2–5 ms within the same region)
  • The fleet is a single point of failure and must be deployed for high availability
  • All traffic funnels through the gateway fleet

SDK Wrapper Pattern

Applications use the Keeptrusts SDK, which wraps the standard LLM client and applies policies client-side before forwarding to the gateway or directly to providers.

When to use:

  • Lightweight integration for prototyping
  • Client-side input validation before network calls
  • Applications that need synchronous policy feedback in-process

Trade-offs:

  • Policies are enforced at the application layer, not the network layer
  • Bypassing the SDK bypasses governance
  • Requires SDK updates when policies change

Centralized vs Distributed Topologies

Centralized Gateway

A single fleet in one region handles all traffic. The control-plane API manages configuration, and gateways fetch config on startup or reload.

# Start a hosted gateway
export KEEPTRUSTS_API_URL="https://api.keeptrusts.com"
export KEEPTRUSTS_GATEWAY_TOKEN="$KT_GATEWAY_KEY"

kt gateway run \
  --listen 0.0.0.0:41002

Best for: Single-region deployments, smaller organizations, simpler operational footprint.

Distributed Gateway

Multiple gateway fleets in different regions, all reporting to the same control-plane API. Policies are synchronized via config reload.

Best for: Multi-region deployments, data residency requirements, latency-sensitive global applications.
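Each regional fleet reuses the same reload settings documented below under Git-Synced Configuration; a sketch of a per-region gateway config, where the `region` and `data_residency` keys are hypothetical and shown only to illustrate per-fleet overrides under a shared control plane:

```yaml
# Sketch only: region and data_residency are assumed keys, not documented ones.
gateway:
  listen_port: 41002
  region: eu-west-1            # this fleet serves EU traffic
  data_residency: eu           # keep request payloads in-region
  config_reload: true          # pick up policy syncs from the control plane
  config_poll_interval: 60s
```

All fleets report events to the same control-plane API, so observability stays unified even though enforcement is regional.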

Decision Matrix

| Factor | Sidecar | Reverse Proxy | SDK |
| --- | --- | --- | --- |
| Network hop | None (localhost) | 1 extra hop | None |
| Bypass risk | Low (pod-level) | Low (network-level) | High |
| Memory overhead | Per-pod | Shared fleet | In-process |
| Policy consistency | Per-pod config | Fleet-wide | Per-app |
| Key management | Per-pod env | Centralized config vars | Per-app |
| Operational complexity | Medium | Low | Low |

Hybrid Approaches

Most production deployments combine patterns:

  • Internal services route through the centralized gateway for uniform policy and shared credentials
  • Edge or latency-critical services run sidecar gateways for local enforcement
  • All gateways report events to the same control-plane API for unified observability
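Because both enforcement points expose the same listener, an application can select its route with the `OPENAI_BASE_URL` variable used in the provider configs below; the internal hostname here is an example, not a real endpoint:

```shell
# Edge / latency-critical service: route through the local sidecar (no network hop)
export OPENAI_BASE_URL="http://localhost:41002/v1"

# Internal service: route through the shared gateway fleet
# (hostname and the /v1 path prefix are illustrative assumptions)
export OPENAI_BASE_URL="https://gateway.internal.example.com:41002/v1"
```

The application code is identical in both cases; only the deployment environment decides where governance is enforced.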

Configuration Patterns

Environment-Based Provider Selection

pack:
  name: architecture-patterns-providers-3
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: openai
      provider:
        base_url: "${OPENAI_BASE_URL:-https://api.openai.com/v1}"
        secret_key_ref:
          env: OPENAI_API_KEY
    - id: azure-openai
      provider:
        base_url: "${AZURE_OPENAI_ENDPOINT}/openai/deployments"
        secret_key_ref:
          env: AZURE_OPENAI_KEY
policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true

Git-Synced Configuration

# Managed via console → Configurations
# The API polls the linked repo and pushes reloads to gateways
gateway:
  config_reload: true
  config_poll_interval: 60s

Next steps

For engineers

  • Sidecar: ~30 MB memory per pod, localhost access (no network hop), requires pod restart or config-reload for policy updates
  • Reverse proxy: 2–5 ms added network hop within same region, fleet-wide policy consistency, centralized key management via config variables
  • SDK: No network overhead but high bypass risk — policies enforced at application layer only
  • Start with: export KEEPTRUSTS_API_URL=<API_URL>, export KEEPTRUSTS_GATEWAY_TOKEN=$KT_GATEWAY_KEY, then run kt gateway run --listen 0.0.0.0:41002

For leaders

  • Sidecar pattern gives teams autonomy over their own policies but increases operational surface — best for mature platform teams
  • Centralized reverse-proxy reduces operational footprint and ensures uniform enforcement — best for organizations starting their governance journey
  • Hybrid approaches (centralized for internal services, sidecar for latency-critical edge) are the most common production pattern