Microservices Architecture with AI Gateway

Microservice architectures introduce unique challenges for AI governance — multiple services making independent LLM calls need consistent policy enforcement. This guide covers deployment topologies, service discovery patterns, and configuration propagation strategies.

Use this page when

  • You have multiple microservices making independent LLM calls that need consistent policy enforcement
  • You are choosing between shared (centralized) and sidecar (per-service) gateway deployments in Kubernetes
  • You need service discovery patterns for gateway access in containerized environments
  • You want to propagate configuration changes to sidecar gateways without redeploying all services

Primary audience

  • Primary: Technical Engineers
  • Secondary: AI Agents, Technical Leaders

Deployment Topologies

Shared Gateway (Centralized)

A single gateway instance serves all microservices. Simple to operate but creates a shared dependency:

Configuration:

# docker-compose.yml — shared gateway
services:
  kt-gateway:
    image: keeptrusts/gateway:latest
    ports:
      - "41002:41002"
    environment:
      KEEPTRUSTS_API_URL: http://keeptrusts-api:8080
      OPENAI_API_KEY: ${OPENAI_API_KEY}
    volumes:
      - ./policy-config.yaml:/etc/keeptrusts/policy-config.yaml

  service-a:
    build: ./services/service-a
    environment:
      GATEWAY_URL: http://kt-gateway:41002

  service-b:
    build: ./services/service-b
    environment:
      GATEWAY_URL: http://kt-gateway:41002

When to use: Fewer than 10 services, uniform policy requirements, single team manages AI governance.
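Because every service builds its gateway client the same way, the connection settings can live in one small helper that each service vendors or imports. A minimal sketch, assuming the GATEWAY_URL and GATEWAY_KEY environment variables used in the examples on this page (the helper itself is illustrative, not part of the Keeptrusts SDK):

```python
# gateway_config.py — one consistent way for services to find the gateway
import os


def gateway_settings() -> dict:
    """Build the gateway base URL and auth headers from the environment.

    Defaults to the shared gateway's DNS name, so the same code works for a
    sidecar topology simply by setting GATEWAY_URL to http://localhost:41002.
    """
    base_url = os.environ.get("GATEWAY_URL", "http://kt-gateway:41002")
    headers = {}
    key = os.environ.get("GATEWAY_KEY")
    if key:
        headers["Authorization"] = f"Bearer {key}"
    return {"base_url": base_url.rstrip("/"), "headers": headers}
```

Keeping discovery in the environment rather than in code is what lets the same image run under any of the topologies below.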

Sidecar Gateway (Per-Service)

Each service runs its own gateway instance with service-specific policies:

Kubernetes sidecar configuration:

# k8s/deployment-with-sidecar.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: service-a
spec:
  template:
    spec:
      containers:
        - name: service-a
          image: myregistry/service-a:latest
          env:
            - name: GATEWAY_URL
              value: "http://localhost:41002"

        - name: kt-gateway
          image: keeptrusts/gateway:latest
          ports:
            - containerPort: 41002
          env:
            - name: KEEPTRUSTS_API_URL
              value: "http://keeptrusts-api.platform.svc:8080"
          volumeMounts:
            - name: policy-config
              mountPath: /etc/keeptrusts
      volumes:
        - name: policy-config
          configMap:
            name: service-a-policy

When to use: Service-specific policies, independent scaling, blast radius isolation.

Hybrid Topology

Combine shared and sidecar patterns — a shared gateway for common services and dedicated sidecars for high-security workloads. In Docker Compose, network_mode: "service:<name>" runs the sidecar inside the target service's network namespace, so the service reaches its gateway at localhost:

# Shared gateway for general services
services:
  kt-gateway-shared:
    image: keeptrusts/gateway:latest
    environment:
      KEEPTRUSTS_API_URL: http://keeptrusts-api:8080
    volumes:
      - ./policies/shared-policy.yaml:/etc/keeptrusts/policy-config.yaml

  # Dedicated sidecar for PCI-scoped service
  payment-ai-service:
    build: ./services/payment-ai
    environment:
      GATEWAY_URL: http://localhost:41002

  payment-gateway-sidecar:
    image: keeptrusts/gateway:latest
    network_mode: "service:payment-ai-service"
    volumes:
      - ./policies/pci-policy.yaml:/etc/keeptrusts/policy-config.yaml

Service Discovery

DNS-Based Discovery

Services resolve the gateway by DNS name. Works with Docker Compose, Kubernetes, and Consul:

# service_client.py
import os

import httpx

GATEWAY_URL = os.environ.get("GATEWAY_URL", "http://kt-gateway:41002")


async def call_ai(prompt: str) -> dict:
    """Route AI call through discovered gateway."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            f"{GATEWAY_URL}/v1/chat/completions",
            json={"model": "gpt-4o-mini", "messages": [{"role": "user", "content": prompt}]},
            headers={"Authorization": f"Bearer {os.environ['GATEWAY_KEY']}"},
        )
        return response.json()
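Calls to a shared gateway cross a network hop, so clients should tolerate transient failures rather than fail on the first connection error. A minimal sketch of a jittered exponential backoff schedule a caller could sleep between retries (a generic pattern, not a Keeptrusts SDK feature; the function name and defaults are illustrative):

```python
# retry_policy.py — backoff schedule for retrying gateway calls
import random


def backoff_delays(attempts: int = 3, base: float = 0.2, cap: float = 2.0) -> list[float]:
    """Exponential backoff with full jitter.

    Delay i is drawn uniformly from [0, min(cap, base * 2**i)], which spreads
    retries out so many clients don't hammer a recovering gateway in lockstep.
    """
    return [random.uniform(0.0, min(cap, base * (2 ** i))) for i in range(attempts)]
```

A caller would wrap the gateway request in a loop, sleeping each delay when httpx raises a transport-level error, and re-raising once the schedule is exhausted.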

Kubernetes Service

# k8s/gateway-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: kt-gateway
  namespace: platform
spec:
  selector:
    app: kt-gateway
  ports:
    - port: 41002
      targetPort: 41002

Services in any namespace reach the gateway at http://kt-gateway.platform.svc:41002.
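Because the in-cluster address follows the standard name.namespace.svc pattern, it can be derived rather than hard-coded. A hypothetical helper (names and defaults are illustrative) that builds the fully qualified form:

```python
# discovery.py — derive the in-cluster gateway URL
def gateway_url(name: str = "kt-gateway", namespace: str = "platform",
                port: int = 41002, cluster_domain: str = "cluster.local") -> str:
    """Return the fully qualified in-cluster URL for a Service.

    Kubernetes also resolves the shorter name.namespace.svc form; the fully
    qualified name avoids depending on the pod's DNS search path.
    """
    return f"http://{name}.{namespace}.svc.{cluster_domain}:{port}"
```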

Configuration Propagation

ConfigMap-Based (Kubernetes)

Store policy configs in ConfigMaps and propagate changes without redeploying services:

# k8s/policy-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: shared-policy
  namespace: platform
data:
  policy-config.yaml: |
    gateway:
      port: 41002
      secret_key_ref:
        env: OPENAI_API_KEY
    policies:
      - name: default
        input:
          - type: content_safety
            action: block
            categories: [hate, violence, self_harm, sexual]
          - type: pii_detection
            action: redact
            entities: [ssn, credit_card]
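Since a bad ConfigMap change propagates to every gateway mounting it, it pays to sanity-check the policy document's shape before applying it. A lightweight validator sketch over the parsed YAML — the required keys mirror the sample above, but the exact schema the gateway enforces is an assumption here:

```python
# validate_policy.py — shape check for a parsed policy-config.yaml
def validate_policy(doc: dict) -> list[str]:
    """Return a list of problems found; an empty list means the shape looks sane."""
    problems = []
    gateway = doc.get("gateway")
    if not isinstance(gateway, dict) or "port" not in gateway:
        problems.append("gateway.port is required")
    policies = doc.get("policies")
    if not isinstance(policies, list) or not policies:
        problems.append("at least one policy is required")
    else:
        for i, policy in enumerate(policies):
            if "name" not in policy:
                problems.append(f"policies[{i}].name is required")
            for rule in policy.get("input", []):
                if "type" not in rule or "action" not in rule:
                    problems.append(f"policies[{i}] has a rule missing type/action")
    return problems
```

Running a check like this in CI, before kubectl apply, catches malformed configs at review time instead of at the gateway.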

Git-Backed Config Sync

Use the Keeptrusts git sync feature to propagate policies from a git repository:

# Link a git repository for config sync
curl -X POST https://api.keeptrusts.example/v1/git-repos \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://github.com/org/governance-policies.git",
    "branch": "main",
    "path": "policies/",
    "auto_create_configuration": true,
    "poll_interval_seconds": 300
  }'

Changes pushed to the repository automatically propagate to all gateways bound to that configuration.

Per-Service Policy Overrides

Layer service-specific policies on top of shared base configs:

governance-policies/
├── base/
│   └── shared-policy.yaml        # Common safety rules
└── services/
    ├── analytics/
    │   └── policy-config.yaml    # Allows larger context windows
    ├── customer-support/
    │   └── policy-config.yaml    # Strict PII redaction
    └── internal-tools/
        └── policy-config.yaml    # Relaxed output policies
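Layering can be implemented as a recursive merge in which service-level values win over the base. A sketch assuming both files have already been parsed into dicts — these merge semantics (dicts merged recursively, lists replaced wholesale) are an assumption for illustration, not documented Keeptrusts behavior:

```python
# merge_policy.py — overlay a service policy onto the shared base
def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge override into base; override wins on conflicts.

    Lists are replaced wholesale, so a service can tighten or relax an
    entire rule set rather than appending to the shared one.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged
```

Replacing lists wholesale keeps the result predictable: the effective rule set for a service is always visible in one file or the other, never a silent union of both.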

Health Checks and Readiness

Configure health probes so orchestrators route traffic only to healthy gateways:

# k8s/deployment.yaml — gateway health probes
containers:
  - name: kt-gateway
    livenessProbe:
      httpGet:
        path: /health
        port: 41002
      initialDelaySeconds: 5
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /health
        port: 41002
      initialDelaySeconds: 3
      periodSeconds: 5

Comparison Matrix

| Criterion              | Shared Gateway  | Sidecar           | Hybrid    |
|------------------------|-----------------|-------------------|-----------|
| Operational complexity | Low             | High              | Medium    |
| Policy isolation       | Shared          | Per-service       | Mixed     |
| Blast radius           | All services    | Single service    | Scoped    |
| Resource overhead      | Single instance | N instances       | Selective |
| Config propagation     | Single point    | Per-pod ConfigMap | Both      |
| Latency                | Network hop     | Localhost         | Varies    |

Key Takeaways

  • Start with a shared gateway topology and move to sidecars as policy requirements diverge
  • Use Kubernetes ConfigMaps or git-backed sync for policy propagation — avoid baking configs into images
  • Layer service-specific policy overrides on top of shared base configurations
  • Configure liveness and readiness probes so the orchestrator only routes to healthy gateways
  • Use DNS-based service discovery so gateway location is configurable per environment

For AI systems

  • Canonical terms: shared gateway, sidecar gateway, service mesh, Kubernetes sidecar injection, GATEWAY_URL, config propagation, per-service policies, consumer groups, gateway keys per service
  • Key configuration: docker-compose.yml shared gateway, Kubernetes Deployment with sidecar container, KEEPTRUSTS_API_URL, config-reload
  • Best next pages: Architecture Patterns for AI-Governed Systems, Capacity Planning, Resilience Engineering

For engineers

  • Shared gateway: all services set GATEWAY_URL: http://kt-gateway:41002 — best for < 10 services with uniform policies
  • Sidecar gateway: each pod has its own gateway container at localhost:41002 — best for per-service policy isolation
  • In Kubernetes, use sidecar injection via mutating admission webhook or manual container spec in Deployment
  • Config propagation: sidecar gateways fetch config from the control-plane API on startup; trigger reload via POST /v1/gateways/{id}/reload
  • Service discovery: use Kubernetes service names (kt-gateway.namespace.svc.cluster.local) for shared gateway access

For leaders

  • Shared gateway reduces operational overhead (single deployment to manage) but creates a shared dependency across all services
  • Sidecar pattern enables team autonomy — each team owns its policy configuration — but increases total resource consumption and operational surface
  • Governance coverage is complete only when every service routes through the gateway — audit for services making direct provider calls

Next steps