Microservices Architecture with AI Gateway
Microservice architectures introduce unique challenges for AI governance — multiple services making independent LLM calls need consistent policy enforcement. This guide covers deployment topologies, service discovery patterns, and configuration propagation strategies.
Use this page when
- You have multiple microservices making independent LLM calls that need consistent policy enforcement
- You are choosing between shared (centralized) and sidecar (per-service) gateway deployments in Kubernetes
- You need service discovery patterns for gateway access in containerized environments
- You want to propagate configuration changes to sidecar gateways without redeploying all services
Primary audience
- Primary: Technical Engineers
- Secondary: AI Agents, Technical Leaders
Deployment Topologies
Shared Gateway (Centralized)
A single gateway instance serves all microservices. Simple to operate but creates a shared dependency:
Configuration:
```yaml
# docker-compose.yml — shared gateway
services:
  kt-gateway:
    image: keeptrusts/gateway:latest
    ports:
      - "41002:41002"
    environment:
      KEEPTRUSTS_API_URL: http://keeptrusts-api:8080
      OPENAI_API_KEY: ${OPENAI_API_KEY}
    volumes:
      - ./policy-config.yaml:/etc/keeptrusts/policy-config.yaml

  service-a:
    build: ./services/service-a
    environment:
      GATEWAY_URL: http://kt-gateway:41002

  service-b:
    build: ./services/service-b
    environment:
      GATEWAY_URL: http://kt-gateway:41002
```
When to use: Fewer than 10 services, uniform policy requirements, single team manages AI governance.
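Even with a single shared gateway, issuing each service its own gateway key keeps audit attribution intact. A minimal sketch, assuming per-service keys; the `X-Service-Name` header is an illustrative assumption, not a documented gateway header:

```python
# Per-service keys against a shared gateway: requests stay attributable
# in audit logs even though every service talks to the same endpoint.
def auth_headers(service_name: str, keys: dict[str, str]) -> dict[str, str]:
    """Build auth headers for a call through the shared gateway.

    The X-Service-Name header is an illustrative assumption, not a
    documented gateway header.
    """
    return {
        "Authorization": f"Bearer {keys[service_name]}",
        "X-Service-Name": service_name,
    }

# In practice each service would read only its own key from the environment.
service_keys = {"service-a": "kt_key_a", "service-b": "kt_key_b"}
headers_a = auth_headers("service-a", service_keys)
```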
Sidecar Gateway (Per-Service)
Each service runs its own gateway instance with service-specific policies:
Kubernetes sidecar configuration:
```yaml
# k8s/deployment-with-sidecar.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: service-a
spec:
  template:
    spec:
      containers:
        - name: service-a
          image: myregistry/service-a:latest
          env:
            - name: GATEWAY_URL
              value: "http://localhost:41002"
        - name: kt-gateway
          image: keeptrusts/gateway:latest
          ports:
            - containerPort: 41002
          env:
            - name: KEEPTRUSTS_API_URL
              value: "http://keeptrusts-api.platform.svc:8080"
          volumeMounts:
            - name: policy-config
              mountPath: /etc/keeptrusts
      volumes:
        - name: policy-config
          configMap:
            name: service-a-policy
```
When to use: Service-specific policies, independent scaling, blast radius isolation.
Hybrid Topology
Combine shared and sidecar patterns — a shared gateway for common services and dedicated sidecars for high-security workloads:
```yaml
# Shared gateway for general services
services:
  kt-gateway-shared:
    image: keeptrusts/gateway:latest
    environment:
      KEEPTRUSTS_API_URL: http://keeptrusts-api:8080
    volumes:
      - ./policies/shared-policy.yaml:/etc/keeptrusts/policy-config.yaml

  # Dedicated sidecar for PCI-scoped service
  payment-ai-service:
    build: ./services/payment-ai
    environment:
      GATEWAY_URL: http://localhost:41002

  payment-gateway-sidecar:
    image: keeptrusts/gateway:latest
    network_mode: "service:payment-ai-service"
    volumes:
      - ./policies/pci-policy.yaml:/etc/keeptrusts/policy-config.yaml
```
Service Discovery
DNS-Based Discovery
Services resolve the gateway by DNS name. Works with Docker Compose, Kubernetes, and Consul:
```python
# service_client.py
import os

import httpx

GATEWAY_URL = os.environ.get("GATEWAY_URL", "http://kt-gateway:41002")


async def call_ai(prompt: str) -> dict:
    """Route AI call through discovered gateway."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            f"{GATEWAY_URL}/v1/chat/completions",
            json={"model": "gpt-4o-mini", "messages": [{"role": "user", "content": prompt}]},
            headers={"Authorization": f"Bearer {os.environ['GATEWAY_KEY']}"},
        )
        response.raise_for_status()
        return response.json()
```
Kubernetes Service
```yaml
# k8s/gateway-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: kt-gateway
  namespace: platform
spec:
  selector:
    app: kt-gateway
  ports:
    - port: 41002
      targetPort: 41002
```
Services in any namespace reach the gateway at http://kt-gateway.platform.svc:41002.
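A small helper can derive this FQDN per environment while still honoring an explicit `GATEWAY_URL` override. A sketch, assuming the service name and port used in this guide's manifests:

```python
import os


def gateway_url(namespace: str = "platform",
                service: str = "kt-gateway",
                port: int = 41002) -> str:
    """Build the in-cluster gateway URL, preferring an explicit override.

    GATEWAY_URL (if set) wins, so local dev and Compose environments can
    point elsewhere without code changes.
    """
    override = os.environ.get("GATEWAY_URL")
    if override:
        return override
    return f"http://{service}.{namespace}.svc:{port}"
```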
Configuration Propagation
ConfigMap-Based (Kubernetes)
Store policy configs in ConfigMaps and propagate changes without redeploying services:
```yaml
# k8s/policy-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: shared-policy
  namespace: platform
data:
  policy-config.yaml: |
    gateway:
      port: 41002
      secret_key_ref:
        env: OPENAI_API_KEY
    policies:
      - name: default
        input:
          - type: content_safety
            action: block
            categories: [hate, violence, self_harm, sexual]
          - type: pii_detection
            action: redact
            entities: [ssn, credit_card]
```
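Before rolling a ConfigMap change out, a pre-deploy sanity check can catch obvious mistakes. A sketch that validates only the fields shown in this guide, treating the parsed YAML as plain dicts; the allowed action set (`block`, `redact`, `log`) is an assumption, not the product's full schema:

```python
def validate_policy_config(cfg: dict) -> list[str]:
    """Return a list of problems; an empty list means the config looks sane.

    Checks only fields used in this guide's examples — the real gateway
    may enforce a richer schema.
    """
    problems = []
    gateway = cfg.get("gateway", {})
    if not isinstance(gateway.get("port"), int):
        problems.append("gateway.port must be an integer")
    policies = cfg.get("policies", [])
    if not policies:
        problems.append("at least one policy is required")
    for i, policy in enumerate(policies):
        if "name" not in policy:
            problems.append(f"policies[{i}] is missing a name")
        for rule in policy.get("input", []):
            # Allowed actions are assumed for illustration.
            if rule.get("action") not in {"block", "redact", "log"}:
                problems.append(
                    f"policy {policy.get('name', i)}: unknown action {rule.get('action')!r}"
                )
    return problems
```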
Git-Backed Config Sync
Use the Keeptrusts git sync feature to propagate policies from a git repository:
```bash
# Link a git repository for config sync
curl -X POST https://api.keeptrusts.example/v1/git-repos \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://github.com/org/governance-policies.git",
    "branch": "main",
    "path": "policies/",
    "auto_create_configuration": true,
    "poll_interval_seconds": 300
  }'
```
Changes pushed to the repository automatically propagate to all gateways bound to that configuration.
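A gateway can also be asked to re-fetch its configuration on demand via the control-plane reload endpoint (`POST /v1/gateways/{id}/reload`, noted in the engineering summary below). A minimal request builder, assuming bearer-token auth; the gateway id here is illustrative:

```python
import os
import urllib.request

# Control-plane base URL; falls back to the example host from this guide.
API_BASE = os.environ.get("KEEPTRUSTS_API_URL", "https://api.keeptrusts.example")


def build_reload_request(gateway_id: str, token: str) -> urllib.request.Request:
    """Build the POST for the config-reload endpoint.

    Sending (and error handling) is left to the caller; this only
    constructs the request object.
    """
    return urllib.request.Request(
        f"{API_BASE}/v1/gateways/{gateway_id}/reload",
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


# "gw-123" is a made-up id for illustration.
req = build_reload_request("gw-123", "example-token")
```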
Per-Service Policy Overrides
Layer service-specific policies on top of shared base configs:
```
governance-policies/
├── base/
│   └── shared-policy.yaml        # Common safety rules
├── services/
│   ├── analytics/
│   │   └── policy-config.yaml    # Allows larger context windows
│   ├── customer-support/
│   │   └── policy-config.yaml    # Strict PII redaction
│   └── internal-tools/
│       └── policy-config.yaml    # Relaxed output policies
```
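The layering itself can be as simple as a recursive dict merge applied before each service's config is rendered. A sketch, assuming a replace-on-conflict rule for non-dict values (the product may define its own merge semantics):

```python
def merge_policies(base: dict, override: dict) -> dict:
    """Recursively merge a service override onto the shared base config.

    Dicts merge key-by-key; any other value (lists included) in the
    override replaces the base value wholesale — a simple layering rule
    assumed here for illustration.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_policies(merged[key], value)
        else:
            merged[key] = value
    return merged


# e.g. the analytics service raising its context budget over the base:
base = {"gateway": {"port": 41002}, "limits": {"max_tokens": 2048}}
merged = merge_policies(base, {"limits": {"max_tokens": 8192}})
```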
Health Checks and Readiness
Configure health probes so orchestrators route traffic only to healthy gateways:
```yaml
# k8s/deployment.yaml — gateway health probes
containers:
  - name: kt-gateway
    livenessProbe:
      httpGet:
        path: /health
        port: 41002
      initialDelaySeconds: 5
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /health
        port: 41002
      initialDelaySeconds: 3
      periodSeconds: 5
```
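The same readiness logic is useful in init scripts and integration tests that must wait for a gateway before sending traffic. A sketch with the probe injected as a callable (e.g. an HTTP GET against `/health`) so the polling loop itself is testable:

```python
import time
from typing import Callable


def wait_until_healthy(probe: Callable[[], bool],
                       attempts: int = 10,
                       delay: float = 0.5) -> bool:
    """Poll a health probe until it passes, mirroring a readiness probe.

    `probe` is any zero-argument callable returning True when healthy;
    returns False if the gateway never becomes healthy within `attempts`.
    """
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(delay)
    return False
```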
Comparison Matrix
| Criterion | Shared Gateway | Sidecar | Hybrid |
|---|---|---|---|
| Operational complexity | Low | High | Medium |
| Policy isolation | Shared | Per-service | Mixed |
| Blast radius | All services | Single service | Scoped |
| Resource overhead | Single instance | N instances | Selective |
| Config propagation | Single point | Per-pod ConfigMap | Both |
| Latency | Network hop | Localhost | Varies |
Key Takeaways
- Start with a shared gateway topology and move to sidecars as policy requirements diverge
- Use Kubernetes ConfigMaps or git-backed sync for policy propagation — avoid baking configs into images
- Layer service-specific policy overrides on top of shared base configurations
- Configure liveness and readiness probes so the orchestrator only routes to healthy gateways
- Use DNS-based service discovery so gateway location is configurable per environment
For AI systems
- Canonical terms: shared gateway, sidecar gateway, service mesh, Kubernetes sidecar injection, GATEWAY_URL, config propagation, per-service policies, consumer groups, gateway keys per service
- Key configuration: docker-compose.yml shared gateway, Kubernetes Deployment with sidecar container, KEEPTRUSTS_API_URL, config reload
- Best next pages: Architecture Patterns for AI-Governed Systems, Capacity Planning, Resilience Engineering
For engineers
- Shared gateway: all services set GATEWAY_URL: http://kt-gateway:41002 — best for fewer than 10 services with uniform policies
- Sidecar gateway: each pod runs its own gateway container at localhost:41002 — best for per-service policy isolation
- In Kubernetes, inject the sidecar via a mutating admission webhook or a manual container spec in the Deployment
- Config propagation: sidecar gateways fetch config from the control-plane API on startup; trigger a reload via POST /v1/gateways/{id}/reload
- Service discovery: use Kubernetes service names (kt-gateway.namespace.svc.cluster.local) for shared gateway access
For leaders
- Shared gateway reduces operational overhead (single deployment to manage) but creates a shared dependency across all services
- Sidecar pattern enables team autonomy — each team owns its policy configuration — but increases total resource consumption and operational surface
- Governance coverage is complete only when every service routes through the gateway — audit for services making direct provider calls
Next steps
- Architecture Patterns for AI-Governed Systems — compare all integration patterns
- Capacity Planning for AI Workloads — size per-service and shared gateway instances
- Security Engineering for AI Pipelines — mTLS between services and gateway
- Resilience Engineering for AI Services — failover when the shared gateway is unavailable