DevOps Guide: Operating the AI Gateway in Production
The Keeptrusts gateway is a mission-critical component in your AI infrastructure — every LLM request flows through it. This guide covers production deployment patterns, monitoring, alerting, scaling, and operational runbooks for DevOps engineers.
Use this page when
- You are deploying Keeptrusts gateways to production (Docker, Kubernetes, or bare metal)
- You need to configure health checks, monitoring, and alerting for gateway infrastructure
- You are scaling gateway instances behind a load balancer
- You need operational runbooks for gateway upgrades, rollbacks, and incident response
- You are automating gateway deployment with CI/CD pipelines and infrastructure as code
Primary audience
- Primary: Technical Engineers (DevOps Engineers, SREs, Infrastructure Engineers)
- Secondary: Platform Engineers, Cloud Architects, Security Engineers
Deployment Architecture
Single Gateway (Development / Small Teams)
# Start the gateway directly
kt gateway run \
  --config policy-config.yaml \
  --port 41002
Docker Deployment
# Gateway container
FROM keeptrusts/gateway:latest
COPY policy-config.yaml /etc/keeptrusts/policy-config.yaml
ENV KEEPTRUSTS_API_URL=https://api.keeptrusts.com
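# GATEWAY_TOKEN must be declared as a build argument; without it the ENV below expands to an empty string.
# Prefer injecting the token at runtime (docker run -e KEEPTRUSTS_GATEWAY_TOKEN=...) so it is not stored in an image layer.
ARG GATEWAY_TOKEN
ENV KEEPTRUSTS_GATEWAY_TOKEN=${GATEWAY_TOKEN}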
EXPOSE 41002
CMD ["kt", "gateway", "run", "--config", "/etc/keeptrusts/policy-config.yaml", "--port", "41002"]
# docker-compose.yml
services:
  keeptrusts-gateway:
    image: keeptrusts/gateway:latest
    ports:
      - "41002:41002"
    volumes:
      - ./policy-config.yaml:/etc/keeptrusts/policy-config.yaml:ro
    environment:
      KEEPTRUSTS_API_URL: http://keeptrusts-api:8080
      OPENAI_API_KEY: ${OPENAI_API_KEY}
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "kt", "doctor"]
      interval: 30s
      timeout: 10s
      retries: 3
Kubernetes Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: keeptrusts-gateway
spec:
  replicas: 3
  selector:
    matchLabels:
      app: keeptrusts-gateway
  template:
    metadata:
      labels:
        app: keeptrusts-gateway
    spec:
      containers:
        - name: gateway
          image: keeptrusts/gateway:latest
          ports:
            - containerPort: 41002
          livenessProbe:
            exec:
              command: ["kt", "doctor"]
            initialDelaySeconds: 10
            periodSeconds: 30
          readinessProbe:
            exec:
              command: ["kt", "doctor"]
            initialDelaySeconds: 5
            periodSeconds: 10
          env:
            - name: KEEPTRUSTS_API_URL
              valueFrom:
                secretKeyRef:
                  name: keeptrusts-secrets
                  key: api-url
          volumeMounts:
            - name: config
              mountPath: /etc/keeptrusts
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: keeptrusts-gateway-config
Health Checks and Diagnostics
Gateway Health
# Comprehensive health check
kt doctor
# Quick connectivity test
kt events list --since 1h --limit 1
# Validate configuration without restarting
kt policy lint --file policy-config.yaml
What kt doctor Checks
| Check | What it validates |
|---|---|
| Configuration syntax | YAML parsing and schema validation |
| Provider connectivity | API keys and endpoint reachability |
| Control plane connection | API URL and authentication |
| Policy chain integrity | All referenced policies are valid |
| Event pipeline | Events can be submitted to the API |
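The checks above can be chained into a pre-deploy gate. The following is a minimal sketch, not an official Keeptrusts tool: it assumes that `kt policy lint` and `kt doctor` exit with a non-zero status on failure, which this guide does not state explicitly. The helper name `run_checks` and the `PREFLIGHT` list are illustrative.

```python
import subprocess
import sys
from typing import Sequence


def run_checks(checks: Sequence[Sequence[str]]) -> tuple[bool, list[str]]:
    """Run each command in order and stop at the first non-zero exit code.

    Returns (all_passed, commands_that_passed).
    """
    passed: list[str] = []
    for cmd in checks:
        result = subprocess.run(list(cmd), capture_output=True, text=True)
        if result.returncode != 0:
            print(f"FAILED: {' '.join(cmd)}\n{result.stderr}", file=sys.stderr)
            return False, passed
        passed.append(" ".join(cmd))
    return True, passed


# Assumed gate order: lint the config first, then run the full diagnostic.
# Assumption: both commands signal failure through a non-zero exit code.
PREFLIGHT = [
    ["kt", "policy", "lint", "--file", "policy-config.yaml"],
    ["kt", "doctor"],
]
```

In a CI job, calling `run_checks(PREFLIGHT)` and exiting non-zero when the first element is `False` would block the deploy step before a broken config reaches a gateway.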
Monitoring and Observability
Key Metrics to Monitor
| Metric | Source | Alert threshold |
|---|---|---|
| Gateway request latency (p99) | Gateway metrics | > 2s |
| Error rate | Events with status=error | > 5% |
| Policy evaluation time | Gateway metrics | > 500ms |
| Event submission failures | Gateway logs | > 0 sustained |
| Active connections | Gateway metrics | > 80% capacity |
| Configuration age | Last config reload timestamp | > 24h without refresh |
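To make the thresholds concrete, here is a small sketch that compares one metrics snapshot against the values in the table. The `GatewaySnapshot` type and its field names are hypothetical; how you collect these numbers depends on your metrics stack. The "sustained" condition for event submission failures is not modelled here.

```python
from dataclasses import dataclass


@dataclass
class GatewaySnapshot:
    """Hypothetical point-in-time metrics; field names are illustrative."""
    p99_latency_s: float
    error_rate: float          # fraction of events with status=error, 0.0 to 1.0
    policy_eval_ms: float
    active_connections: int
    connection_capacity: int
    config_age_hours: float


def breached_thresholds(s: GatewaySnapshot) -> list[str]:
    """Return the names of metrics that exceed the thresholds in the table."""
    checks = {
        "p99_latency": s.p99_latency_s > 2.0,
        "error_rate": s.error_rate > 0.05,
        "policy_eval_time": s.policy_eval_ms > 500,
        "active_connections": s.active_connections > 0.8 * s.connection_capacity,
        "config_age": s.config_age_hours > 24,
    }
    return [name for name, breached in checks.items() if breached]


snapshot = GatewaySnapshot(
    p99_latency_s=2.4, error_rate=0.01, policy_eval_ms=120,
    active_connections=850, connection_capacity=1000, config_age_hours=3,
)
print(breached_thresholds(snapshot))  # ['p99_latency', 'active_connections']
```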
Event Pipeline Monitoring
# Verify events are flowing
kt events list --since 5m --limit 5
# Tail events in real-time for debugging
kt events tail
# Check event submission from the API side
curl -H "Authorization: Bearer $API_TOKEN" \
  "https://api.keeptrusts.com/v1/events?since=5m&limit=5"
Log Aggregation
The gateway emits structured logs compatible with standard log aggregation tools. Forward these to your existing logging pipeline:
# Example: Docker logging driver
services:
  keeptrusts-gateway:
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
Alerting Rules
Critical Alerts (Page)
| Condition | Action |
|---|---|
| Gateway unreachable for > 2 minutes | Page on-call, check container health |
| Event submission failures > 10 in 5 minutes | Page on-call, check API connectivity |
| Error rate > 10% for 5 minutes | Page on-call, check upstream providers |
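A rule such as "more than 10 failures in 5 minutes" is a sliding-window count. The sketch below shows one way to evaluate it. The `FailureWindow` class is illustrative, and in practice your alerting system would normally express this as a query rather than application code.

```python
from collections import deque


class FailureWindow:
    """Counts failures inside a sliding time window."""

    def __init__(self, window_s: float = 300.0, limit: int = 10) -> None:
        self.window_s = window_s
        self.limit = limit
        self._times: deque[float] = deque()

    def record(self, timestamp: float) -> bool:
        """Record one failure; return True when the page threshold is exceeded."""
        self._times.append(timestamp)
        # Drop failures that have fallen out of the window.
        while self._times and self._times[0] <= timestamp - self.window_s:
            self._times.popleft()
        return len(self._times) > self.limit
```

With the defaults, the eleventh failure inside any 5-minute span returns `True`; failures older than 300 seconds no longer count.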
Warning Alerts (Ticket)
| Condition | Action |
|---|---|
| P99 latency > 2s for 15 minutes | Create ticket, investigate provider latency |
| Configuration not refreshed in 24h | Create ticket, check git sync |
| Disk usage > 80% on gateway host | Create ticket, rotate logs |
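Conditions of the form "above a threshold for N minutes" only fire when the breach is continuous. A minimal sketch of that logic follows, using the P99 latency rule as the example. The `SustainedBreach` class is hypothetical and stands in for the "for" clause most alerting tools provide.

```python
from typing import Optional


class SustainedBreach:
    """Fires only after a value has stayed above `threshold` for `hold_s` seconds."""

    def __init__(self, threshold: float, hold_s: float) -> None:
        self.threshold = threshold
        self.hold_s = hold_s
        self._breach_start: Optional[float] = None

    def observe(self, timestamp: float, value: float) -> bool:
        if value <= self.threshold:
            self._breach_start = None  # condition cleared, reset the timer
            return False
        if self._breach_start is None:
            self._breach_start = timestamp
        return timestamp - self._breach_start >= self.hold_s


# P99 latency > 2s for 15 minutes (900 seconds)
latency_alert = SustainedBreach(threshold=2.0, hold_s=900)
```

Resetting the timer on any healthy sample keeps a brief latency spike from creating a ticket.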
Scaling Strategies
Horizontal Scaling
Deploy multiple gateway instances behind a load balancer. The gateway is stateless — all state flows through the control-plane API.
# Kubernetes HPA
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: keeptrusts-gateway-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: keeptrusts-gateway
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
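To see how this HPA would react, the sketch below applies the core scaling formula from the Kubernetes documentation, `ceil(currentReplicas * currentMetric / targetMetric)`, with the 70% target and the 2 to 10 replica bounds configured above. It is a simplification: the real controller also accounts for pod readiness, missing metrics, and stabilization windows, and the 0.1 tolerance is the Kubernetes default rather than something set in this manifest.

```python
import math


def desired_replicas(current: int, cpu_utilization: float, target: float = 70.0,
                     min_replicas: int = 2, max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Simplified HPA calculation for a single CPU utilization metric."""
    ratio = cpu_utilization / target
    if abs(ratio - 1.0) <= tolerance:
        return current  # close enough to target, no scaling
    wanted = math.ceil(current * ratio)
    return max(min_replicas, min(max_replicas, wanted))


print(desired_replicas(3, 95))   # 5
print(desired_replicas(3, 20))   # 2
print(desired_replicas(8, 140))  # 10
```

Three replicas at 95% average CPU scale to five, and a quiet fleet never drops below the two-replica floor.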
Configuration Management at Scale
Use Git-backed configuration sync for consistent policy deployment across all gateway instances:
- Store policy configs in a Git repository
- Link the repository in Console Settings > Git Repositories
- Changes merged to the default branch automatically sync to all gateways
# Validate the policy config locally before merging it to the synced branch
kt policy lint --file policy-config.yaml
Rollback Procedures
Configuration Rollback
If a policy change causes issues:
# Validate the previous config version
kt policy lint --file policy-config-previous.yaml
# Redeploy with the previous config
kt gateway run --policy-config policy-config-previous.yaml --port 41002
With Git-backed configs, revert the commit and the sync will pick up the previous version automatically.
Full Gateway Rollback
For container deployments, roll back to the previous image version:
# Kubernetes rollback
kubectl rollout undo deployment/keeptrusts-gateway
# Docker rollback: first set the image in docker-compose.yml to the previous tag, then recreate the container
docker compose up -d --no-deps keeptrusts-gateway
Operational Runbooks
Gateway Not Responding
- Check container status: `docker ps` or `kubectl get pods`
- Check logs: `docker logs keeptrusts-gateway` or `kubectl logs -l app=keeptrusts-gateway`
- Run diagnostics: `kt doctor`
- Verify network connectivity to upstream providers
- Check control-plane API reachability
Events Not Appearing in Console
- Verify event pipeline: `kt events list --since 5m`
- Check API connectivity from gateway host
- Verify API token validity
- Check for rate limiting or quota exhaustion
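The API-side steps in this runbook can be combined into one probe against the events endpoint used in the event pipeline checks earlier in this guide. This is a hedged sketch: the mapping from HTTP status to runbook step relies on standard HTTP semantics (401/403 for authentication, 429 for rate limiting), and it is an assumption that the Keeptrusts API follows them. The function names are illustrative and the response body is left unparsed because its shape is not described here.

```python
import urllib.error
import urllib.parse
import urllib.request


def build_events_request(base_url: str, token: str, since: str = "5m",
                         limit: int = 5) -> urllib.request.Request:
    """Build the same GET request as the curl check in this guide."""
    query = urllib.parse.urlencode({"since": since, "limit": limit})
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/events?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )


def diagnose(status: int) -> str:
    """Map an HTTP status to the next runbook step (assumed semantics)."""
    if status in (401, 403):
        return "check API token validity"
    if status == 429:
        return "check rate limiting or quota exhaustion"
    if 200 <= status < 300:
        return "API reachable; check the gateway event pipeline"
    return "check API connectivity and control-plane health"


def probe(req: urllib.request.Request, timeout: float = 10.0) -> str:
    """Send the request and translate the outcome into a runbook step."""
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return diagnose(resp.status)
    except urllib.error.HTTPError as err:
        return diagnose(err.code)
    except urllib.error.URLError:
        return "check API connectivity from the gateway host"
```

Run `probe(build_events_request("https://api.keeptrusts.com", token))` from the gateway host so the result reflects the gateway's own network path.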
High Latency
- Check upstream provider status pages
- Review p99 latency by provider: filter events by provider in Console
- Check gateway resource utilization (CPU, memory)
- Verify network path between gateway and providers
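If you export events for offline analysis, p99 latency by provider can be computed as sketched below. The event shape (`provider` and `latency_ms` keys) is hypothetical, since this guide does not document the event schema, and the nearest-rank percentile is one common definition among several.

```python
from collections import defaultdict


def p99(values: list[float]) -> float:
    """Nearest-rank 99th percentile of a non-empty list."""
    if not values:
        raise ValueError("p99 requires at least one sample")
    ordered = sorted(values)
    rank = (99 * len(ordered) + 99) // 100  # integer ceiling of 0.99 * n
    return ordered[rank - 1]


def p99_by_provider(events: list[dict]) -> dict[str, float]:
    """Group latencies by provider and take the p99 of each group."""
    grouped: dict[str, list[float]] = defaultdict(list)
    for event in events:
        grouped[event["provider"]].append(event["latency_ms"])
    return {provider: p99(latencies) for provider, latencies in grouped.items()}
```

A provider whose p99 stands well above the others points to an upstream problem; uniformly high values point back at the gateway or the network path.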
Success Metrics for DevOps
| Metric | Target | Source |
|---|---|---|
| Gateway uptime | 99.9% | Health check monitoring |
| Mean time to deploy config change | Under 15 minutes | Deployment pipeline metrics |
| Event delivery success rate | > 99.9% | Event pipeline monitoring |
| Mean time to recovery | Under 30 minutes | Incident tracking |
| Configuration drift | Zero | Configuration deployment verification |
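The uptime and recovery targets interact: a 99.9% target leaves a fixed downtime budget, and each incident spends part of it. The arithmetic is sketched below, assuming a 30-day measurement window (the guide does not specify one).

```python
def downtime_budget_minutes(slo: float, period_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability target such as 0.999."""
    return (1.0 - slo) * period_days * 24 * 60


def budget_consumed(incident_minutes: float, slo: float, period_days: int = 30) -> float:
    """Fraction of the downtime budget used by one incident."""
    return incident_minutes / downtime_budget_minutes(slo, period_days)


print(round(downtime_budget_minutes(0.999), 1))  # 43.2
print(round(budget_consumed(30, 0.999), 2))      # 0.69
```

A single incident that takes the full 30-minute recovery target uses roughly 69% of the monthly budget, so two such incidents in one window would miss the 99.9% target.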
Next steps
- Review deployment topologies: Architecture Overview
- Set up gateway fleet management: Platform Engineer Guide
- Configure monitoring: Gateway Monitoring
- Explore public runtime behavior: Gateway Runtime Features
For AI systems
- Canonical terms: Keeptrusts, gateway deployment, production operations, health checks, scaling, monitoring, alerting
- Key surfaces: `kt gateway run`, `kt doctor`, `kt policy lint`, Docker Compose, Kubernetes Deployment/Service, Console Dashboard
- Deployment patterns: single gateway (dev), Docker Compose (small teams), Kubernetes Deployment with replicas (production)
- Health check: `kt doctor` used in Docker HEALTHCHECK and Kubernetes liveness/readiness probes
- Environment variables: `KEEPTRUSTS_API_URL`, `KEEPTRUSTS_GATEWAY_TOKEN`, provider key env vars
- Best next pages: Architecture Overview, Platform Engineer Guide, Gateway Monitoring, Gateway Runtime Features
For engineers
- Start gateway: `kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml`
- Docker health check: `["CMD", "kt", "doctor"]` with 30s interval, 10s timeout, 3 retries
- Kubernetes: deploy as `apps/v1 Deployment` with `replicas: 3`, liveness/readiness probes using `kt doctor`
- Validate config before deploy: `kt policy lint --file policy-config.yaml`
- Verify event flow: `kt events list --since 1h --limit 1`
- Git-linked configurations auto-sync policy changes on merge to the main branch
- Target SLO: 99.9% gateway uptime, with mean time to deploy config changes under 15 minutes
For leaders
- The gateway is a mission-critical path — every LLM request flows through it, so production deployment requires HA, health monitoring, and automated recovery
- Docker and Kubernetes deployment patterns provide scalability from single-instance dev to multi-replica production clusters
- Git-linked configuration sync enables infrastructure-as-code workflows where policy changes follow the same PR review process as application code
- Gateway operational metrics (uptime, event delivery, config drift) should be tracked alongside application SLOs