# Service Mesh Integration for AI Gateway
Service meshes provide traffic management, mutual TLS, and observability at the infrastructure layer. This guide covers integrating the Keeptrusts gateway with Istio and Linkerd, routing AI traffic through the mesh, enforcing mTLS between services, leveraging mesh observability, and running the gateway as a mesh-aware sidecar.
## Use this page when
- You are integrating the Keeptrusts gateway with Istio or Linkerd service meshes
- You need to configure mTLS, VirtualService routing, or AuthorizationPolicy for AI traffic
- You want to run the gateway as a mesh-aware sidecar alongside application containers
## Primary audience
- Primary: Technical Engineers
- Secondary: AI Agents, Technical Leaders
## Why Service Mesh + AI Gateway?
The service mesh and the AI gateway operate at different layers:
| Layer | Service Mesh (Istio/Linkerd) | AI Gateway (Keeptrusts) |
|---|---|---|
| Transport | mTLS, retries, circuit breaking | N/A |
| Routing | Traffic splitting, canary, fault injection | Provider routing |
| Policy | Network-level AuthorizationPolicy | Content-level AI policies |
| Observability | Request metrics, distributed traces | Decision events, cost tracking |
They are complementary. The mesh secures the transport; the gateway governs the content.
## Istio Integration

### Gateway Deployment in the Mesh

Annotate the Keeptrusts gateway for Istio sidecar injection:
```yaml
# gateway-deployment-istio.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: keeptrusts-gateway
  namespace: keeptrusts
  labels:
    app: keeptrusts-gateway
spec:
  replicas: 3
  selector:
    matchLabels:
      app: keeptrusts-gateway
  template:
    metadata:
      labels:
        app: keeptrusts-gateway
      annotations:
        sidecar.istio.io/inject: "true"
    spec:
      containers:
        - name: gateway
          image: keeptrusts/gateway:latest
          ports:
            - containerPort: 41002
              name: http
          env:
            - name: KEEPTRUSTS_API_URL
              value: "https://keeptrusts-api.internal:8080"
            - name: KEEPTRUSTS_GATEWAY_TOKEN
              valueFrom:
                secretKeyRef:
                  name: keeptrusts-secrets
                  key: api-key
          volumeMounts:
            - name: config
              mountPath: /etc/keeptrusts
              readOnly: true
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 256Mi
      volumes:
        - name: config
          configMap:
            name: keeptrusts-policy-config
```
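The VirtualService and DestinationRule examples in this guide address the gateway by its cluster DNS name, which assumes a Service in front of the Deployment. A minimal sketch (the Service name and port mirror the Deployment above; adjust to your conventions):

```yaml
# gateway-service.yaml -- sketch; name and port assumed from the Deployment above
apiVersion: v1
kind: Service
metadata:
  name: keeptrusts-gateway
  namespace: keeptrusts
spec:
  selector:
    app: keeptrusts-gateway
  ports:
    - name: http
      port: 41002
      targetPort: 41002
```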
### VirtualService for Traffic Routing

Route AI traffic through the gateway with an Istio VirtualService:
```yaml
# virtualservice-gateway.yaml
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: keeptrusts-gateway
  namespace: keeptrusts
spec:
  hosts:
    - keeptrusts-gateway.keeptrusts.svc.cluster.local
  http:
    - match:
        - uri:
            prefix: /v1/chat/completions
      route:
        - destination:
            host: keeptrusts-gateway.keeptrusts.svc.cluster.local
            port:
              number: 41002
      timeout: 120s
      retries:
        attempts: 2
        perTryTimeout: 60s
        retryOn: "5xx,reset,connect-failure"
    - match:
        - uri:
            prefix: /health
      route:
        - destination:
            host: keeptrusts-gateway.keeptrusts.svc.cluster.local
            port:
              number: 41002
```
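The mesh can also apply circuit breaking to the gateway (the "Transport" row in the comparison table above). A sketch using Istio outlier detection; the thresholds are illustrative, not recommendations:

```yaml
# destinationrule-circuit-breaking.yaml -- sketch with illustrative thresholds
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: keeptrusts-gateway-cb
  namespace: keeptrusts
spec:
  host: keeptrusts-gateway.keeptrusts.svc.cluster.local
  trafficPolicy:
    connectionPool:
      http:
        http1MaxPendingRequests: 100
        maxRequestsPerConnection: 10
    outlierDetection:
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 60s
```

Tune the connection pool to your expected LLM request concurrency; long-lived streaming responses hold connections open far longer than typical HTTP traffic.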
### Canary Deployments for Policy Changes

Use Istio traffic splitting to canary new policy configurations:
```yaml
# canary-policy-update.yaml
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: keeptrusts-gateway-canary
  namespace: keeptrusts
spec:
  hosts:
    - keeptrusts-gateway.keeptrusts.svc.cluster.local
  http:
    - route:
        - destination:
            host: keeptrusts-gateway.keeptrusts.svc.cluster.local
            subset: stable
          weight: 90
        - destination:
            host: keeptrusts-gateway.keeptrusts.svc.cluster.local
            subset: canary
          weight: 10
---
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: keeptrusts-gateway
  namespace: keeptrusts
spec:
  host: keeptrusts-gateway.keeptrusts.svc.cluster.local
  subsets:
    - name: stable
      labels:
        version: stable
    - name: canary
      labels:
        version: canary
```
Deploy the canary with the new policy config and gradually increase the weight as confidence grows.
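The subsets above select on a `version` label that the base gateway Deployment in this guide does not set. One way to satisfy them (a sketch; the label values are assumptions) is to run two Deployments whose pod templates carry the matching labels:

```yaml
# Pod-template label fragments for the two Deployments (sketch)
# stable Deployment:
template:
  metadata:
    labels:
      app: keeptrusts-gateway
      version: stable
---
# canary Deployment (mounts the new policy ConfigMap):
template:
  metadata:
    labels:
      app: keeptrusts-gateway
      version: canary
```

Both pod templates keep the `app: keeptrusts-gateway` label so the Service continues to select them; only the `version` label differentiates the subsets.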
### AuthorizationPolicy

Restrict which namespaces can send traffic to the gateway:
```yaml
# authz-policy.yaml
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: keeptrusts-gateway-authz
  namespace: keeptrusts
spec:
  selector:
    matchLabels:
      app: keeptrusts-gateway
  action: ALLOW
  rules:
    - from:
        - source:
            namespaces:
              - applications
              - data-science
              - customer-support
      to:
        - operation:
            ports: ["41002"]
```
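For a zero-trust baseline, Istio setups commonly pair an ALLOW policy like this with a namespace-wide deny-all default, so anything not explicitly allowed is rejected. A common sketch:

```yaml
# deny-all default for the keeptrusts namespace: an empty spec matches all
# workloads and, with no ALLOW rules, denies every request not permitted
# by another policy
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: deny-all
  namespace: keeptrusts
spec: {}
```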
## mTLS Configuration

### Strict mTLS

Enforce mutual TLS for all inbound traffic to the gateway:
```yaml
# peer-authentication.yaml
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: keeptrusts-strict-mtls
  namespace: keeptrusts
spec:
  selector:
    matchLabels:
      app: keeptrusts-gateway
  mtls:
    mode: STRICT
```
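To enforce the same guarantee for every workload in the namespace rather than just the gateway, drop the selector. A sketch:

```yaml
# strict mTLS for the whole keeptrusts namespace (no selector = all workloads)
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: default
  namespace: keeptrusts
spec:
  mtls:
    mode: STRICT
```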
### DestinationRule for mTLS to API

Ensure traffic from the gateway to the Keeptrusts API also uses mTLS:
```yaml
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: keeptrusts-api-mtls
  namespace: keeptrusts
spec:
  host: keeptrusts-api.keeptrusts.svc.cluster.local
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
```
## Linkerd Integration

### Annotate for Linkerd Injection
```yaml
# gateway-deployment-linkerd.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: keeptrusts-gateway
  namespace: keeptrusts
  annotations:
    linkerd.io/inject: enabled
spec:
  replicas: 3
  selector:
    matchLabels:
      app: keeptrusts-gateway
  template:
    metadata:
      labels:
        app: keeptrusts-gateway
      annotations:
        linkerd.io/inject: enabled
    spec:
      containers:
        - name: gateway
          image: keeptrusts/gateway:latest
          ports:
            - containerPort: 41002
              name: http
          env:
            - name: KEEPTRUSTS_API_URL
              value: "https://keeptrusts-api.internal:8080"
            - name: KEEPTRUSTS_GATEWAY_TOKEN
              valueFrom:
                secretKeyRef:
                  name: keeptrusts-secrets
                  key: api-key
          volumeMounts:
            - name: config
              mountPath: /etc/keeptrusts
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: keeptrusts-policy-config
```
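Linkerd can restrict gateway access much like Istio's AuthorizationPolicy, using its `policy.linkerd.io` Server and ServerAuthorization CRDs. A sketch; the API versions and the client identity pattern are assumptions to verify against your Linkerd release:

```yaml
# Restrict the gateway port to meshed clients (sketch; verify API versions
# and identity names against your Linkerd version)
apiVersion: policy.linkerd.io/v1beta1
kind: Server
metadata:
  name: keeptrusts-gateway-http
  namespace: keeptrusts
spec:
  podSelector:
    matchLabels:
      app: keeptrusts-gateway
  port: http
  proxyProtocol: HTTP/1
---
apiVersion: policy.linkerd.io/v1beta1
kind: ServerAuthorization
metadata:
  name: keeptrusts-gateway-authz
  namespace: keeptrusts
spec:
  server:
    name: keeptrusts-gateway-http
  client:
    meshTLS:
      identities:
        # hypothetical identity for service accounts in the applications namespace
        - "*.applications.serviceaccount.identity.linkerd.cluster.local"
```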
### Linkerd ServiceProfile

Define a ServiceProfile for fine-grained routing and retries:
```yaml
# service-profile.yaml
apiVersion: linkerd.io/v1alpha2
kind: ServiceProfile
metadata:
  name: keeptrusts-gateway.keeptrusts.svc.cluster.local
  namespace: keeptrusts
spec:
  routes:
    - name: chat-completions
      condition:
        method: POST
        pathRegex: /v1/chat/completions
      responseClasses:
        - condition:
            status:
              min: 500
              max: 599
          isFailure: true
      timeout: 120s
    - name: health
      condition:
        method: GET
        pathRegex: /health
      timeout: 5s
```
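ServiceProfiles can also drive Linkerd's retries. A sketch marking a route retryable under a retry budget (the values are illustrative):

```yaml
# ServiceProfile fragment: retry budget + retryable route (illustrative values)
spec:
  retryBudget:
    retryRatio: 0.2
    minRetriesPerSecond: 10
    ttl: 10s
  routes:
    - name: chat-completions
      condition:
        method: POST
        pathRegex: /v1/chat/completions
      isRetryable: true
```

Retrying non-idempotent POST requests can duplicate LLM calls and their cost, so weigh this carefully before marking completion routes retryable.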
## Mesh Observability

### Mesh Metrics

With an Istio or Linkerd sidecar, you get request-level metrics for the gateway automatically. The following examples use Istio's metric names:
```promql
# Request rate to the gateway
rate(istio_requests_total{destination_service="keeptrusts-gateway.keeptrusts.svc.cluster.local"}[5m])

# P99 latency
histogram_quantile(0.99,
  rate(istio_request_duration_milliseconds_bucket{
    destination_service="keeptrusts-gateway.keeptrusts.svc.cluster.local"
  }[5m])
)

# Error rate (5xx requests per second)
sum(rate(istio_requests_total{
  destination_service="keeptrusts-gateway.keeptrusts.svc.cluster.local",
  response_code=~"5.."
}[5m]))
```
### Grafana Dashboard

Create a dashboard combining mesh metrics with Keeptrusts decision events:
```json
{
  "panels": [
    {
      "title": "Gateway Request Rate",
      "type": "timeseries",
      "targets": [
        { "expr": "rate(istio_requests_total{destination_service=~\"keeptrusts-gateway.*\"}[5m])" }
      ]
    },
    {
      "title": "Policy Decisions",
      "type": "piechart",
      "targets": [
        { "expr": "sum by (decision) (keeptrusts_policy_decisions_total)" }
      ]
    },
    {
      "title": "P99 Latency",
      "type": "gauge",
      "targets": [
        { "expr": "histogram_quantile(0.99, rate(istio_request_duration_milliseconds_bucket{destination_service=~\"keeptrusts-gateway.*\"}[5m]))" }
      ]
    }
  ]
}
```
### Distributed Tracing
Istio propagates trace headers (B3, W3C TraceContext) through the mesh. The Keeptrusts gateway preserves these headers, enabling end-to-end traces from the calling application through the gateway to the upstream LLM provider:
```
Application → [Istio proxy] → Keeptrusts Gateway → [Istio proxy] → LLM Provider
     │                              │                                    │
     └──────── trace-id: abc123 ────┴────────────────────────────────────┘
```
Configure your tracing backend (Jaeger, Tempo, Zipkin) to ingest spans from both the mesh proxies and the Keeptrusts gateway for a unified view of AI request flows.
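If traces are sparse, the mesh's sampling rate may need raising. A sketch using Istio's Telemetry API; the provider name `otel` is an assumption and must match an extension provider defined in your mesh config:

```yaml
# Raise trace sampling for the keeptrusts namespace (sketch; provider name
# "otel" is assumed -- it must exist in your meshConfig extensionProviders)
apiVersion: telemetry.istio.io/v1
kind: Telemetry
metadata:
  name: keeptrusts-tracing
  namespace: keeptrusts
spec:
  tracing:
    - providers:
        - name: otel
      randomSamplingPercentage: 10.0
```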
## Gateway as Mesh-Aware Sidecar

For maximum locality, inject the gateway alongside application pods within the mesh:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-ai-app
spec:
  selector:
    matchLabels:
      app: my-ai-app
  template:
    metadata:
      labels:
        app: my-ai-app
      annotations:
        # injection is controlled by the pod template annotation,
        # not by annotations on the Deployment itself
        sidecar.istio.io/inject: "true"
    spec:
      containers:
        - name: app
          image: my-ai-app:latest
          env:
            - name: OPENAI_BASE_URL
              value: "http://localhost:41002/v1"
        - name: keeptrusts-gateway
          image: keeptrusts/gateway:latest
          ports:
            - containerPort: 41002
          env:
            - name: KEEPTRUSTS_API_URL
              value: "https://keeptrusts-api.internal:8080"
            - name: KEEPTRUSTS_GATEWAY_TOKEN
              valueFrom:
                secretKeyRef:
                  name: keeptrusts-secrets
                  key: api-key
          volumeMounts:
            - name: config
              mountPath: /etc/keeptrusts
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: keeptrusts-policy-config
```
In this topology, the pod runs three containers: the application, the Keeptrusts gateway, and the Istio/Linkerd proxy. The application talks to localhost for AI governance; the mesh proxy handles mTLS and observability for all outbound traffic. Each layer does what it does best.
## For AI systems

- Canonical terms: service mesh, Istio, Linkerd, mTLS, VirtualService, AuthorizationPolicy, sidecar injection, mesh-aware, zero-trust
- Istio annotation: `sidecar.istio.io/inject: "true"` on the gateway pod spec
- Layer separation: mesh handles transport security (mTLS, retries, circuit breaking); gateway handles content-level AI policy enforcement
- Sidecar topology: application pod runs app + gateway + mesh proxy (3 containers)
- Related pages: Kubernetes Deployment, Monitoring & Alerting, Multi-Tenant Gateway
## For engineers

- Annotate the gateway Deployment with `sidecar.istio.io/inject: "true"` for automatic Istio proxy injection
- Create a VirtualService to route AI traffic (`/v1/chat/completions`, `/v1/completions`) to the gateway service
- Define an AuthorizationPolicy to restrict which namespaces can access the gateway (zero-trust)
- For the sidecar pattern: add the gateway container to your application pod, route to `localhost:41002`, and let the mesh proxy handle outbound mTLS
- Linkerd: use the `linkerd.io/inject: enabled` annotation and a ServiceProfile for retry/timeout policy
- Validate: confirm mTLS is active with `istioctl proxy-status` and verify gateway metrics appear in mesh observability
## For leaders
- Service mesh + AI gateway provides defense-in-depth: transport-level security (mTLS) plus content-level governance
- Zero-trust networking via AuthorizationPolicy ensures only authorized services can reach the AI gateway
- Mesh observability (request metrics, distributed traces) complements gateway decision events for full-stack visibility
- Sidecar topology adds per-pod governance with zero network hops — ideal for latency-sensitive, high-security workloads
- No code changes required in applications — mesh and gateway are infrastructure-layer concerns
## Next steps
- Deploy the gateway with Kubernetes Deployment manifests and Helm charts
- Set up Monitoring & Alerting using both mesh metrics and gateway Prometheus endpoint
- Manage per-team gateways with Multi-Tenant Gateway fleet patterns