Observability for AI-Governed Systems

Every request through the Keeptrusts gateway produces structured telemetry — events, logs, metrics, and traces. This guide shows how to instrument your stack for full observability from application to LLM provider.

Use this page when

  • You are configuring structured logging, metrics collection, or OpenTelemetry integration for the gateway
  • You need to build Grafana dashboards for gateway performance and policy enforcement outcomes
  • You want to correlate application logs with gateway request IDs
  • You are setting up alerting on governance metrics (block rate, latency, error rate)

Primary audience

  • Primary: Technical Engineers
  • Secondary: AI Agents, Technical Leaders

Observability Architecture

Structured Logging

Gateway Log Format

The gateway emits structured JSON logs with consistent fields:

{
  "timestamp": "2026-04-23T10:15:30.123Z",
  "level": "info",
  "target": "kt_gateway::server",
  "message": "Request completed",
  "request_id": "req_abc123",
  "model": "gpt-4o",
  "provider": "openai",
  "status": 200,
  "latency_ms": 1245,
  "input_tokens": 150,
  "output_tokens": 89,
  "policies_applied": ["content-filter", "pii-redaction"],
  "policy_action": "pass",
  "cache_hit": false
}
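Because every field is machine-parseable, downstream tooling can filter on governance outcomes directly. A minimal TypeScript sketch (field names follow the log format above; the latency threshold is an illustrative assumption, not a gateway default):

```typescript
// Flag gateway log lines that warrant attention: slow requests, and any
// request where a policy did more than pass it through.
interface GatewayLog {
  request_id: string;
  latency_ms: number;
  policy_action: string;
  cache_hit: boolean;
}

const SLOW_MS = 5000; // illustrative threshold

function flagRequest(line: string): string[] {
  const log = JSON.parse(line) as GatewayLog;
  const flags: string[] = [];
  if (log.latency_ms > SLOW_MS) flags.push("slow");
  if (log.policy_action !== "pass") flags.push(`policy:${log.policy_action}`);
  return flags;
}
```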

Log Configuration

gateway:
  logging:
    # Log level: trace, debug, info, warn, error
    level: info
    # Output format: json or pretty
    format: json
    # Include request/response bodies (⚠️ sensitive data)
    log_bodies: false
    # Redact these fields from logs
    redact_fields: [api_key, authorization]
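The same redaction discipline applies to application-side logs. A sketch of an app-side analogue of `redact_fields` (the field list mirrors the config above; the `[REDACTED]` placeholder is this sketch's convention, not the gateway's exact output):

```typescript
// Mask sensitive fields before a log record leaves the process,
// mirroring the gateway's redact_fields option on the application side.
const REDACT_FIELDS = ["api_key", "authorization"];

function redact(record: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = { ...record };
  for (const field of REDACT_FIELDS) {
    if (field in out) out[field] = "[REDACTED]";
  }
  return out;
}
```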

Application-Side Logging

Correlate application logs with gateway request IDs:

import { randomUUID } from 'crypto';

async function callAI(messages: Message[]) {
  const requestId = randomUUID();

  console.log(JSON.stringify({
    level: 'info',
    message: 'Sending AI request',
    request_id: requestId,
    model: 'gpt-4o',
    input_tokens_estimate: estimateTokens(messages),
  }));

  const response = await fetch('http://kt-gateway:41002/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'X-Request-ID': requestId,
    },
    body: JSON.stringify({ model: 'gpt-4o', messages }),
  });

  const data = await response.json();

  console.log(JSON.stringify({
    level: 'info',
    message: 'AI request completed',
    request_id: requestId,
    status: response.status,
    output_tokens: data.usage?.completion_tokens,
  }));

  return data;
}

Metrics Collection

Gateway Metrics Endpoint

The gateway exposes Prometheus-compatible metrics:

curl http://localhost:41002/metrics

Key Metrics

Metric                         Type       Description
kt_requests_total              Counter    Total requests by provider, model, status
kt_request_duration_seconds    Histogram  Request latency distribution
kt_policy_evaluations_total    Counter    Policy evaluations by name, action
kt_policy_duration_seconds     Histogram  Policy evaluation latency
kt_tokens_total                Counter    Tokens processed (input/output)
kt_connections_active          Gauge      Active upstream connections
kt_circuit_breaker_state       Gauge      Circuit breaker state per provider
kt_cache_hits_total            Counter    Cache hits and misses

Prometheus Scrape Configuration

# prometheus.yml
scrape_configs:
  - job_name: 'kt-gateway'
    scrape_interval: 15s
    static_configs:
      - targets: ['kt-gateway:41002']
    metrics_path: /metrics

Grafana Dashboard Panels

Key panels for your AI governance dashboard:

Row 1: Traffic Overview
- Requests/sec by provider (kt_requests_total rate)
- Error rate by provider (kt_requests_total{status=~"5.."})
- Active connections (kt_connections_active)

Row 2: Latency
- P50/P90/P99 latency (kt_request_duration_seconds)
- Policy evaluation latency (kt_policy_duration_seconds)
- Time to first byte for streaming

Row 3: Tokens and Cost
- Tokens/sec by model (kt_tokens_total rate)
- Estimated cost/hour
- Cache hit ratio (kt_cache_hits_total)

Row 4: Governance
- Policy actions (block, redact, pass)
- Circuit breaker states
- Escalation rate
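The panels above map to PromQL queries over the metrics table. Sketches of a few, under the assumptions that the histogram metrics follow Prometheus naming conventions (a `_bucket` suffix) and that `kt_cache_hits_total` carries a label distinguishing hits from misses (verify both against your `/metrics` output):

```promql
# Requests/sec by provider
sum by (provider) (rate(kt_requests_total[5m]))

# Error rate by provider
sum by (provider) (rate(kt_requests_total{status=~"5.."}[5m]))
  / sum by (provider) (rate(kt_requests_total[5m]))

# P99 request latency
histogram_quantile(0.99, sum by (le) (rate(kt_request_duration_seconds_bucket[5m])))

# Cache hit ratio (assumes a result="hit"/"miss" label)
sum(rate(kt_cache_hits_total{result="hit"}[5m]))
  / sum(rate(kt_cache_hits_total[5m]))
```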

OpenTelemetry Integration

Gateway OTLP Export

Configure the gateway to export spans via OTLP:

gateway:
  telemetry:
    otlp:
      enabled: true
      endpoint: http://otel-collector:4317
      protocol: grpc
      # Sampling rate (1.0 = 100%)
      sample_rate: 0.1
      # Additional resource attributes
      resource_attributes:
        service.name: kt-gateway
        deployment.environment: production
        service.version: "0.12.3"
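To see application spans and gateway spans in one trace, the application can propagate W3C trace context on its gateway calls (whether the gateway joins an incoming `traceparent` is an assumption to verify for your version). A minimal sketch that builds the header with Node's crypto module:

```typescript
import { randomBytes } from 'crypto';

// Build a W3C traceparent header: version-traceId-spanId-flags.
// traceId is 16 random bytes, spanId is 8; the 01 flag marks the span sampled.
function makeTraceparent(): string {
  const traceId = randomBytes(16).toString('hex');
  const spanId = randomBytes(8).toString('hex');
  return `00-${traceId}-${spanId}-01`;
}

// Usage: attach alongside X-Request-ID when calling the gateway.
const headers = {
  'Content-Type': 'application/json',
  'traceparent': makeTraceparent(),
};
```

In a real service you would use an OpenTelemetry SDK propagator instead of hand-rolling the header; the sketch only shows the wire format being propagated.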

OTel Collector Configuration

# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 5s
    send_batch_size: 1024

  attributes:
    actions:
      - key: api_key
        action: delete
      - key: authorization
        action: delete

exporters:
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch, attributes]
      exporters: [otlp/jaeger]

Span Structure

Console Dashboard Correlation

Event-Based Observability

The console dashboard provides a unified view of all gateway events:

# Tail events in real time
kt events tail

# Filter events by status
kt events tail --filter "status=blocked"

# Filter by policy action
kt events tail --filter "policy_action=redact"

# Search events with full-text
kt events search "prompt injection" --last 24h

Event Fields for Debugging

Each event in the console contains:

Field             Description
event_id          Unique event identifier
request_id        Correlation ID from the original request
timestamp         When the request was processed
model             Model requested
provider          Provider that served the request
status            HTTP status code
latency_ms        Total request latency
input_tokens      Input token count
output_tokens     Output token count
policies_applied  List of policies evaluated
policy_action     Final policy action (pass/block/redact)
gateway_id        Which gateway processed the request
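Exported events can also feed ad-hoc governance calculations. A sketch computing block rate over a batch of events (the event shape follows the field table above, reduced to the fields this calculation needs):

```typescript
interface ConsoleEvent {
  event_id: string;
  policy_action: 'pass' | 'block' | 'redact';
}

// Fraction of events the gateway blocked outright.
function blockRate(events: ConsoleEvent[]): number {
  if (events.length === 0) return 0;
  const blocked = events.filter(e => e.policy_action === 'block').length;
  return blocked / events.length;
}
```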

Console Debugging Workflow

Alerting Rules

Prometheus Alert Examples

groups:
  - name: kt-gateway
    rules:
      - alert: HighErrorRate
        expr: rate(kt_requests_total{status=~"5.."}[5m]) / rate(kt_requests_total[5m]) > 0.05
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Gateway error rate > 5%"

      - alert: HighLatency
        expr: histogram_quantile(0.99, rate(kt_request_duration_seconds_bucket[5m])) > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "P99 latency > 10s"

      - alert: CircuitBreakerOpen
        expr: kt_circuit_breaker_state == 1
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Circuit breaker open for {{ $labels.provider }}"
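The use cases above also call for alerting on block rate. A sketch of one more rule, assuming the `action` label on `kt_policy_evaluations_total` takes the value `block` and using an illustrative 10% threshold (verify the label names and values against your `/metrics` output):

```yaml
# Additional rule for the kt-gateway group above
- alert: HighBlockRate
  expr: |
    sum(rate(kt_policy_evaluations_total{action="block"}[15m]))
      / sum(rate(kt_policy_evaluations_total[15m])) > 0.10
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: "Policy block rate above 10% for 10 minutes"
```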

Log Aggregation Pipeline

Next steps

For AI systems

  • Canonical terms: structured JSON logs, request_id, Prometheus metrics, gateway.logging.level, gateway.logging.format, OTLP spans, OTel collector, console dashboard, kt events tail, policy_action, cache_hit
  • Key configuration: gateway.logging (level, format, log_bodies, redact_fields), Prometheus scrape config, OTel collector pipeline
  • Best next pages: Distributed Tracing, Performance Engineering, Incident Response

For engineers

  • Gateway emits structured JSON logs with request_id, model, provider, status, latency_ms, policies_applied, and policy_action
  • Set log_bodies: false in production to avoid logging sensitive request/response content
  • Correlate logs: include the same request_id in application-side logs before calling the gateway
  • Metrics: scrape the Prometheus endpoint for kt_requests_total, kt_request_duration_seconds, and kt_tokens_total
  • Console dashboard provides pre-built panels for interaction volume, policy outcomes, provider mix, and cost trends

For leaders

  • Full observability stack (logs, metrics, traces, events) enables proactive governance monitoring rather than reactive incident response
  • Console dashboard provides executive-ready views of AI governance posture without requiring Grafana expertise
  • Structured logging with request_id correlation reduces mean time to resolution (MTTR) when investigating governance incidents