Debugging AI Requests with Events

Every request through the Keeptrusts gateway produces a decision event capturing the full request lifecycle: policy evaluation, upstream latency, token usage, and the final decision. This guide shows how to use kt events tail, the console Events page, and filtering to diagnose issues fast.

Use this page when

  • You need to debug why an AI request was blocked, ran slowly, or returned unexpected results.
  • You want to use kt events tail for real-time event streaming during development.
  • You are filtering decision events by token name, model, status, or decision type.
  • You need to trace end-to-end latency and distinguish gateway overhead from provider latency.

Primary audience

  • Primary: Developers debugging AI request issues in development and staging
  • Secondary: SRE Engineers investigating production incidents, Platform Engineers monitoring gateway health

Decision Event Structure

Each event captures the complete request lifecycle:

{
  "id": "evt_a1b2c3d4",
  "timestamp": "2026-04-23T14:30:12Z",
  "model": "gpt-4o",
  "provider": "openai",
  "decision": "allowed",
  "policies_evaluated": [
    {"name": "block-pii-output", "result": "pass"},
    {"name": "max-tokens", "result": "pass"},
    {"name": "log-all", "result": "logged"}
  ],
  "latency_ms": 842,
  "upstream_latency_ms": 780,
  "tokens": {"prompt": 156, "completion": 89, "total": 245},
  "token_name": "app-production",
  "status_code": 200
}

Key Fields

Field                 Description
decision              allowed, blocked, escalated, or modified
policies_evaluated    List of policies and their individual results
latency_ms            Total round-trip time (gateway overhead + upstream)
upstream_latency_ms   Time spent waiting for the LLM provider
token_name            Which API or gateway key was used
status_code           HTTP status returned to the client

Using kt events tail

The CLI provides a real-time event stream for debugging:

Basic Tail

kt events tail

Output streams events as they arrive:

14:30:12 [allowed] gpt-4o 842ms 245tok app-production
14:30:15 [blocked] gpt-4o 12ms 0tok dev-prototyping → block-pii-output
14:30:18 [allowed] gpt-4o-mini 356ms 128tok frontend-key

Filtering Events

Filter by decision type:

# Only blocked requests
kt events tail --filter "decision=blocked"

# Only a specific model
kt events tail --filter "model=gpt-4o"

# Only a specific token
kt events tail --filter "token_name=app-production"

# Combine filters
kt events tail --filter "decision=blocked,model=gpt-4o"

Limiting Output

# Last 20 events
kt events tail --limit 20

# Events from the last hour
kt events tail --since 1h
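
These flags compose with --filter. For example, a quick sketch (assuming --since, --filter, and --output json combine as shown) counting blocked requests from the last hour:

# Count blocked requests from the last hour
kt events tail --since 1h --filter "decision=blocked" --output json | jq 'length'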

JSON Output for Scripting

kt events tail --limit 5 --output json | jq '.[] | {decision, model, latency_ms}'
{"decision": "allowed", "model": "gpt-4o", "latency_ms": 842}
{"decision": "blocked", "model": "gpt-4o", "latency_ms": 12}
{"decision": "allowed", "model": "gpt-4o-mini", "latency_ms": 356}

Console Events Page

The management console provides a visual Events page with advanced filtering.

  1. Open the console at your deployment URL.
  2. Click Events in the sidebar.
  3. The page displays recent events with sortable columns.

Filtering in the Console

Use the filter bar to narrow results:

  • Decision: allowed, blocked, escalated
  • Model: Select from models in use
  • Time range: Last hour, last 24h, last 7 days, or custom
  • Token: Filter by token name
  • Policy: Filter by triggering policy name

Event Detail View

Click any event row to see the full detail:

  • Request summary: model, provider, timestamp
  • Policy evaluation chain: each policy's result in order
  • Timing breakdown: gateway overhead vs. upstream latency
  • Token usage: prompt, completion, and total tokens
  • Error details: for blocked or failed requests

Debugging Common Issues

"Why Was My Request Blocked?"

kt events tail --filter "decision=blocked" --limit 5

Check the policies_evaluated array to find which policy triggered the block:

{
  "decision": "blocked",
  "policies_evaluated": [
    {"name": "block-prompt-injection", "result": "blocked", "reason": "pattern match: ignore previous"}
  ]
}

Fix: Adjust the policy pattern or rephrase the prompt.
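
When the same policy fires repeatedly, ranking blockers across recent events shows where to focus. A sketch, assuming the policies_evaluated shape shown above:

# Rank policies by how often they blocked recent requests
kt events tail --filter "decision=blocked" --limit 50 --output json | \
  jq -r '.[].policies_evaluated[] | select(.result == "blocked") | .name' | \
  sort | uniq -c | sort -rn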

"Why Is My Request Slow?"

Compare latency_ms and upstream_latency_ms:

kt events tail --limit 10 --output json | \
  jq '.[] | {model, total: .latency_ms, upstream: .upstream_latency_ms, overhead: (.latency_ms - .upstream_latency_ms)}'
{"model": "gpt-4o", "total": 842, "upstream": 780, "overhead": 62}
{"model": "gpt-4o", "total": 2340, "upstream": 2290, "overhead": 50}

If overhead is high (>200ms), check:

  • Number of active policies (each adds evaluation time)
  • Knowledge base injection size
  • Network latency between gateway and API

If upstream is high, the provider is slow — consider switching models or using a fallback chain.
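
To surface only the problem events, a sketch that selects requests where gateway overhead exceeds 200ms:

# List events with more than 200ms of gateway overhead
kt events tail --limit 100 --output json | \
  jq '.[] | select((.latency_ms - .upstream_latency_ms) > 200) | {id, model, overhead: (.latency_ms - .upstream_latency_ms)}'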

"Why Am I Getting 401 Errors?"

kt events tail --filter "status_code=401" --limit 5

Common causes:

  • Expired gateway key — check kt tokens inspect --name "your-key"
  • Invalid API key in provider config — verify secret_key_ref is set
  • Revoked token — list active tokens with kt tokens list
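
To see which key is failing, group recent 401s by token name (a sketch using the token_name field described earlier):

# Group recent 401 responses by the token that produced them
kt events tail --filter "status_code=401" --limit 20 --output json | \
  jq -r '.[].token_name' | sort | uniq -c | sort -rn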

"My Knowledge Base Isn't Being Used"

Check the event detail for knowledge base injection:

kt events tail --limit 1 --output json | jq '.[0].knowledge_assets_injected'

If empty, verify:

  1. The asset is promoted: kt knowledge-base list
  2. The asset is bound in config: check knowledge_base.assets in your YAML
  3. The gateway reloaded after config change
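
To confirm whether injection is failing consistently or only intermittently, a sketch that counts recent events with no injected assets (treating a missing knowledge_assets_injected field as empty):

# Count recent events where no knowledge assets were injected
kt events tail --limit 20 --output json | \
  jq '[.[] | select((.knowledge_assets_injected // []) | length == 0)] | length'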

Trace Correlation

When running multiple services, correlate events across the pipeline:

# Find events for a specific request ID
kt events tail --filter "request_id=req_xyz789"

Adding Custom Trace IDs

Pass a trace header through the gateway. The client setup below assumes an OpenAI-compatible SDK pointed at your deployment; the URL and key are placeholders:

from openai import OpenAI

# Placeholder gateway endpoint and key; substitute your deployment's values
client = OpenAI(base_url="https://your-gateway-url/v1", api_key="your-gateway-key")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    extra_headers={"X-Request-ID": "my-trace-id-123"},
)

# Find the event by your custom trace ID
kt events tail --filter "request_id=my-trace-id-123"

Latency Analysis

Histogram of Recent Latencies

kt events tail --limit 100 --output json | \
  jq -r '.[].latency_ms' | \
  awk '{
    if ($1 < 500) bucket="<500ms"
    else if ($1 < 1000) bucket="500ms-1s"
    else if ($1 < 2000) bucket="1s-2s"
    else bucket=">2s"
    print bucket
  }' | sort | uniq -c | sort -rn

P95 Latency

kt events tail --limit 100 --output json | \
  jq -r '.[].latency_ms' | sort -n | \
  awk '{a[NR]=$1} END {print "P95:", a[int(NR*0.95)], "ms"}'
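
The same idiom extends to several percentiles in one pass:

# P50, P95, and P99 from the last 100 events
kt events tail --limit 100 --output json | \
  jq -r '.[].latency_ms' | sort -n | \
  awk '{a[NR]=$1} END {print "P50:", a[int(NR*0.50)], "ms"; print "P95:", a[int(NR*0.95)], "ms"; print "P99:", a[int(NR*0.99)], "ms"}'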

Best Practices

Practice                                   Why
Use kt events tail during development      Real-time feedback on policy behavior
Filter by token name in production         Isolate traffic from specific applications
Compare gateway vs. upstream latency       Distinguish policy overhead from provider slowness
Add X-Request-ID headers                   Enables end-to-end trace correlation
Check events after config changes          Verify new policies are evaluated correctly
Export events for offline analysis         --output json pipes into jq, pandas, etc.

Next steps

For AI systems

  • Canonical terms: decision event, kt events tail, filtering, latency_ms, upstream_latency_ms, token_name, policies_evaluated, X-Request-ID, event stream.
  • CLI commands: kt events tail (real-time), kt events tail --filter "decision=blocked" (filter), kt events tail --filter "token_name=dev-key" (by key), kt events tail --output json (machine-readable).
  • Key event fields: decision (allowed/blocked/escalated/modified), policies_evaluated, latency_ms, upstream_latency_ms, token_name, status_code.
  • Best next pages: Testing AI Code, API Key Management, Multi-Model Routing.

For engineers

  • Use kt events tail during development for real-time feedback on policy behavior.
  • Filter with --filter "decision=blocked" to isolate policy violations; use --filter "token_name=<name>" to trace specific application traffic.
  • Compare latency_ms vs upstream_latency_ms to distinguish gateway policy overhead from provider slowness.
  • Add X-Request-ID headers in your application to enable end-to-end trace correlation with decision events.
  • Use --output json | jq for programmatic analysis and P95 latency calculations.
  • Check events immediately after config changes to verify new policies are being evaluated correctly.

For leaders

  • Decision events provide complete observability without requiring application-level instrumentation.
  • Event-based debugging reduces mean time to resolution (MTTR) for AI-related production incidents.
  • Latency decomposition (gateway vs. provider) identifies whether issues are governance overhead or provider problems.
  • Event filtering by token name enables per-application traffic isolation for targeted troubleshooting.
  • All debugging data is audit-trail quality — the same events used for debugging serve compliance reporting.