Observability

The Agent SDK makes every request observable by default. Policy decisions, costs, and traces are recorded automatically, and your agent reads them back through typed helpers.

How observability flows

Agent Runtime
│
├─ chat() ──────────────────────────────► Gateway
│    Headers:                              │
│      x-keeptrusts-agent-id: agt_abc123   │
│      x-request-id: req_550e8400          ├─ Policy chain
│      traceparent: 00-trace-span-01       │
│                                          │
│                                          ▼
│                                    Decision Event
│                                    {
│                                      event_id,
│                                      request_id,
│                                      agent_id,
│                                      policy_outcome,
│                                      event_cost_attribution,
│                                      input_tokens,
│                                      output_tokens,
│                                    }
│
├─ listEvents() ────────────────────────► Control-Plane API
│                                          │
│                                          ▼
│                                    Event records
│
└─ getStats() ──────────────────────────► Agent-scoped stats

Decision events

Every gateway request produces a decision event that records:

| Field | Description |
| --- | --- |
| event_id | Unique event identifier |
| request_id | The x-request-id from the original request |
| agent_id | The x-keeptrusts-agent-id that made the request |
| policy_outcome | allowed, blocked, redacted, or escalated |
| event_cost_attribution | Cost breakdown (input, output, total) |
| source_spend_log_id | Stable reference to the spend ledger entry |
| input_tokens | Token count for the input |
| output_tokens | Token count for the output |
| model | The model used for inference |
| provider | The upstream provider (e.g., openai, anthropic) |
| created_at | ISO timestamp |
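
The field list can be modelled as a TypeScript type. This is a minimal sketch, assuming the field names map one-to-one onto the event payload. The type names (`DecisionEvent`, `CostAttribution`, `PolicyOutcome`), the choice of decimal strings for costs, and the `countByOutcome` helper are illustrative and are not confirmed SDK exports. Because the field list shows `source_spend_log_id` at the top level while the cost example on this page reads it from the cost object, the sketch marks it optional in both places.

```typescript
// Hypothetical types inferred from the documented fields; not confirmed SDK exports.
type PolicyOutcome = "allowed" | "blocked" | "redacted" | "escalated";

interface CostAttribution {
  input_cost: string; // assumed to be decimal strings, like total_cost in getStats()
  output_cost: string;
  total_cost: string;
  source_spend_log_id?: string; // location assumed; see note in the lead-in
}

interface DecisionEvent {
  event_id: string;
  request_id: string;
  agent_id: string;
  policy_outcome: PolicyOutcome;
  event_cost_attribution: CostAttribution;
  source_spend_log_id?: string;
  input_tokens: number;
  output_tokens: number;
  model: string;
  provider: string;
  created_at: string; // ISO timestamp
}

// Tally a batch of events by policy outcome.
function countByOutcome(
  events: Pick<DecisionEvent, "policy_outcome">[],
): Record<PolicyOutcome, number> {
  const counts: Record<PolicyOutcome, number> = {
    allowed: 0,
    blocked: 0,
    redacted: 0,
    escalated: 0,
  };
  for (const event of events) {
    counts[event.policy_outcome] += 1;
  }
  return counts;
}
```

A helper like `countByOutcome` could be applied to the array returned by `listEvents()` to get a quick view of how often policies intervene.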

Query events by request

const events = await agent.listEvents({ requestId: result.requestId });

Query events by time window

const events = await agent.listEvents({
  since: "2026-05-31T00:00:00Z",
  until: "2026-05-31T23:59:59Z",
  limit: 100,
});

Query events by outcome

const blocked = await agent.listEvents({
  outcome: "blocked",
  // Last 24 hours (86,400,000 ms)
  since: new Date(Date.now() - 86_400_000).toISOString(),
});

Full event detail

const detail = await agent.getEvent(events[0].event_id);
console.log(detail.event_cost_attribution);
console.log(detail.policies_evaluated);
console.log(detail.redactions_applied);

Cost attribution

Per-request costs

Every decision event includes an event_cost_attribution object:

const events = await agent.listEvents({ requestId });
const cost = events[0].event_cost_attribution;

console.log(`Input cost: ${cost.input_cost}`);
console.log(`Output cost: ${cost.output_cost}`);
console.log(`Total cost: ${cost.total_cost}`);
console.log(`Spend log: ${cost.source_spend_log_id}`);

Aggregate costs (agent-scoped)

const stats = await agent.getStats();
console.log(`Total spend: ${stats.total_cost}`);
console.log(`Request count: ${stats.total_requests}`);
console.log(`Avg cost/request: ${stats.average_cost_per_request}`);

Spend attribution rule

Important: Always use event_cost_attribution from decision events for per-request cost tracking. Do not poll wallet APIs for request-level spend.

  • event_cost_attribution → per-request cost (use this)
  • wallet APIs → balance and transaction workflows (not per-request)
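
To roll per-request costs up on the client side, the `total_cost` values from a batch of decision events can be summed. The sketch below assumes costs arrive as non-negative decimal strings (as `total_cost` does in the `getStats()` sample response) and adds them in integer micro-units to avoid floating-point drift. The helper names `toMicros`, `fromMicros`, and `sumTotalCost` are hypothetical, and six decimal places is an assumed precision.

```typescript
// Convert a non-negative decimal string such as "0.003085" to integer micro-units.
// Digits beyond six decimal places are truncated (assumed precision).
function toMicros(amount: string): number {
  const match = /^(\d+)(?:\.(\d+))?$/.exec(amount.trim());
  if (!match) {
    throw new Error(`Not a decimal amount: ${amount}`);
  }
  const whole = match[1];
  const fraction = (match[2] ?? "").padEnd(6, "0").slice(0, 6);
  return Number(whole) * 1_000_000 + Number(fraction);
}

// Convert integer micro-units back to a decimal string with six places.
function fromMicros(micros: number): string {
  const whole = Math.floor(micros / 1_000_000);
  const fraction = String(micros % 1_000_000).padStart(6, "0");
  return `${whole}.${fraction}`;
}

// Sum total_cost across decision events without intermediate float arithmetic.
function sumTotalCost(
  events: { event_cost_attribution: { total_cost: string } }[],
): string {
  let micros = 0;
  for (const event of events) {
    micros += toMicros(event.event_cost_attribution.total_cost);
  }
  return fromMicros(micros);
}
```

Integer micro-units keep sums exact for amounts in this range; if the real payload uses numbers rather than strings, or a different precision, the conversion step would need adjusting.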

Distributed tracing

The SDK automatically generates and propagates W3C Trace Context headers:

traceparent: 00-<trace-id>-<span-id>-01

How trace propagation works

  1. The SDK generates a traceparent header if the request does not already carry one.
  2. The header is sent with the gateway request.
  3. The gateway preserves it through the policy chain and the upstream call.
  4. The decision event records the trace context.
  5. Your observability backend (Jaeger, Grafana Tempo, etc.) can then stitch together the full trace.
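
The first step, reusing a valid traceparent or generating a new one, can be sketched as follows. This is not the SDK's internal implementation; it is an illustration of the W3C Trace Context header format (version `00`, a 32-hex trace ID, a 16-hex parent span ID, and 2-hex flags). The names `parseTraceparent`, `ensureTraceparent`, and `TraceContext` are hypothetical, and `Math.random` is used only to keep the sketch dependency-free.

```typescript
// Version 00 traceparent: 00-<32 lowercase hex>-<16 lowercase hex>-<2 lowercase hex>
const TRACEPARENT_RE = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

interface TraceContext {
  traceId: string;
  spanId: string;
  sampled: boolean;
}

// Parse a traceparent header; returns null if it is malformed.
function parseTraceparent(header: string): TraceContext | null {
  const match = TRACEPARENT_RE.exec(header.trim());
  if (!match) return null;
  const traceId = match[1];
  const spanId = match[2];
  const flags = match[3];
  // All-zero trace or span IDs are invalid in W3C Trace Context.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return null;
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 0x01) === 1 };
}

// Produce `length` random lowercase hex digits, never all zeros.
function randomHexId(length: number): string {
  let id = "";
  while (id.length < length) {
    id += Math.floor(Math.random() * 16).toString(16);
  }
  return /^0+$/.test(id) ? id.slice(0, -1) + "1" : id;
}

// Reuse a valid incoming header, otherwise start a new sampled trace.
function ensureTraceparent(existing?: string): string {
  if (existing && parseTraceparent(existing)) return existing;
  return `00-${randomHexId(32)}-${randomHexId(16)}-01`;
}
```

Parsing before reuse means a malformed incoming header is replaced rather than forwarded, which keeps downstream trace stitching consistent.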

Bring your own trace

If your runtime already has a trace context:

const result = await agent.chat({
  model: "gpt-5.4-mini",
  messages,
  traceparent: "00-abcdef1234567890abcdef1234567890-1234567890abcdef-01",
});

Disable tracing

const agent = createAgentRuntime({
  // ...
  trace: false, // disables traceparent generation
});

Agent stats

Get aggregate statistics for the registered agent:

const stats = await agent.getStats();

Response:

{
  "agent_id": "agt_abc123",
  "total_requests": 1247,
  "total_cost": "3.847",
  "average_cost_per_request": "0.003085",
  "policy_blocks": 12,
  "escalations": 3,
  "models_used": ["gpt-5.4-mini", "gpt-5.4-mini-mini", "claude-sonnet-4-20250514"],
  "period": {
    "start": "2026-05-01T00:00:00Z",
    "end": "2026-05-31T23:59:59Z"
  }
}
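
Derived ratios such as the share of requests that were blocked or escalated can be computed from this response. The sketch below assumes the response shape shown in the sample above; the `AgentStats` and `StatsSummary` types and the `summarizeStats` helper are hypothetical names, not SDK exports.

```typescript
// Subset of the stats response used here; shape assumed from the sample response.
interface AgentStats {
  total_requests: number;
  total_cost: string; // decimal string, e.g. "3.847"
  policy_blocks: number;
  escalations: number;
}

interface StatsSummary {
  blockRate: number; // fraction of requests blocked by policy, 0..1
  escalationRate: number; // fraction of requests escalated, 0..1
}

// Compute intervention rates, guarding against a period with no requests.
function summarizeStats(stats: AgentStats): StatsSummary {
  const total = stats.total_requests;
  if (total <= 0) {
    return { blockRate: 0, escalationRate: 0 };
  }
  return {
    blockRate: stats.policy_blocks / total,
    escalationRate: stats.escalations / total,
  };
}
```

With the sample response (12 blocks and 3 escalations out of 1247 requests), this yields a block rate just under 1%.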

Agent actions

List actions taken by or on behalf of the agent:

const actions = await agent.listActions();

Actions include:

  • Registration events
  • Deployment changes
  • Gateway link/unlink operations
  • Configuration updates
  • Policy override requests

Response:

[
  {
    "action_id": "act_001",
    "type": "deployment_update",
    "description": "Status changed to active, version 2.1.0",
    "created_at": "2026-05-31T14:30:00Z"
  },
  {
    "action_id": "act_002",
    "type": "gateway_linked",
    "description": "Linked to gateway-prod-us-east",
    "created_at": "2026-05-31T14:25:00Z"
  }
]
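
A common follow-up is finding the most recent action of a given type, for example the latest deployment change. The sketch below assumes the action shape shown in the sample above; `AgentAction` and `latestAction` are hypothetical names used for illustration.

```typescript
// Shape assumed from the sample listActions() response.
interface AgentAction {
  action_id: string;
  type: string;
  description: string;
  created_at: string; // ISO timestamp
}

// Return the newest action of the given type, or undefined if there is none.
function latestAction(actions: AgentAction[], type: string): AgentAction | undefined {
  return actions
    .filter((action) => action.type === type) // filter() copies, so the input is not mutated
    .sort((a, b) => Date.parse(b.created_at) - Date.parse(a.created_at))[0];
}
```

Sorting on `created_at` client-side avoids relying on the order in which the API returns actions, which this page does not specify.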

Event callbacks

React to events in real time with the onEvent callback:

const agent = createAgentRuntime({
  // ...
  onEvent: (event) => {
    if (event.policy_outcome === "blocked") {
      alertOps(`Agent request blocked: ${event.event_id}`);
    }
    metrics.record("agent.request.cost", parseFloat(event.event_cost_attribution.total_cost));
  },
});
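
The callback above calls `alertOps` and `metrics.record`, which your application has to supply. One way to structure this is a small factory that takes those sinks as arguments, skips the cost metric when the value is missing or not numeric, and catches its own errors. This is a sketch under assumptions: `createEventHandler`, `EventSinks`, and `EventLike` are hypothetical names, and this page does not say whether the SDK isolates exceptions thrown from `onEvent`, so the handler guards against them itself.

```typescript
// Minimal view of a decision event; only the fields this handler reads.
interface EventLike {
  event_id: string;
  policy_outcome: string;
  event_cost_attribution?: { total_cost?: string };
}

// Application-supplied sinks (hypothetical interface).
interface EventSinks {
  alert: (message: string) => void;
  recordCost: (cost: number) => void;
}

// Build an onEvent-style callback that never throws into the caller.
function createEventHandler(sinks: EventSinks): (event: EventLike) => void {
  return (event: EventLike): void => {
    try {
      if (event.policy_outcome === "blocked") {
        sinks.alert(`Agent request blocked: ${event.event_id}`);
      }
      // Number(undefined) is NaN, so a missing cost is skipped rather than recorded.
      const cost = Number(event.event_cost_attribution?.total_cost);
      if (Number.isFinite(cost)) {
        sinks.recordCost(cost);
      }
    } catch (error) {
      console.error("onEvent handler failed", error);
    }
  };
}
```

The returned function could then be passed as `onEvent` when creating the agent runtime, with `alertOps` and a metrics client wired in as the sinks.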

Integration with observability platforms

OpenTelemetry

The traceparent propagation is compatible with any OpenTelemetry-based backend:

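import { trace } from "@opentelemetry/api";

const tracer = trace.getTracer("keeptrusts-agent");
const span = tracer.startSpan("agent.chat");

try {
  // Build the header from the live span context, including its actual trace flags
  const { traceId, spanId, traceFlags } = span.spanContext();
  const flags = traceFlags.toString(16).padStart(2, "0");

  const result = await agent.chat({
    model: "gpt-5.4-mini",
    messages,
    traceparent: `00-${traceId}-${spanId}-${flags}`,
  });
} finally {
  span.end(); // end the span even if chat() throws
}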

Datadog / Grafana / Jaeger

Use the traceparent from any request to correlate Keeptrusts decision events with your existing distributed traces:

// Your existing trace context
const existingTraceparent = req.headers["traceparent"];

const result = await agent.chat({
  model: "gpt-5.4-mini",
  messages,
  traceparent: existingTraceparent,
});

// Same trace ID now appears in:
// 1. Your observability platform (Datadog, Grafana, Jaeger)
// 2. Keeptrusts decision events
// 3. Keeptrusts trail records