kt gateway run
Start the Keeptrusts policy enforcement gateway. The gateway sits between your application and AI providers, enforcing policies on every request and response.
Use this page when
- You need to start the Keeptrusts gateway locally with kt gateway run.
- You are configuring provider targets, fallback, routing strategies, or runtime identity modes.
- You need the exact flags, environment variables, or config patterns for the gateway runtime.
The recommended pattern is to define providers in policy-config.yaml and use kt gateway run to execute that contract.
For the broader gateway lifecycle surface (create, list, inspect, reload, diff, reconcile, service install/start/stop/status, and managed mode), see CLI Command Groups.
Primary audience
- Primary: AI Agents, Technical Engineers
- Secondary: Technical Leaders
Usage
kt gateway run [OPTIONS]
Options
| Flag | Env Var | Default | Description |
|---|---|---|---|
| --listen <host:port> | — | 0.0.0.0:41002 | Address to listen on |
| --upstream <url> | KEEPTRUSTS_UPSTREAM_URL | https://api.keeptrusts.com in hosted mode | Debug-only upstream override for short-lived experiments; prefer config-defined providers |
| --upstream-api-key <key> | KEEPTRUSTS_UPSTREAM_API_KEY | — | Debug-only credential override; prefer secret_key_ref.env or secret_key_ref.store in config |
| --upstream-api-key-header <header> | KEEPTRUSTS_UPSTREAM_API_KEY_HEADER | Authorization | Header name for upstream credentials |
| --upstream-api-key-prefix <prefix> | KEEPTRUSTS_UPSTREAM_API_KEY_PREFIX | Bearer | Prefix used when constructing the upstream auth header |
| --policy-config <path> | — | — | Declarative config file to load; repeat the flag to layer multiple files |
| --fail-mode <mode> | — | block | Behavior on policy error: block or allow |
| --max-concurrency <n> | KEEPTRUSTS_GATEWAY_MAX_CONCURRENCY | 16 | Maximum concurrent upstream requests before adaptive backoff |
| --api-token <token> | KEEPTRUSTS_GATEWAY_TOKEN | — | Runtime token used for gateway reporting and authenticated inspection |
| — | KEEPTRUSTS_API_URL | — | Keeptrusts API for gateway reporting, config resolution, and admin endpoints |
Examples
Basic — Config-defined provider
export OPENAI_API_KEY="sk-your-openai-key"
kt gateway run \
--listen 0.0.0.0:41002 \
--policy-config policy-config.yaml
In the recommended workflow, policy-config.yaml declares the provider target under providers.targets[], and the gateway reads the credential through secret_key_ref.env or secret_key_ref.store.
With Event Reporting
export KEEPTRUSTS_API_URL="https://api.keeptrusts.com"
export KEEPTRUSTS_GATEWAY_TOKEN="kt_gw_your_gateway_token"
export OPENAI_API_KEY="sk-your-openai-key"
kt gateway run \
--listen 0.0.0.0:41002 \
--policy-config policy-config.yaml
The current kt gateway run command reads KEEPTRUSTS_API_URL from the environment. It exposes --api-token for the runtime token, but it does not expose dedicated --api-url or legacy --api-key flags.
Treat --upstream and --upstream-api-key as break-glass debugging flags only. The standard workflow is to keep provider definitions in policy-config.yaml so the same document works for local, managed, and Git-backed deployments.
Fail-Open Mode
Use --fail-mode allow to let requests through even when policy evaluation fails (e.g., classifier unavailable):
export OPENAI_API_KEY="sk-your-openai-key"
kt gateway run \
--listen 0.0.0.0:41002 \
--policy-config policy-config.yaml \
--fail-mode allow
Multi-Provider with Fallback
When your config defines multiple provider targets, the gateway automatically handles failover:
pack:
  name: multi-provider
  version: 0.1.0
  enabled: true

policies:
  chain:
    - prompt-injection
    - pii-detector

providers:
  targets:
    - id: openai-primary
      provider: openai
      model: gpt-4o
      base_url: https://api.openai.com
      secret_key_ref:
        env: OPENAI_API_KEY
    - id: anthropic-fallback
      provider: anthropic
      model: claude-sonnet-4-20250514
      base_url: https://api.anthropic.com
      secret_key_ref:
        env: ANTHROPIC_API_KEY
      provider_type: anthropic
      format: anthropic
  fallback:
    triggers:
      - rate_limit
      - server_error
      - timeout
    max_fallback_attempts: 3
  routing:
    strategy: ordered
Provider targets can also use secret_key_ref.store to reference a managed config variable instead of an environment variable:
pack:
  name: proxy-run-providers-2
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: openai-primary
      provider: openai
      model: gpt-4o
      base_url: https://api.openai.com
      secret_key_ref:
        store: OPENAI_API_KEY

policies:
  chain:
    - audit-logger

policy:
  audit-logger:
    immutable: true
    retention_days: 365
    log_all_access: true
The gateway resolves secret_key_ref values through the machine config-variable endpoint at startup. The gateway service token must have config_vars:resolve permission.
Nested model definitions
A single provider target can declare multiple models with per-model pricing:
pack:
  name: proxy-run-providers-3
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: openai-primary
      provider: openai
      base_url: https://api.openai.com
      secret_key_ref:
        env: OPENAI_API_KEY

policies:
  chain:
    - audit-logger

policy:
  audit-logger:
    immutable: true
    retention_days: 365
    log_all_access: true
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
kt gateway run \
--listen 0.0.0.0:41002 \
--policy-config policy-config.yaml
Sending Requests Through the Gateway
The gateway exposes an OpenAI-compatible API. Point your application at the gateway instead of the provider:
# Direct to gateway
curl http://localhost:41002/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"messages": [{"role": "user", "content": "Explain quantum computing"}]
}'
Provider and model pinning
Override the default routing strategy by sending request headers:
# Pin to a specific provider target
curl http://localhost:41002/v1/chat/completions \
-H "Content-Type: application/json" \
-H "X-Keeptrusts-Provider: openai-primary" \
-d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'
# Override the model within the selected provider
curl http://localhost:41002/v1/chat/completions \
-H "Content-Type: application/json" \
-H "X-Keeptrusts-Provider: openai-primary" \
-H "X-Keeptrusts-Model: gpt-4o-mini" \
-d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'
Unknown or unauthorized target/model values return 400 Bad Request. Pinning does not bypass policy evaluation.
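The pinning headers can also be sent from SDK clients. A minimal Python sketch, assuming a target id of openai-primary as declared in the configs above; the OpenAI SDK's extra_headers parameter merges the headers into the outgoing request:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:41002/v1",
    api_key="sk-your-openai-key",
)

# Pin this request to one provider target and override its model.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    extra_headers={
        "X-Keeptrusts-Provider": "openai-primary",
        "X-Keeptrusts-Model": "gpt-4o-mini",
    },
)
print(response.choices[0].message.content)
```

For basic, unpinned usage, point the official SDKs at the gateway (Python, then TypeScript):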
from openai import OpenAI
# Point the client at the gateway
client = OpenAI(
base_url="http://localhost:41002/v1",
api_key="sk-your-openai-key", # or gateway handles auth
)
response = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Explain quantum computing"}],
)
print(response.choices[0].message.content)
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "http://localhost:41002/v1",
apiKey: "sk-your-openai-key",
});
const response = await client.chat.completions.create({
model: "gpt-4o",
messages: [{ role: "user", content: "Explain quantum computing" }],
});
console.log(response.choices[0].message.content);
Runtime Diagnostics
While the gateway is running, inspect its state:
# View running config (requires a scoped gateway token)
curl -H "Authorization: Bearer $KEEPTRUSTS_GATEWAY_TOKEN" \
http://localhost:41002/keeptrusts/config | jq
# View provider metrics (latency, errors, request counts)
curl http://localhost:41002/keeptrusts/providers/metrics | jq
# Hot-reload configuration without restart (requires a scoped gateway token)
curl -X POST http://localhost:41002/keeptrusts/config/reload \
-H "Authorization: Bearer $KEEPTRUSTS_GATEWAY_TOKEN" \
-H "Content-Type: application/json" \
-d @policy-config.yaml
MCP Runtime Modes
Keeptrusts now treats HTTP MCP as a first-class runtime rather than an adapter-only escape hatch.
- Use provider: mcp with base_url for the native MCP bridge.
- Add adapter_command only when you intentionally want the legacy adapter-backed fallback path.
- Tool governance is enforced at bridge time, not only after request translation, so MCP tool invocations are subject to the same allow, deny, schema, and semantic validation controls as the rest of the gateway runtime.
Operational guidance:
- Prefer the native bridge for long-lived operator deployments because it is the path surfaced in console telemetry and runtime diagnostics.
- Keep the adapter fallback for compatibility testing or when you need to wrap an existing local MCP runner binary.
- Treat native MCP and fallback MCP as distinct runtime modes during incident triage; they exercise different control surfaces even when they target the same upstream tool server.
Native bridge request flow:
- The gateway normalizes tool requests into JSON-RPC tools/call frames and forwards them to base_url + /mcp unless path_template overrides the bridge path.
- mcp.protocol_version and mcp.session_id are emitted into the request metadata before dispatch so upstream MCP services can correlate the bridge session.
- Tool allowlists, deny rules, schema checks, and tool-security analysis run before the HTTP bridge call; a blocked tool never leaves the gateway process. A sketch of the resulting frame follows this list.
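For illustration, a sketch of the kind of frame the bridge forwards. The method name, bridge path, and the existence of protocol-version and session-id metadata come from the flow above; the tool name, arguments, and exact metadata placement are assumptions:

```python
import json
import urllib.request

# Hypothetical tools/call frame; metadata placement is an assumption.
frame = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_docs",  # hypothetical tool name
        "arguments": {"query": "quantum computing"},
    },
}

# Forwarded to base_url + /mcp unless path_template overrides the path.
req = urllib.request.Request(
    "https://mcp.example.com/mcp",
    data=json.dumps(frame).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp))
```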
Example native MCP target:
pack:
  name: proxy-run-providers-4
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: docs-mcp
      provider: mcp
      base_url: https://mcp.example.com

policies:
  chain:
    - audit-logger

policy:
  audit-logger:
    immutable: true
    retention_days: 365
    log_all_access: true
Example adapter-backed fallback target:
pack:
  name: proxy-run-providers-5
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: local-mcp-adapter
      provider: mcp
      adapter_command: ./local-mcp-runner  # placeholder: path to your local MCP runner binary

policies:
  chain:
    - audit-logger

policy:
  audit-logger:
    immutable: true
    retention_days: 365
    log_all_access: true
Runtime Identity Modes
Keeptrusts supports three identity proof modes for requests passing through the gateway. All modes populate the same user_id field in event and trace records, but they differ in how that identity is established and what identity_proof_method value is recorded in the audit trail.
HeaderSoft mode (default)
X-User-ID and X-User-Role HTTP headers are accepted as identity hints from the upstream caller. This mode is suitable for deployments where the gateway sits behind a trusted reverse proxy that strips and re-sets identity headers before they reach the gateway. Identity is not cryptographically verified by the gateway itself.
# No additional configuration required. HeaderSoft is the default.
# For explicit declaration:
runtime_identity:
  mode: header_soft
Use HeaderSoft when:
- The gateway is deployed behind a trusted reverse proxy (nginx, Envoy, AWS ALB) that enforces authentication upstream.
- The reverse proxy is hardened to strip incoming X-User-ID headers from external callers before re-setting them from its own session state.
- Audit requirements do not mandate cryptographic proof of caller identity at the gateway boundary.
HeaderSoft identity is recorded as identity_proof_method: header_soft in event and trace records. Compliance reviewers can use this field to distinguish header-derived identity from gateway-verified identity.
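A minimal sketch of a HeaderSoft request, with hypothetical identity values; in production the trusted reverse proxy, not the end client, should set these headers:

```python
import json
import urllib.request

body = {"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}
req = urllib.request.Request(
    "http://localhost:41002/v1/chat/completions",
    data=json.dumps(body).encode(),
    headers={
        "Content-Type": "application/json",
        "X-User-ID": "alice@example.com",  # unverified identity hint
        "X-User-Role": "analyst",          # unverified role hint
    },
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["choices"][0]["message"]["content"])
```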
GatewayKey mode
Identity is derived from the gateway key used to authenticate the request. The gateway resolves the key to its associated owner, team, and scope automatically before the policy chain runs. No additional headers are required and verification is performed by the auth layer before enforcement begins.
# Automatically active when a gateway key is presented for authentication.
# To require gateway-key authentication for all requests:
runtime_identity:
  mode: gateway_key
  require_verified_identity: true
Use GatewayKey when:
- Programmatic callers authenticate with gateway-issued API keys.
- You want key-scoped budget enforcement and identity traceability without a full PKI setup.
- Team or org isolation is required and key-scoped identity provides the right granularity.
GatewayKey identity is recorded as identity_proof_method: gateway_key in event and trace records.
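A sketch of a GatewayKey caller, assuming the gateway key is presented as a standard Bearer token; verify the exact header contract for your deployment:

```python
from openai import OpenAI

# The OpenAI SDK sends api_key as "Authorization: Bearer <key>", so a
# gateway-issued key can ride in the same slot a provider key would
# (assumption: the gateway accepts its keys in this header).
client = OpenAI(
    base_url="http://localhost:41002/v1",
    api_key="kt_gw_your_gateway_token",  # gateway-issued key, not a provider key
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```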
SignedAssertion mode (future)
The caller presents a signed JWT or JWKS-verifiable token. The gateway validates the token against a configured JWKS endpoint or published verification key before persisting identity into audit records. This mode is planned for a future release.
# Future configuration shape:
runtime_identity:
  mode: signed_assertion
  jwks_url: https://auth.example.com/.well-known/jwks.json
  require_verified_identity: true
require_verified_identity
Setting require_verified_identity: true causes the gateway to reject requests that cannot be resolved to a GatewayKey or SignedAssertion identity. Requests relying on HeaderSoft hints receive a 401 response when this flag is active.
runtime_identity:
  require_verified_identity: true
This flag is safe to enable incrementally: run with require_verified_identity: false first, inspect the identity_proof_method field in event records to identify callers still relying on HeaderSoft, and switch to enforcement once coverage is complete. A quick probe sketch follows.
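The probe sends a header-only request and checks whether enforcement has taken effect; the 401 behavior is documented above, and the probe identity is hypothetical:

```python
import json
import urllib.error
import urllib.request

body = {"model": "gpt-4o", "messages": [{"role": "user", "content": "ping"}]}
req = urllib.request.Request(
    "http://localhost:41002/v1/chat/completions",
    data=json.dumps(body).encode(),
    headers={"Content-Type": "application/json", "X-User-ID": "probe@example.com"},
)
try:
    urllib.request.urlopen(req)
    print("HeaderSoft request accepted: enforcement is not active yet")
except urllib.error.HTTPError as err:
    if err.code == 401:
        print("401: require_verified_identity is being enforced")
    else:
        raise
```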
Runtime Boundaries
kt gateway run can front OpenAI-compatible upstream providers and a growing set of adapter-backed runtimes. Provider families fall into three categories:
| Category | Examples | Behavior |
|---|---|---|
| Supported | openai, anthropic, exec:, file://, echo, anthropic:claude-agent-sdk, vercel | Loaded and served normally. |
| Supported with adapter | browser, chatkit, websocket | Require adapter_command in the provider config. Missing adapter_command is flagged by kt policy lint and logged as a warning at startup; requests to that target fail with 501. |
| Unsupported at config time | manual-input, sequence, simulated-user, go, ruby, webhook | Rejected when the config is loaded. The gateway will not start if any target uses one of these execution families. |
Use these rules when mapping provider configs into Keeptrusts:
- Use file://... or exec:... for local scripts. Bare go and ruby provider IDs are tracked in the catalog but are not executable by themselves.
- Use adapter-backed targets such as browser, websocket, openai:chatkit:<workflow>, anthropic:claude-agent-sdk, and openai:agents when a local runner process exists.
- Use provider: mcp with base_url for the native HTTP MCP bridge. Add adapter_command only when you intentionally want the older adapter-backed MCP path.
- Do not use bare sequence, simulated-user, slack, or webhook evaluator providers inside kt gateway run; they require human or external-system loops that Keeptrusts does not emulate inside the gateway process.
If you include an unsupported execution family in your config, kt gateway run will exit immediately with a diagnostic that names the target and suggests an alternative. Run kt policy lint --file your-config.yaml beforehand to catch these issues without starting the gateway.
Distributed Gateway Features
When compiled with --features distributed, the gateway supports Redis-backed coordination for multi-instance deployments.
Note: The official production Docker image is built with --features distributed enabled. Redis, S3, GCS, and Qdrant backends are available at runtime without rebuilding the image.
Distributed rate limiting
Add a distributed_rate_limit block to your policy config to enforce shared rate limits across all gateway instances:
distributed_rate_limit:
  backend: valkey
  url_env: KEEPTRUSTS_REDIS_URL
- backend should be valkey; redis is still accepted as a compatibility alias.
- url_env names the environment variable that holds the Valkey/Redis-compatible connection URL.
- On a limit violation, the gateway responds with 429 Too Many Requests, a Retry-After header, and X-RateLimit-Remaining: 0 (a client-side retry sketch follows this list).
- If Valkey is unreachable, the gateway fails open (requests are not blocked) and logs a warning.
- Valkey keys are scoped to tenant, org, and gateway to prevent cross-organization collisions.
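A client-side sketch that honors the documented 429 contract; the one-second fallback when Retry-After is absent is an assumption:

```python
import json
import time
import urllib.error
import urllib.request

def post_with_retry(url: str, body: dict, max_retries: int = 3) -> dict:
    """POST through the gateway, sleeping on 429 per the Retry-After header."""
    data = json.dumps(body).encode()
    for attempt in range(max_retries + 1):
        req = urllib.request.Request(
            url, data=data, headers={"Content-Type": "application/json"}
        )
        try:
            with urllib.request.urlopen(req) as resp:
                return json.load(resp)
        except urllib.error.HTTPError as err:
            if err.code == 429 and attempt < max_retries:
                # Default to a 1 s wait if the header is missing (assumption).
                time.sleep(float(err.headers.get("Retry-After", "1")))
                continue
            raise
    raise RuntimeError("unreachable")
```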
Distributed fingerprint deduplication
When distributed_rate_limit is configured and Redis is reachable, bot-fingerprint deduplication is automatically extended across all instances using Redis SET NX PX (atomic check-and-store with TTL). Each record includes an instance_id for traceability. If Redis is unavailable the fingerprint store falls back to an in-process HashMap (fail-open).
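The SET NX PX primitive is easy to reason about in isolation. A sketch with redis-py (the same commands work against Valkey); the key naming, TTL, and payload here are illustrative assumptions, not the gateway's internal schema:

```python
import redis  # pip install redis; protocol is Valkey-compatible

r = redis.Redis.from_url("redis://localhost:6379")

def record_fingerprint(key: str, instance_id: str, ttl_ms: int = 60_000) -> bool:
    # SET key value NX PX ttl: stores only if the key is absent, with a
    # millisecond TTL, so exactly one instance "wins" per fingerprint.
    return bool(r.set(key, instance_id, nx=True, px=ttl_ms))

if record_fingerprint("fp:tenant-a:abc123", "gw-instance-1"):
    print("first sighting across the fleet")
else:
    print("duplicate: another instance already recorded it")
```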
External embedding configuration
The embedding-detector policy supports an external backend that sends the normalized input to a configurable HTTP embedding endpoint:
policy:
  embedding-detector:
    backend: external
    endpoint: https://embed.example.com/v1/embeddings
    secret_key_ref:
      env: EMBED_API_KEY
    timeout_ms: 3000
    model: text-embedding-3-small
    threshold: 0.85

pack:
  name: proxy-run-example-11
  version: 1.0.0
  enabled: true

policies:
  chain:
    - embedding-detector
The external backend is a feature-gated production integration. It enforces the same allow/block/escalate logic as the local backend.
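For orientation, a sketch of the call shape such an endpoint would see, assuming an OpenAI-compatible embeddings contract; the /v1/embeddings path and model name above suggest this shape, but this page does not guarantee it:

```python
import json
import os
import urllib.request

payload = {"model": "text-embedding-3-small", "input": "normalized input text"}
req = urllib.request.Request(
    "https://embed.example.com/v1/embeddings",
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {os.environ['EMBED_API_KEY']}",
    },
)
# Timeout mirrors the timeout_ms: 3000 setting in the policy config.
with urllib.request.urlopen(req, timeout=3.0) as resp:
    vector = json.load(resp)["data"][0]["embedding"]
print(f"embedding dimensions: {len(vector)}")
```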
Conversation Continuity
Pass the X-Conversation-ID header with requests to scope session continuity to a specific conversation thread. When present, memory recall and history entries are scoped to the conversation rather than the default daily time bucket.
curl -H "X-Conversation-ID: my-conv-123" \
-H "Content-Type: application/json" \
-d '{"model":"gpt-4","messages":[...]}' \
http://localhost:41002/v1/chat/completions
When X-Conversation-ID is absent, the gateway falls back to the default day-bucket session derivation for backward compatibility.
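From an SDK client, the simplest approach is to set the header once on the client so every request carries it; a sketch with the Python SDK's default_headers:

```python
from openai import OpenAI

# Every request from this client is scoped to the same conversation thread.
client = OpenAI(
    base_url="http://localhost:41002/v1",
    api_key="sk-your-openai-key",
    default_headers={"X-Conversation-ID": "my-conv-123"},
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Continue our earlier discussion"}],
)
print(response.choices[0].message.content)
```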
Memory Recall Performance
Memory recall runs all scope queries in parallel and caches results for 60 seconds. When the memory backend is degraded, the gateway enters a fail-fast mode (returning empty recall) after 3 consecutive failures, with a 10-second cooldown before retrying. This behavior is controlled by the fail_open setting in the loads configuration section.
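The fail-fast behavior described above is a small state machine. An illustrative sketch of that behavior, not the gateway's internal implementation:

```python
import time

class FailFastRecall:
    """After `threshold` consecutive failures, skip the backend entirely
    and return empty recall until `cooldown_s` has elapsed."""

    def __init__(self, threshold: int = 3, cooldown_s: float = 10.0):
        self.threshold = threshold
        self.cooldown_s = cooldown_s
        self.failures = 0
        self.skip_until = 0.0

    def recall(self, query_backend) -> list:
        if time.monotonic() < self.skip_until:
            return []  # fail-fast: backend is presumed degraded
        try:
            result = query_backend()
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.skip_until = time.monotonic() + self.cooldown_s
            return []  # degrade to empty recall instead of erroring
        self.failures = 0
        return result
```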
Benchmark Smoke Commands
For local dataplane benchmark smoke runs, execute:
cd cli
cargo bench --bench gateway_routing_bench -- --noplot
cargo bench --bench gateway_cache_bench -- --noplot
These benches measure routing-selection overhead, retry/failover scheduling overhead, and cache hit lookup cost. They are the benchmark commands referenced by the Phase 9 readiness report and the benchmark specification.
For AI systems
- Canonical command: kt gateway run [OPTIONS].
- Flags: --listen, --upstream, --upstream-api-key, --upstream-api-key-header, --upstream-api-key-prefix, --policy-config, --fail-mode, --max-concurrency, --api-token.
- Environment variables: KEEPTRUSTS_API_URL, KEEPTRUSTS_GATEWAY_TOKEN, KEEPTRUSTS_UPSTREAM_URL, KEEPTRUSTS_UPSTREAM_API_KEY, KEEPTRUSTS_UPSTREAM_API_KEY_HEADER, KEEPTRUSTS_UPSTREAM_API_KEY_PREFIX, KEEPTRUSTS_GATEWAY_MAX_CONCURRENCY.
- Config sections: providers.targets[], providers.fallback, providers.routing, providers.circuit_breaker, providers.health_check, distributed_rate_limit, runtime_identity.
- Identity modes: header_soft (default), gateway_key, signed_assertion (future).
- MCP runtime modes: native bridge (provider: mcp + base_url) and adapter fallback (adapter_command).
- Headers: X-Keeptrusts-Provider, X-Keeptrusts-Model, X-Conversation-ID.
- Related pages: Managed Mode, Multi-Provider Fallback, Streaming & SSE, WebSocket Gateway.
For engineers
- Prerequisites: a valid policy-config.yaml (passes kt policy lint), provider credentials as environment variables or config variables, and optionally KEEPTRUSTS_API_URL / KEEPTRUSTS_GATEWAY_TOKEN for gateway reporting.
- Validate: after starting, send a test request via curl http://localhost:41002/v1/chat/completions. Check runtime config with curl http://localhost:41002/keeptrusts/config | jq.
- Hot-reload: curl -X POST http://localhost:41002/keeptrusts/config/reload -H "Authorization: Bearer $KEEPTRUSTS_GATEWAY_TOKEN" -H "Content-Type: application/json" -d @policy-config.yaml.
- Troubleshooting: if the gateway exits immediately, check for unsupported provider families (run kt policy lint first). If 503s occur, inspect circuit breaker state via /keeptrusts/providers/metrics. If --upstream is set, remember it overrides config-defined providers.
- Benchmarks: cargo bench --bench gateway_routing_bench -- --noplot for local performance validation.
For leaders
- kt gateway run is the runtime enforcement surface; this is where policies are applied to live AI traffic.
- Fail-mode selection (block vs allow) is a business decision: availability vs. enforcement strictness.
- Runtime identity modes (HeaderSoft, GatewayKey) determine audit trail quality and tenant isolation guarantees; GatewayKey provides gateway-verified proof of caller identity.
- Distributed features (Redis-backed rate limiting, fingerprint dedup) require infrastructure investment but enable multi-instance scaling.
- MCP governance extends tool-use controls to agent workflows — tool calls are subject to the same policy chain as LLM requests.
Next steps
- Managed Mode — Automatic config polling for production
- Multi-Provider Fallback — Failover strategies
- Streaming & SSE — Real-time streaming through the gateway
- WebSocket Gateway — Bidirectional WebSocket proxying
- kt policy lint — Validate config before starting
- CLI overview
- AI agents path
- Technical engineers path
- Technical leaders path