kt gateway run

Start the Keeptrusts policy enforcement gateway. The gateway sits between your application and AI providers, enforcing policies on every request and response.

Use this page when

  • You need to start the Keeptrusts gateway locally with kt gateway run.
  • You are configuring provider targets, fallback, routing strategies, or runtime identity modes.
  • You need the exact flags, environment variables, or config patterns for the gateway runtime.

The recommended pattern is to define providers in policy-config.yaml and use kt gateway run to execute that contract.

For the broader gateway lifecycle surface (create, list, inspect, reload, diff, reconcile, service install/start/stop/status, and managed mode), see CLI Command Groups.

Primary audience

  • Primary: AI Agents, Technical Engineers
  • Secondary: Technical Leaders

Usage

kt gateway run [OPTIONS]

Options

  • --listen <host:port> (default: 0.0.0.0:41002). Address to listen on.
  • --upstream <url> (env: KEEPTRUSTS_UPSTREAM_URL; default: https://api.keeptrusts.com in hosted mode). Debug-only upstream override for short-lived experiments; prefer config-defined providers.
  • --upstream-api-key <key> (env: KEEPTRUSTS_UPSTREAM_API_KEY). Debug-only credential override; prefer secret_key_ref.env or secret_key_ref.store in config.
  • --upstream-api-key-header <header> (env: KEEPTRUSTS_UPSTREAM_API_KEY_HEADER; default: Authorization). Header name for upstream credentials.
  • --upstream-api-key-prefix <prefix> (env: KEEPTRUSTS_UPSTREAM_API_KEY_PREFIX; default: "Bearer " with a trailing space). Prefix used when constructing the upstream auth header.
  • --policy-config <path>. Declarative config file to load; repeat the flag to layer multiple files.
  • --fail-mode <mode> (default: block). Behavior on policy error: block or allow.
  • --max-concurrency <n> (env: KEEPTRUSTS_GATEWAY_MAX_CONCURRENCY; default: 16). Maximum concurrent upstream requests before adaptive backoff.
  • --api-token <token> (env: KEEPTRUSTS_GATEWAY_TOKEN). Runtime token used for gateway reporting and authenticated inspection.
  • KEEPTRUSTS_API_URL (environment variable only, no flag). Keeptrusts API for gateway reporting, config resolution, and admin endpoints.

Examples

Basic — Config-defined provider

export OPENAI_API_KEY="sk-your-openai-key"

kt gateway run \
  --listen 0.0.0.0:41002 \
  --policy-config policy-config.yaml

In the recommended workflow, policy-config.yaml declares the provider target under providers.targets[], and the gateway reads the credential through secret_key_ref.env or secret_key_ref.store.

With Event Reporting

export KEEPTRUSTS_API_URL="https://api.keeptrusts.com"
export KEEPTRUSTS_GATEWAY_TOKEN="kt_gw_your_gateway_token"
export OPENAI_API_KEY="sk-your-openai-key"

kt gateway run \
  --listen 0.0.0.0:41002 \
  --policy-config policy-config.yaml

The current kt gateway run command reads KEEPTRUSTS_API_URL from the environment. It exposes --api-token for the runtime token, but it does not expose dedicated --api-url or legacy --api-key flags.

Treat --upstream and --upstream-api-key as break-glass debugging flags only. The standard workflow is to keep provider definitions in policy-config.yaml so the same document works for local, managed, and Git-backed deployments.

Fail-Open Mode

Use --fail-mode allow to let requests through even when policy evaluation fails (e.g., classifier unavailable):

export OPENAI_API_KEY="sk-your-openai-key"

kt gateway run \
  --listen 0.0.0.0:41002 \
  --policy-config policy-config.yaml \
  --fail-mode allow

Fail-open mode means policy failures will not block requests. Use only when availability is more important than enforcement.

Multi-Provider with Fallback

When your config defines multiple provider targets, the gateway automatically handles failover:

pack:
  name: multi-provider
  version: 0.1.0
  enabled: true
policies:
  chain:
    - prompt-injection
    - pii-detector
providers:
  targets:
    - id: openai-primary
      provider: openai
      model: gpt-4o
      base_url: https://api.openai.com
      secret_key_ref:
        env: OPENAI_API_KEY
    - id: anthropic-fallback
      provider: anthropic
      model: claude-sonnet-4-20250514
      base_url: https://api.anthropic.com
      secret_key_ref:
        env: ANTHROPIC_API_KEY
      provider_type: anthropic
      format: anthropic
  fallback:
    triggers:
      - rate_limit
      - server_error
      - timeout
    max_fallback_attempts: 3
  routing:
    strategy: ordered

Provider targets can also use secret_key_ref.store to reference a managed config variable instead of an environment variable:

pack:
  name: proxy-run-providers-2
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: openai-primary
      provider: openai
      model: gpt-4o
      base_url: https://api.openai.com
      secret_key_ref:
        store: OPENAI_API_KEY
policies:
  chain:
    - audit-logger
policy:
  audit-logger:
    immutable: true
    retention_days: 365
    log_all_access: true

The gateway resolves secret_key_ref values through the machine config-variable endpoint at startup. The gateway service token must have config_vars:resolve permission.

Nested model definitions

A single provider target can declare multiple models with per-model pricing:

pack:
  name: proxy-run-providers-3
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: openai-primary
      provider: openai
      base_url: https://api.openai.com
      secret_key_ref:
        env: OPENAI_API_KEY
policies:
  chain:
    - audit-logger
policy:
  audit-logger:
    immutable: true
    retention_days: 365
    log_all_access: true

export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."

kt gateway run \
  --listen 0.0.0.0:41002 \
  --policy-config policy-config.yaml

Sending Requests Through the Gateway

The gateway exposes an OpenAI-compatible API. Point your application at the gateway instead of the provider:

# Direct to gateway
curl http://localhost:41002/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"messages": [{"role": "user", "content": "Explain quantum computing"}]
}'

Provider and model pinning

Override the default routing strategy by sending request headers:

# Pin to a specific provider target
curl http://localhost:41002/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-Keeptrusts-Provider: openai-primary" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'

# Override the model within the selected provider
curl http://localhost:41002/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-Keeptrusts-Provider: openai-primary" \
  -H "X-Keeptrusts-Model: gpt-4o-mini" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'

Unknown or unauthorized target/model values return 400 Bad Request. Pinning does not bypass policy evaluation.

Python:

from openai import OpenAI

# Point the client at the gateway
client = OpenAI(
    base_url="http://localhost:41002/v1",
    api_key="sk-your-openai-key",  # or gateway handles auth
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
)
print(response.choices[0].message.content)

TypeScript:

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:41002/v1",
  apiKey: "sk-your-openai-key",
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Explain quantum computing" }],
});
console.log(response.choices[0].message.content);

Runtime Diagnostics

While the gateway is running, inspect its state:

# View running config (requires a scoped gateway token)
curl -H "Authorization: Bearer $KEEPTRUSTS_GATEWAY_TOKEN" \
  http://localhost:41002/keeptrusts/config | jq

# View provider metrics (latency, errors, request counts)
curl http://localhost:41002/keeptrusts/providers/metrics | jq

# Hot-reload configuration without restart (requires a scoped gateway token)
curl -X POST http://localhost:41002/keeptrusts/config/reload \
  -H "Authorization: Bearer $KEEPTRUSTS_GATEWAY_TOKEN" \
  -H "Content-Type: application/json" \
  -d @policy-config.yaml

MCP Runtime Modes

Keeptrusts now treats HTTP MCP as a first-class runtime rather than an adapter-only escape hatch.

  • Use provider: mcp with base_url for the native MCP bridge.
  • Add adapter_command only when you intentionally want the legacy adapter-backed fallback path.
  • Tool governance is enforced at bridge time, not only after request translation, so MCP tool invocations are subject to the same allow, deny, schema, and semantic validation controls as the rest of the gateway runtime.

Operational guidance:

  • Prefer the native bridge for long-lived operator deployments because it is the path surfaced in console telemetry and runtime diagnostics.
  • Keep the adapter fallback for compatibility testing or when you need to wrap an existing local MCP runner binary.
  • Treat native MCP and fallback MCP as distinct runtime modes during incident triage; they exercise different control surfaces even when they target the same upstream tool server.

Native bridge request flow:

  • The gateway normalizes tool requests into JSON-RPC tools/call frames and forwards them to base_url + /mcp unless path_template overrides the bridge path.
  • mcp.protocol_version and mcp.session_id are emitted into the request metadata before dispatch so upstream MCP services can correlate the bridge session.
  • Tool allowlists, deny rules, schema checks, and tool-security analysis run before the HTTP bridge call. A blocked tool never leaves the gateway process.
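As a rough illustration, a normalized tools/call frame could look like the sketch below. The JSON-RPC envelope (jsonrpc, id, method, params) is standard MCP; placing the mcp.protocol_version and mcp.session_id keys under a _meta object is an assumption for illustration, not a documented wire format.

```python
import json

# Hypothetical sketch of the JSON-RPC frame the bridge could emit.
# The "_meta" keys mirror the mcp.protocol_version / mcp.session_id
# metadata described above; exact field placement is an assumption.
frame = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_docs",                 # tool name (example value)
        "arguments": {"query": "rate limits"}, # tool arguments (example)
        "_meta": {
            "mcp.protocol_version": "2025-03-26",
            "mcp.session_id": "sess-123",
        },
    },
}

print(json.dumps(frame, indent=2))
```

A frame like this would be POSTed to base_url + /mcp (or the path_template override) only after the allow/deny and schema checks pass.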

Example native MCP target:

pack:
  name: proxy-run-providers-4
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: docs-mcp
      provider: mcp
      base_url: https://mcp.example.com
policies:
  chain:
    - audit-logger
policy:
  audit-logger:
    immutable: true
    retention_days: 365
    log_all_access: true

Example adapter-backed fallback target:

pack:
  name: proxy-run-providers-5
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: local-mcp-adapter
      provider: mcp
      # adapter_command: <local MCP runner command>  — selects the legacy adapter-backed path
policies:
  chain:
    - audit-logger
policy:
  audit-logger:
    immutable: true
    retention_days: 365
    log_all_access: true

Runtime Identity Modes

Keeptrusts supports three identity proof modes for requests passing through the gateway. All modes populate the same user_id field in event and trace records, but they differ in how that identity is established and what identity_proof_method value is recorded in the audit trail.

HeaderSoft mode (default)

X-User-ID and X-User-Role HTTP headers are accepted as identity hints from the upstream caller. This mode is suitable for deployments where the gateway sits behind a trusted reverse proxy that strips and re-sets identity headers before they reach the gateway. Identity is not cryptographically verified by the gateway itself.

# No additional configuration required. HeaderSoft is the default.
# For explicit declaration:
runtime_identity:
  mode: header_soft

Use HeaderSoft when:

  • The gateway is deployed behind a trusted reverse proxy (nginx, Envoy, AWS ALB) that enforces authentication upstream.
  • The reverse proxy is hardened to strip incoming X-User-ID headers from external callers before re-setting them from its own session state.
  • Audit requirements do not mandate cryptographic proof of caller identity at the gateway boundary.

HeaderSoft identity is recorded as identity_proof_method: header_soft in event and trace records. Compliance reviewers can use this field to distinguish header-derived identity from gateway-verified identity.
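A minimal sketch of a HeaderSoft request, built with the Python standard library. The header names (X-User-ID, X-User-Role) come from this page; the user values are examples, and in this mode they are hints, not proof:

```python
import json
import urllib.request

# Build a gateway request carrying HeaderSoft identity hints. In a
# hardened deployment the trusted reverse proxy strips and re-sets
# these headers from its own session state before they reach the
# gateway; they are never cryptographically verified by the gateway.
req = urllib.request.Request(
    "http://localhost:41002/v1/chat/completions",
    data=json.dumps({
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Hello"}],
    }).encode(),
    headers={
        "Content-Type": "application/json",
        "X-User-ID": "alice@example.com",  # identity hint (example value)
        "X-User-Role": "analyst",          # identity hint (example value)
    },
    method="POST",
)

# urllib.request.urlopen(req) would send it; omitted here.
print(req.full_url)
```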

GatewayKey mode

Identity is derived from the gateway key used to authenticate the request. The gateway resolves the key to its associated owner, team, and scope automatically before the policy chain runs. No additional headers are required and verification is performed by the auth layer before enforcement begins.

# Automatically active when a gateway key is presented for authentication.
# To require gateway-key authentication for all requests:
runtime_identity:
  mode: gateway_key
  require_verified_identity: true

Use GatewayKey when:

  • Programmatic callers authenticate with gateway-issued API keys.
  • You want key-scoped budget enforcement and identity traceability without a full PKI setup.
  • Team or org isolation is required and key-scoped identity provides the right granularity.

GatewayKey identity is recorded as identity_proof_method: gateway_key in event and trace records.

SignedAssertion mode (future)

The caller presents a signed JWT or JWKS-verifiable token. The gateway validates the token against a configured JWKS endpoint or published verification key before persisting identity into audit records. This mode is planned for a future release.

# Future configuration shape:
runtime_identity:
  mode: signed_assertion
  jwks_url: https://auth.example.com/.well-known/jwks.json
  require_verified_identity: true

require_verified_identity

Setting require_verified_identity: true causes the gateway to reject requests that cannot be resolved to a GatewayKey or SignedAssertion identity. Requests relying on HeaderSoft hints receive a 401 response when this flag is active.

runtime_identity:
  require_verified_identity: true

This flag is safe to enable incrementally: set false first to instrument identity_proof_method coverage across your callers before moving to enforcement. Inspect the identity_proof_method field in event records to identify callers still relying on HeaderSoft before switching to enforcement mode.
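One way to instrument that coverage, sketched against a hypothetical list of exported event records. Only the identity_proof_method field and its values come from this page; the rest of the record shape is assumed for illustration:

```python
from collections import Counter

# Hypothetical exported event records; only identity_proof_method is
# taken from the docs, the surrounding shape is an assumption.
events = [
    {"user_id": "alice", "identity_proof_method": "header_soft"},
    {"user_id": "svc-ci", "identity_proof_method": "gateway_key"},
    {"user_id": "bob", "identity_proof_method": "header_soft"},
]

# Count callers by proof method; header_soft callers would start
# receiving 401s once require_verified_identity is enabled.
coverage = Counter(e["identity_proof_method"] for e in events)
unverified = coverage["header_soft"]
print(f"header_soft callers still present: {unverified}")
```

When unverified drops to zero across a representative window, flipping require_verified_identity to true should be a no-op for legitimate traffic.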

Runtime Boundaries

kt gateway run can gateway OpenAI-compatible upstream providers and a growing set of adapter-backed runtimes. Provider families fall into three categories:

  • Supported: openai, anthropic, exec:, file://, echo, anthropic:claude-agent-sdk, vercel. Loaded and served normally.
  • Supported with adapter: browser, chatkit, websocket. These require adapter_command in the provider config. A missing adapter_command is flagged by kt policy lint and logged as a warning at startup; requests to that target fail with 501.
  • Unsupported at config time: manual-input, sequence, simulated-user, go, ruby, webhook. Rejected when the config is loaded. The gateway will not start if any target uses one of these execution families.

Use these rules when mapping provider configs into Keeptrusts:

  • Use file://... or exec:... for local scripts. Bare go and ruby provider IDs are tracked in the catalog but are not executable by themselves.
  • Use adapter-backed targets such as browser, websocket, openai:chatkit:<workflow>, anthropic:claude-agent-sdk, and openai:agents when a local runner process exists.
  • Use provider: mcp with base_url for the native HTTP MCP bridge. Add adapter_command only when you intentionally want the older adapter-backed MCP path.
  • Do not use bare sequence, simulated-user, slack, or webhook evaluator providers inside kt gateway run; they require human or external-system loops that Keeptrusts does not emulate inside the gateway process.

If you include an unsupported execution family in your config, kt gateway run will exit immediately with a diagnostic that names the target and suggests an alternative. Run kt policy lint --file your-config.yaml beforehand to catch these issues without starting the gateway.

Distributed Gateway Features

When compiled with --features distributed, the gateway supports Redis-backed coordination for multi-instance deployments.

Note: The official production Docker image is built with --features distributed enabled. Redis, S3, GCS, and Qdrant backends are available at runtime without rebuilding the image.

Distributed rate limiting

Add a distributed_rate_limit block to your policy config to enforce shared rate limits across all gateway instances:

distributed_rate_limit:
  backend: valkey
  url_env: KEEPTRUSTS_REDIS_URL

  • backend should be valkey; redis is still accepted as a compatibility alias.
  • url_env names the environment variable that holds the Valkey/Redis-compatible connection URL.
  • On a limit violation, the gateway responds with 429 Too Many Requests, a Retry-After header, and X-RateLimit-Remaining: 0.
  • If Valkey is unreachable, the gateway fails open — requests are not blocked — and logs a warning.
  • Valkey keys are scoped to tenant, org, and gateway to prevent cross-organization collisions.
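A client-side retry loop that honors the 429 / Retry-After behavior above might look like this sketch. Here fetch is a placeholder for any HTTP call returning a status code and a header dict; it is not a gateway API:

```python
import time

# Retry on 429, sleeping for the Retry-After value the gateway sends.
# fetch() is a stand-in for any HTTP call returning (status, headers).
def call_with_retry(fetch, max_attempts=3):
    for _ in range(max_attempts):
        status, headers = fetch()
        if status != 429:
            return status
        delay = float(headers.get("Retry-After", 1))
        time.sleep(delay)
    return status  # still rate-limited after all attempts

# Simulated responses: one rate-limit, then success.
responses = iter([(429, {"Retry-After": "0"}), (200, {})])
print(call_with_retry(lambda: next(responses)))  # prints 200
```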

Distributed fingerprint deduplication

When distributed_rate_limit is configured and the Valkey/Redis backend is reachable, bot-fingerprint deduplication is automatically extended across all instances using SET NX PX (atomic check-and-store with TTL). Each record includes an instance_id for traceability. If the backend is unavailable, the fingerprint store falls back to an in-process HashMap (fail-open).
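The atomic check-and-store semantics can be sketched with an in-process stand-in for Redis SET NX PX. With redis-py the real call would be r.set(key, value, nx=True, px=ttl_ms); the function below only illustrates the behavior:

```python
# In-process stand-in for SET NX PX: claim a fingerprint key only if
# no unexpired claim exists, storing the claiming instance_id and an
# expiry for traceability. Not the real gateway implementation.
def try_claim(store, key, instance_id, ttl_ms, now_ms):
    claim = store.get(key)
    if claim is not None and claim[1] > now_ms:
        return False  # another instance already holds an unexpired claim
    store[key] = (instance_id, now_ms + ttl_ms)
    return True

store = {}
print(try_claim(store, "fp:abc", "gw-1", 5000, now_ms=0))     # True: first claim wins
print(try_claim(store, "fp:abc", "gw-2", 5000, now_ms=10))    # False: duplicate within TTL
print(try_claim(store, "fp:abc", "gw-2", 5000, now_ms=6000))  # True: TTL expired
```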

External embedding configuration

The embedding-detector policy supports an external backend that sends the normalized input to a configurable HTTP embedding endpoint:

policy:
  embedding-detector:
    backend: external
    endpoint: https://embed.example.com/v1/embeddings
    secret_key_ref:
      env: EMBED_API_KEY
    timeout_ms: 3000
    model: text-embedding-3-small
    threshold: 0.85
pack:
  name: proxy-run-example-11
  version: 1.0.0
  enabled: true
policies:
  chain:
    - embedding-detector

The external backend is a feature-gated production integration. It enforces the same allow/block/escalate logic as the local backend.
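To make the threshold: 0.85 setting concrete, here is a sketch of applying a similarity score to that threshold. Whether embedding-detector uses cosine similarity specifically is an assumption; only the threshold comparison mirrors the config above:

```python
import math

# Standard cosine similarity between two embedding vectors. The
# specific metric used by embedding-detector is an assumption here.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Compare a request embedding against a known-bad reference embedding
# (toy 3-dimensional vectors for illustration).
score = cosine([1.0, 0.0, 1.0], [1.0, 0.1, 0.9])
decision = "block" if score >= 0.85 else "allow"
print(decision)  # prints block
```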

Conversation Continuity

Pass the X-Conversation-ID header with requests to scope session continuity to a specific conversation thread. When present, memory recall and history entries are scoped to the conversation rather than the default daily time bucket.

curl -H "X-Conversation-ID: my-conv-123" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4","messages":[...]}' \
  http://localhost:41002/v1/chat/completions

When X-Conversation-ID is absent, the gateway falls back to the default day-bucket session derivation for backward compatibility.
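The fallback rule can be sketched as a small derivation function. The key format here is invented for illustration; only the rule itself (conversation id if present, else a daily time bucket) comes from this page:

```python
import datetime

# Derive a session key: scope to the conversation when the header is
# present, otherwise fall back to a day bucket. Key prefixes are
# illustrative, not the gateway's actual internal format.
def session_key(headers, today=None):
    conv = headers.get("X-Conversation-ID")
    if conv:
        return f"conv:{conv}"
    today = today or datetime.date.today()
    return f"day:{today.isoformat()}"

print(session_key({"X-Conversation-ID": "my-conv-123"}))      # prints conv:my-conv-123
print(session_key({}, today=datetime.date(2025, 1, 15)))      # prints day:2025-01-15
```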

Memory Recall Performance

Memory recall runs all scope queries in parallel and caches results for 60 seconds. When the memory backend is degraded, the gateway enters a fail-fast mode (returning empty recall) after 3 consecutive failures, with a 10-second cooldown before retrying. This behavior is controlled by the fail_open setting in the loads configuration section.
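A sketch of that fail-fast behavior, assuming the thresholds above (3 consecutive failures, 10-second cooldown, empty recall while open). The class shape is illustrative, not the real gateway API:

```python
import time

# Circuit-breaker-style fail-fast for memory recall: after
# max_failures consecutive backend errors, return empty results
# immediately until cooldown_s elapses, then retry the backend.
class RecallBreaker:
    def __init__(self, max_failures=3, cooldown_s=10.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def recall(self, query_backend):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown_s:
                return []                 # fail-fast: empty recall
            self.opened_at = None         # cooldown over, allow a retry
            self.failures = 0
        try:
            result = query_backend()
            self.failures = 0
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return []
```

With an injectable clock, the open/cooldown/retry cycle can be exercised deterministically in tests without sleeping.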

Benchmark Smoke Commands

For local dataplane benchmark smoke runs, execute:

cd cli
cargo bench --bench gateway_routing_bench -- --noplot
cargo bench --bench gateway_cache_bench -- --noplot

These benches measure routing-selection overhead, retry/failover scheduling overhead, and cache hit lookup cost. They are the benchmark commands referenced by the Phase 9 readiness report and the benchmark specification.

For AI systems

  • Canonical command: kt gateway run [OPTIONS].
  • Flags: --listen, --upstream, --upstream-api-key, --upstream-api-key-header, --upstream-api-key-prefix, --policy-config, --fail-mode, --max-concurrency, --api-token.
  • Environment variables: KEEPTRUSTS_API_URL, KEEPTRUSTS_GATEWAY_TOKEN, KEEPTRUSTS_UPSTREAM_URL, KEEPTRUSTS_UPSTREAM_API_KEY, KEEPTRUSTS_UPSTREAM_API_KEY_HEADER, KEEPTRUSTS_UPSTREAM_API_KEY_PREFIX, KEEPTRUSTS_GATEWAY_MAX_CONCURRENCY.
  • Config sections: providers.targets[], providers.fallback, providers.routing, providers.circuit_breaker, providers.health_check, distributed_rate_limit, runtime_identity.
  • Identity modes: header_soft (default), gateway_key, signed_assertion (future).
  • MCP runtime modes: native bridge (provider: mcp + base_url) and adapter fallback (adapter_command).
  • Headers: X-Keeptrusts-Provider, X-Keeptrusts-Model, X-Conversation-ID.
  • Related pages: Managed Mode, Multi-Provider Fallback, Streaming & SSE, WebSocket Gateway.

For engineers

  • Prerequisites: A valid policy-config.yaml (passes kt policy lint), provider credentials as environment variables or config variables, and optionally KEEPTRUSTS_API_URL/KEEPTRUSTS_GATEWAY_TOKEN for gateway reporting.
  • Validate: After starting, send a test request via curl http://localhost:41002/v1/chat/completions. Check runtime config with curl http://localhost:41002/keeptrusts/config | jq.
  • Hot-reload: curl -X POST http://localhost:41002/keeptrusts/config/reload -H "Authorization: Bearer $KEEPTRUSTS_GATEWAY_TOKEN" -H "Content-Type: application/json" -d @policy-config.yaml.
  • Troubleshooting: If the gateway exits immediately, check for unsupported provider families (run kt policy lint first). If 503s occur, inspect circuit breaker state via /keeptrusts/providers/metrics. If --upstream is set, remember it overrides config-defined providers.
  • Benchmarks: cargo bench --bench gateway_routing_bench -- --noplot for local performance validation.

For leaders

  • kt gateway run is the runtime enforcement surface — this is where policies are applied to live AI traffic.
  • Fail-mode selection (block vs allow) is a business decision: availability vs. enforcement strictness.
  • Runtime identity modes (HeaderSoft, GatewayKey) determine audit trail quality and tenant isolation guarantees — GatewayKey provides cryptographic proof of caller identity.
  • Distributed features (Redis-backed rate limiting, fingerprint dedup) require infrastructure investment but enable multi-instance scaling.
  • MCP governance extends tool-use controls to agent workflows — tool calls are subject to the same policy chain as LLM requests.
