kt gateway run

Start the Keeptrusts policy enforcement gateway. The gateway sits between your application and AI providers, enforcing policies on every request and response.

Use this page when

  • You need to start the Keeptrusts gateway locally with kt gateway run.
  • You are configuring provider targets, fallback, routing strategies, or runtime identity modes.
  • You need the exact flags, environment variables, or config patterns for the gateway runtime.

The recommended pattern is to define providers in policy-config.yaml and use kt gateway run to execute that contract.

For the broader gateway lifecycle surface (create, list, inspect, reload, diff, reconcile, service install/start/stop/status, and managed mode), see CLI Command Groups.

Primary audience

  • Primary: AI Agents, Technical Engineers
  • Secondary: Technical Leaders

Usage

kt gateway run [OPTIONS]

Options

  • --listen <host:port> (default: 0.0.0.0:41002). Address to listen on.
  • --upstream <url> (env: KEEPTRUSTS_UPSTREAM_URL; default: https://api.keeptrusts.com in hosted mode). Debug-only upstream override for short-lived experiments; prefer config-defined providers.
  • --upstream-api-key <key> (env: KEEPTRUSTS_UPSTREAM_API_KEY). Debug-only credential override; prefer secret_key_ref.env or secret_key_ref.store in config.
  • --upstream-api-key-header <header> (env: KEEPTRUSTS_UPSTREAM_API_KEY_HEADER; default: Authorization). Header name for upstream credentials.
  • --upstream-api-key-prefix <prefix> (env: KEEPTRUSTS_UPSTREAM_API_KEY_PREFIX; default: "Bearer " with a trailing space). Prefix used when constructing the upstream auth header.
  • --policy-config <path>. Declarative config file to load; repeat the flag to layer multiple files.
  • --fail-mode <mode> (default: block). Behavior on policy error: block or allow.
  • --max-concurrency <n> (env: KEEPTRUSTS_GATEWAY_MAX_CONCURRENCY; default: 16). Maximum concurrent upstream requests before adaptive backoff.
  • --api-token <token> (env: KEEPTRUSTS_GATEWAY_TOKEN). Runtime token used for gateway reporting and authenticated inspection.
  • KEEPTRUSTS_API_URL (environment variable only, no flag). Keeptrusts API for gateway reporting, config resolution, and admin endpoints.

Examples

Basic — Config-defined provider

export OPENAI_API_KEY="sk-your-openai-key"

kt gateway run \
  --listen 0.0.0.0:41002 \
  --policy-config policy-config.yaml

In the recommended workflow, policy-config.yaml declares the provider target under providers.targets[], and the gateway reads the credential through secret_key_ref.env or secret_key_ref.store.

With Event Reporting

export KEEPTRUSTS_API_URL="https://api.keeptrusts.com"
export KEEPTRUSTS_GATEWAY_TOKEN="kt_gw_your_gateway_token"
export OPENAI_API_KEY="sk-your-openai-key"

kt gateway run \
  --listen 0.0.0.0:41002 \
  --policy-config policy-config.yaml

The current kt gateway run command reads KEEPTRUSTS_API_URL from the environment. It exposes --api-token for the runtime token, but it does not expose dedicated --api-url or legacy --api-key flags.

Treat --upstream and --upstream-api-key as break-glass debugging flags only. The standard workflow is to keep provider definitions in policy-config.yaml so the same document works for local, managed, and Git-backed deployments.

Fail-Open Mode

Use --fail-mode allow to let requests through even when policy evaluation fails (e.g., classifier unavailable):

export OPENAI_API_KEY="sk-your-openai-key"

kt gateway run \
  --listen 0.0.0.0:41002 \
  --policy-config policy-config.yaml \
  --fail-mode allow

Fail-open mode means policy failures will not block requests. Use only when availability is more important than enforcement.

Multi-Provider with Fallback

When your config defines multiple provider targets, the gateway automatically handles failover:

pack:
  name: multi-provider
  version: 0.1.0
  enabled: true
policies:
  chain:
    - prompt-injection
    - pii-detector
providers:
  targets:
    - id: openai-primary
      provider: openai
      model: gpt-4o
      base_url: https://api.openai.com
      secret_key_ref:
        env: OPENAI_API_KEY
    - id: anthropic-fallback
      provider: anthropic
      model: claude-sonnet-4-20250514
      base_url: https://api.anthropic.com
      secret_key_ref:
        env: ANTHROPIC_API_KEY
      provider_type: anthropic
      format: anthropic
  fallback:
    triggers:
      - rate_limit
      - server_error
      - timeout
    max_fallback_attempts: 3
  routing:
    strategy: ordered

Provider targets can also use secret_key_ref.store to reference a managed config variable instead of an environment variable:

pack:
  name: proxy-run-providers-2
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: openai-primary
      provider: openai
      model: gpt-4o
      base_url: https://api.openai.com
      secret_key_ref:
        store: OPENAI_API_KEY
policies:
  chain:
    - audit-logger
policy:
  audit-logger:
    immutable: true
    retention_days: 365
    log_all_access: true

The gateway resolves secret_key_ref values through the machine config-variable endpoint at startup. The gateway service token must have config_vars:resolve permission.

Nested model definitions

A single provider target can declare multiple models with per-model pricing:

pack:
  name: proxy-run-providers-3
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: openai-primary
      provider: openai
      base_url: https://api.openai.com
      secret_key_ref:
        env: OPENAI_API_KEY
policies:
  chain:
    - audit-logger
policy:
  audit-logger:
    immutable: true
    retention_days: 365
    log_all_access: true

export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."

kt gateway run \
  --listen 0.0.0.0:41002 \
  --policy-config policy-config.yaml

Sending Requests Through the Gateway

The gateway exposes an OpenAI-compatible API. Point your application at the gateway instead of the provider:

# Direct to gateway
curl http://localhost:41002/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"messages": [{"role": "user", "content": "Explain quantum computing"}]
}'

Provider and model pinning

Override the default routing strategy by sending request headers:

# Pin to a specific provider target
curl http://localhost:41002/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-Keeptrusts-Provider: openai-primary" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'

# Override the model within the selected provider
curl http://localhost:41002/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-Keeptrusts-Provider: openai-primary" \
  -H "X-Keeptrusts-Model: gpt-4o-mini" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'

Unknown or unauthorized target/model values return 400 Bad Request. Pinning does not bypass policy evaluation.

Python:

from openai import OpenAI

# Point the client at the gateway
client = OpenAI(
    base_url="http://localhost:41002/v1",
    api_key="sk-your-openai-key",  # or gateway handles auth
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
)
print(response.choices[0].message.content)

TypeScript:

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:41002/v1",
  apiKey: "sk-your-openai-key",
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Explain quantum computing" }],
});
console.log(response.choices[0].message.content);

Runtime Diagnostics

While the gateway is running, inspect its state:

# View running config (requires a scoped gateway token)
curl -H "Authorization: Bearer $KEEPTRUSTS_GATEWAY_TOKEN" \
  http://localhost:41002/keeptrusts/config | jq

# View provider metrics (latency, errors, request counts)
curl http://localhost:41002/keeptrusts/providers/metrics | jq

# Hot-reload configuration without restart (requires a scoped gateway token)
curl -X POST http://localhost:41002/keeptrusts/config/reload \
  -H "Authorization: Bearer $KEEPTRUSTS_GATEWAY_TOKEN" \
  -H "Content-Type: application/json" \
  -d @policy-config.yaml

MCP Runtime Modes

Keeptrusts now treats HTTP MCP as a first-class runtime rather than an adapter-only escape hatch.

  • Use provider: mcp with base_url for the native MCP bridge.
  • Add adapter_command only when you intentionally want the legacy adapter-backed fallback path.
  • Tool governance is enforced at bridge time, not only after request translation, so MCP tool invocations are subject to the same allow, deny, schema, and semantic validation controls as the rest of the gateway runtime.

Operational guidance:

  • Prefer the native bridge for long-lived operator deployments because it is the path surfaced in console telemetry and runtime diagnostics.
  • Keep the adapter fallback for compatibility testing or when you need to wrap an existing local MCP runner binary.
  • Treat native MCP and fallback MCP as distinct runtime modes during incident triage; they exercise different control surfaces even when they target the same upstream tool server.

Native bridge request flow:

  • The gateway normalizes tool requests into JSON-RPC tools/call frames and forwards them to base_url + /mcp unless path_template overrides the bridge path.
  • mcp.protocol_version and mcp.session_id are emitted into the request metadata before dispatch so upstream MCP services can correlate the bridge session.
  • Tool allowlists, deny rules, schema checks, and tool-security analysis run before the HTTP bridge call. A blocked tool never leaves the gateway process.
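As a rough illustration, a normalized tools/call frame could look like the sketch below. The JSON-RPC envelope (jsonrpc, id, method, params) is standard MCP; placing the mcp.protocol_version and mcp.session_id keys under a _meta object is an assumption for illustration, not a documented wire format.

```python
import json

# Hypothetical sketch of the JSON-RPC frame the bridge could emit.
# The "_meta" keys mirror the mcp.protocol_version / mcp.session_id
# metadata described above; exact field placement is an assumption.
frame = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_docs",                 # tool name (example value)
        "arguments": {"query": "rate limits"}, # tool arguments (example)
        "_meta": {
            "mcp.protocol_version": "2025-03-26",
            "mcp.session_id": "sess-123",
        },
    },
}

print(json.dumps(frame, indent=2))
```

A frame like this would be POSTed to base_url + /mcp (or the path_template override) only after the allow/deny and schema checks pass.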

Example native MCP target:

pack:
  name: proxy-run-providers-4
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: docs-mcp
      provider: mcp
      base_url: https://mcp.example.com
policies:
  chain:
    - audit-logger
policy:
  audit-logger:
    immutable: true
    retention_days: 365
    log_all_access: true

Example adapter-backed fallback target:

pack:
  name: proxy-run-providers-5
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: local-mcp-adapter
      provider: mcp
      # adapter_command: <local MCP runner command>  — selects the legacy adapter-backed path
policies:
  chain:
    - audit-logger
policy:
  audit-logger:
    immutable: true
    retention_days: 365
    log_all_access: true

Runtime Identity Modes

Keeptrusts supports three identity proof modes for requests passing through the gateway. All modes populate the same user_id field in event and trace records, but they differ in how that identity is established and what identity_proof_method value is recorded in the audit trail.

HeaderSoft mode (default)

X-User-ID and X-User-Role HTTP headers are accepted as identity hints from the upstream caller. This mode is suitable for deployments where the gateway sits behind a trusted reverse proxy that strips and re-sets identity headers before they reach the gateway. Identity is not cryptographically verified by the gateway itself.

# No additional configuration required. HeaderSoft is the default.
# For explicit declaration:
runtime_identity:
  mode: header_soft

Use HeaderSoft when:

  • The gateway is deployed behind a trusted reverse proxy (nginx, Envoy, AWS ALB) that enforces authentication upstream.
  • The reverse proxy is hardened to strip incoming X-User-ID headers from external callers before re-setting them from its own session state.
  • Audit requirements do not mandate cryptographic proof of caller identity at the gateway boundary.

HeaderSoft identity is recorded as identity_proof_method: header_soft in event and trace records. Compliance reviewers can use this field to distinguish header-derived identity from gateway-verified identity.
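A minimal sketch of a HeaderSoft request, built with the Python standard library. The header names (X-User-ID, X-User-Role) come from this page; the user values are examples, and in this mode they are hints, not proof:

```python
import json
import urllib.request

# Build a gateway request carrying HeaderSoft identity hints. In a
# hardened deployment the trusted reverse proxy strips and re-sets
# these headers from its own session state before they reach the
# gateway; they are never cryptographically verified by the gateway.
req = urllib.request.Request(
    "http://localhost:41002/v1/chat/completions",
    data=json.dumps({
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Hello"}],
    }).encode(),
    headers={
        "Content-Type": "application/json",
        "X-User-ID": "alice@example.com",  # identity hint (example value)
        "X-User-Role": "analyst",          # identity hint (example value)
    },
    method="POST",
)

# urllib.request.urlopen(req) would send it; omitted here.
print(req.full_url)
```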

GatewayKey mode

Identity is derived from the gateway key used to authenticate the request. The gateway resolves the key to its associated owner, team, and scope automatically before the policy chain runs. No additional headers are required and verification is performed by the auth layer before enforcement begins.

# Automatically active when a gateway key is presented for authentication.
# To require gateway-key authentication for all requests:
runtime_identity:
  mode: gateway_key
  require_verified_identity: true

Use GatewayKey when:

  • Programmatic callers authenticate with gateway-issued API keys.
  • You want key-scoped budget enforcement and identity traceability without a full PKI setup.
  • Team or org isolation is required and key-scoped identity provides the right granularity.

GatewayKey identity is recorded as identity_proof_method: gateway_key in event and trace records.

SignedAssertion mode (future)

The caller presents a signed JWT or JWKS-verifiable token. The gateway validates the token against a configured JWKS endpoint or published verification key before persisting identity into audit records. This mode is planned for a future release.

# Future configuration shape:
runtime_identity:
  mode: signed_assertion
  jwks_url: https://auth.example.com/.well-known/jwks.json
  require_verified_identity: true

require_verified_identity

Setting require_verified_identity: true causes the gateway to reject requests that cannot be resolved to a GatewayKey or SignedAssertion identity. Requests relying on HeaderSoft hints receive a 401 response when this flag is active.

runtime_identity:
  require_verified_identity: true

This flag is safe to enable incrementally: set false first to instrument identity_proof_method coverage across your callers before moving to enforcement. Inspect the identity_proof_method field in event records to identify callers still relying on HeaderSoft before switching to enforcement mode.
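One way to instrument that coverage, sketched against a hypothetical list of exported event records. Only the identity_proof_method field and its values come from this page; the rest of the record shape is assumed for illustration:

```python
from collections import Counter

# Hypothetical exported event records; only identity_proof_method is
# taken from the docs, the surrounding shape is an assumption.
events = [
    {"user_id": "alice", "identity_proof_method": "header_soft"},
    {"user_id": "svc-ci", "identity_proof_method": "gateway_key"},
    {"user_id": "bob", "identity_proof_method": "header_soft"},
]

# Count callers by proof method; header_soft callers would start
# receiving 401s once require_verified_identity is enabled.
coverage = Counter(e["identity_proof_method"] for e in events)
unverified = coverage["header_soft"]
print(f"header_soft callers still present: {unverified}")
```

When unverified drops to zero across a representative window, flipping require_verified_identity to true should be a no-op for legitimate traffic.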

Runtime Boundaries

kt gateway run can gateway OpenAI-compatible upstream providers and a growing set of adapter-backed runtimes. Provider families fall into three categories:

  • Supported: openai, anthropic, exec:, file://, echo, anthropic:claude-agent-sdk, vercel. Loaded and served normally.
  • Supported with adapter: browser, chatkit, websocket. These require adapter_command in the provider config. A missing adapter_command is flagged by kt policy lint and logged as a warning at startup; requests to that target fail with 501.
  • Unsupported at config time: manual-input, sequence, simulated-user, go, ruby, webhook. Rejected when the config is loaded. The gateway will not start if any target uses one of these execution families.

Use these rules when mapping provider configs into Keeptrusts:

  • Use file://... or exec:... for local scripts. Bare go and ruby provider IDs are tracked in the catalog but are not executable by themselves.
  • Use adapter-backed targets such as browser, websocket, openai:chatkit:<workflow>, anthropic:claude-agent-sdk, and openai:agents when a local runner process exists.
  • Use provider: mcp with base_url for the native HTTP MCP bridge. Add adapter_command only when you intentionally want the older adapter-backed MCP path.
  • Do not use bare sequence, simulated-user, slack, or webhook evaluator providers inside kt gateway run; they require human or external-system loops that Keeptrusts does not emulate inside the gateway process.

If you include an unsupported execution family in your config, kt gateway run will exit immediately with a diagnostic that names the target and suggests an alternative. Run kt policy lint --file your-config.yaml beforehand to catch these issues without starting the gateway.

Distributed Gateway Features

When compiled with --features distributed, the gateway supports Redis-backed coordination for multi-instance deployments.

Note: The official production Docker image is built with --features distributed enabled. Redis, S3, GCS, and Qdrant backends are available at runtime without rebuilding the image.

Distributed rate limiting

Add a distributed_rate_limit block to your policy config to enforce shared rate limits across all gateway instances:

distributed_rate_limit:
  backend: valkey
  url_env: KEEPTRUSTS_REDIS_URL

  • backend should be valkey; redis is still accepted as a compatibility alias.
  • url_env names the environment variable that holds the Valkey/Redis-compatible connection URL.
  • On a limit violation, the gateway responds with 429 Too Many Requests, a Retry-After header, and X-RateLimit-Remaining: 0.
  • If Valkey is unreachable, the gateway fails open — requests are not blocked — and logs a warning.
  • Valkey keys are scoped to tenant, org, and gateway to prevent cross-organization collisions.
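A client-side retry loop that honors the 429 / Retry-After behavior above might look like this sketch. Here fetch is a placeholder for any HTTP call returning a status code and a header dict; it is not a gateway API:

```python
import time

# Retry on 429, sleeping for the Retry-After value the gateway sends.
# fetch() is a stand-in for any HTTP call returning (status, headers).
def call_with_retry(fetch, max_attempts=3):
    for _ in range(max_attempts):
        status, headers = fetch()
        if status != 429:
            return status
        delay = float(headers.get("Retry-After", 1))
        time.sleep(delay)
    return status  # still rate-limited after all attempts

# Simulated responses: one rate-limit, then success.
responses = iter([(429, {"Retry-After": "0"}), (200, {})])
print(call_with_retry(lambda: next(responses)))  # prints 200
```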

Distributed fingerprint deduplication

When distributed_rate_limit is configured and the Valkey/Redis backend is reachable, bot-fingerprint deduplication is automatically extended across all instances using SET NX PX (atomic check-and-store with TTL). Each record includes an instance_id for traceability. If the backend is unavailable, the fingerprint store falls back to an in-process HashMap (fail-open).
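The atomic check-and-store semantics can be sketched with an in-process stand-in for Redis SET NX PX. With redis-py the real call would be r.set(key, value, nx=True, px=ttl_ms); the function below only illustrates the behavior:

```python
# In-process stand-in for SET NX PX: claim a fingerprint key only if
# no unexpired claim exists, storing the claiming instance_id and an
# expiry for traceability. Not the real gateway implementation.
def try_claim(store, key, instance_id, ttl_ms, now_ms):
    claim = store.get(key)
    if claim is not None and claim[1] > now_ms:
        return False  # another instance already holds an unexpired claim
    store[key] = (instance_id, now_ms + ttl_ms)
    return True

store = {}
print(try_claim(store, "fp:abc", "gw-1", 5000, now_ms=0))     # True: first claim wins
print(try_claim(store, "fp:abc", "gw-2", 5000, now_ms=10))    # False: duplicate within TTL
print(try_claim(store, "fp:abc", "gw-2", 5000, now_ms=6000))  # True: TTL expired
```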

External embedding configuration

The embedding-detector policy supports an external backend that sends the normalized input to a configurable HTTP embedding endpoint:

policy:
  embedding-detector:
    backend: external
    endpoint: https://embed.example.com/v1/embeddings
    secret_key_ref:
      env: EMBED_API_KEY
    timeout_ms: 3000
    model: text-embedding-3-small
    threshold: 0.85
pack:
  name: proxy-run-example-11
  version: 1.0.0
  enabled: true
policies:
  chain:
    - embedding-detector

The external backend is a feature-gated production integration. It enforces the same allow/block/escalate logic as the local backend.
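To make the threshold: 0.85 setting concrete, here is a sketch of applying a similarity score to that threshold. Whether embedding-detector uses cosine similarity specifically is an assumption; only the threshold comparison mirrors the config above:

```python
import math

# Standard cosine similarity between two embedding vectors. The
# specific metric used by embedding-detector is an assumption here.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Compare a request embedding against a known-bad reference embedding
# (toy 3-dimensional vectors for illustration).
score = cosine([1.0, 0.0, 1.0], [1.0, 0.1, 0.9])
decision = "block" if score >= 0.85 else "allow"
print(decision)  # prints block
```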

Conversation Continuity

Pass the X-Conversation-ID header with requests to scope session continuity to a specific conversation thread. When present, memory recall and history entries are scoped to the conversation rather than the default daily time bucket.

curl -H "X-Conversation-ID: my-conv-123" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4","messages":[...]}' \
  http://localhost:41002/v1/chat/completions

When X-Conversation-ID is absent, the gateway falls back to the default day-bucket session derivation for backward compatibility.
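The fallback rule can be sketched as a small derivation function. The key format here is invented for illustration; only the rule itself (conversation id if present, else a daily time bucket) comes from this page:

```python
import datetime

# Derive a session key: scope to the conversation when the header is
# present, otherwise fall back to a day bucket. Key prefixes are
# illustrative, not the gateway's actual internal format.
def session_key(headers, today=None):
    conv = headers.get("X-Conversation-ID")
    if conv:
        return f"conv:{conv}"
    today = today or datetime.date.today()
    return f"day:{today.isoformat()}"

print(session_key({"X-Conversation-ID": "my-conv-123"}))      # prints conv:my-conv-123
print(session_key({}, today=datetime.date(2025, 1, 15)))      # prints day:2025-01-15
```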

Memory Recall Performance

Memory recall runs all scope queries in parallel and caches results for 60 seconds. When the memory backend is degraded, the gateway enters a fail-fast mode (returning empty recall) after 3 consecutive failures, with a 10-second cooldown before retrying. This behavior is controlled by the fail_open setting in the loads configuration section.
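A sketch of that fail-fast behavior, assuming the thresholds above (3 consecutive failures, 10-second cooldown, empty recall while open). The class shape is illustrative, not the real gateway API:

```python
import time

# Circuit-breaker-style fail-fast for memory recall: after
# max_failures consecutive backend errors, return empty results
# immediately until cooldown_s elapses, then retry the backend.
class RecallBreaker:
    def __init__(self, max_failures=3, cooldown_s=10.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def recall(self, query_backend):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown_s:
                return []                 # fail-fast: empty recall
            self.opened_at = None         # cooldown over, allow a retry
            self.failures = 0
        try:
            result = query_backend()
            self.failures = 0
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return []
```

With an injectable clock, the open/cooldown/retry cycle can be exercised deterministically in tests without sleeping.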

Benchmark Smoke Commands

For local dataplane benchmark smoke runs, execute:

cd cli
cargo bench --bench gateway_routing_bench -- --noplot
cargo bench --bench gateway_cache_bench -- --noplot

These benches measure routing-selection overhead, retry/failover scheduling overhead, and cache hit lookup cost. They are the benchmark commands referenced by the Phase 9 readiness report and the benchmark specification.

For AI systems

  • Canonical command: kt gateway run [OPTIONS].
  • Flags: --listen, --upstream, --upstream-api-key, --upstream-api-key-header, --upstream-api-key-prefix, --policy-config, --fail-mode, --max-concurrency, --api-token.
  • Environment variables: KEEPTRUSTS_API_URL, KEEPTRUSTS_GATEWAY_TOKEN, KEEPTRUSTS_UPSTREAM_URL, KEEPTRUSTS_UPSTREAM_API_KEY, KEEPTRUSTS_UPSTREAM_API_KEY_HEADER, KEEPTRUSTS_UPSTREAM_API_KEY_PREFIX, KEEPTRUSTS_GATEWAY_MAX_CONCURRENCY.
  • Config sections: providers.targets[], providers.fallback, providers.routing, providers.circuit_breaker, providers.health_check, distributed_rate_limit, runtime_identity.
  • Identity modes: header_soft (default), gateway_key, signed_assertion (future).
  • MCP runtime modes: native bridge (provider: mcp + base_url) and adapter fallback (adapter_command).
  • Headers: X-Keeptrusts-Provider, X-Keeptrusts-Model, X-Conversation-ID.
  • Related pages: Managed Mode, Multi-Provider Fallback, Streaming & SSE, WebSocket Gateway.

For engineers

  • Prerequisites: A valid policy-config.yaml (passes kt policy lint), provider credentials as environment variables or config variables, and optionally KEEPTRUSTS_API_URL/KEEPTRUSTS_GATEWAY_TOKEN for gateway reporting.
  • Validate: After starting, send a test request via curl http://localhost:41002/v1/chat/completions. Check runtime config with curl http://localhost:41002/keeptrusts/config | jq.
  • Hot-reload: curl -X POST http://localhost:41002/keeptrusts/config/reload -H "Authorization: Bearer $KEEPTRUSTS_GATEWAY_TOKEN" -H "Content-Type: application/json" -d @policy-config.yaml.
  • Troubleshooting: If the gateway exits immediately, check for unsupported provider families (run kt policy lint first). If 503s occur, inspect circuit breaker state via /keeptrusts/providers/metrics. If --upstream is set, remember it overrides config-defined providers.
  • Benchmarks: cargo bench --bench gateway_routing_bench -- --noplot for local performance validation.

For leaders

  • kt gateway run is the runtime enforcement surface — this is where policies are applied to live AI traffic.
  • Fail-mode selection (block vs allow) is a business decision: availability vs. enforcement strictness.
  • Runtime identity modes (HeaderSoft, GatewayKey) determine audit trail quality and tenant isolation guarantees — GatewayKey provides cryptographic proof of caller identity.
  • Distributed features (Redis-backed rate limiting, fingerprint dedup) require infrastructure investment but enable multi-instance scaling.
  • MCP governance extends tool-use controls to agent workflows — tool calls are subject to the same policy chain as LLM requests.
