Providers Configuration
The providers: section of your policy config declares upstream LLM provider targets, routing strategy, fallback behavior, data handling policies, and advanced features like circuit breakers, A/B testing, and traffic mirroring.
Use this page when
- You are configuring upstream LLM provider targets, routing strategies, or fallback behavior in policy-config.yaml.
- You need to set up multi-provider routing, circuit breakers, A/B testing, or traffic mirroring.
- You are declaring data handling policies, token pricing, or health probes for your provider fleet.
Primary audience
- Primary: AI Agents, Technical Engineers
- Secondary: Technical Leaders
Quick reference
pack:
name: config-providers-providers-1
version: 1.0.0
enabled: true
providers:
targets:
- id: primary
provider: openai
model: gpt-4o
secret_key_ref:
env: OPENAI_API_KEY
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
Provider targets
Each target describes a single upstream endpoint. The gateway evaluates targets in order when using the ordered routing strategy.
Minimal target
pack:
name: config-providers-providers-2
version: 1.0.0
enabled: true
providers:
targets:
- id: openai-prod
provider: openai
model: gpt-4o
secret_key_ref:
env: OPENAI_API_KEY
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
Only id and provider are required. When base_url and secret_key_ref are omitted, the gateway infers defaults from the provider alias (see Supported Providers).
Full target reference
pack:
name: config-providers-providers-3
version: 1.0.0
enabled: true
providers:
targets:
- id: openai-prod
provider: openai
model: gpt-4o
base_url: https://api.openai.com
secret_key_ref:
env: OPENAI_API_KEY
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
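The reference example above shows only the core fields. Other target-level fields implied elsewhere on this page include `weight` (used by the weighted_round_robin strategy), `region` (matched by `routing.require_region`), and `quantizations` (matched by `routing.require_quantizations`). A sketch — `timeout_ms` is an assumed field name, and the exact schema should be verified against the Declarative Config Reference:

```yaml
providers:
  targets:
    - id: openai-prod
      provider: openai
      model: gpt-4o
      base_url: https://api.openai.com
      secret_key_ref:
        env: OPENAI_API_KEY
      weight: 2              # consulted by weighted_round_robin
      region: us             # matched by routing.require_region
      quantizations: [fp16]  # matched by routing.require_quantizations
      timeout_ms: 30000      # assumed field name for per-target request timeout
```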
Data policies
Declare the data handling guarantees your provider contract includes. The data-routing-policy policy uses these declarations to filter eligible targets.
pack:
name: config-providers-providers-4
version: 1.0.0
enabled: true
providers:
targets:
- id: openai-zdr
provider: openai
model: gpt-4o
secret_key_ref:
env: OPENAI_API_KEY
- id: anthropic-standard
provider: anthropic
model: claude-sonnet-4-20250514
secret_key_ref:
env: ANTHROPIC_API_KEY
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
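The example above omits the declarations themselves. A sketch of what a target-level block might look like — `data_policy.zero_data_retention` is named elsewhere on this page, but `no_training` and the overall shape are assumptions to verify against the Declarative Config Reference:

```yaml
providers:
  targets:
    - id: openai-zdr
      provider: openai
      model: gpt-4o
      secret_key_ref:
        env: OPENAI_API_KEY
      # Assumed shape -- verify field names against the reference.
      data_policy:
        zero_data_retention: true  # contract guarantees prompts/responses are not retained
        no_training: true          # provider will not train on your data
```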
Token pricing
Declare pricing for spend tracking, cost assertions, and auto provider scoring.
pack:
name: config-providers-providers-5
version: 1.0.0
enabled: true
providers:
targets:
- id: openai-4o
provider: openai
model: gpt-4o
secret_key_ref:
env: OPENAI_API_KEY
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
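The example above omits the pricing block. The field names `pricing.input_price_per_million` and `pricing.output_price_per_million` are named elsewhere on this page; the prices below are placeholders, not current list prices:

```yaml
providers:
  targets:
    - id: openai-4o
      provider: openai
      model: gpt-4o
      secret_key_ref:
        env: OPENAI_API_KEY
      pricing:
        input_price_per_million: 2.50    # USD per 1M prompt tokens (placeholder)
        output_price_per_million: 10.00  # USD per 1M completion tokens (placeholder)
```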
Nested models
A single target can serve multiple models. Each model can have its own aliases, pricing, and escalation routing.
pack:
name: config-providers-providers-6
version: 1.0.0
enabled: true
providers:
targets:
- id: openai-prod
provider: openai
base_url: https://api.openai.com
secret_key_ref:
env: OPENAI_API_KEY
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
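The example above does not show the nested-model shape itself. A sketch of what a multi-model target might look like — the `models` list and `aliases` field names are assumptions; verify against the Declarative Config Reference:

```yaml
providers:
  targets:
    - id: openai-prod
      provider: openai
      base_url: https://api.openai.com
      secret_key_ref:
        env: OPENAI_API_KEY
      # Assumed nested-model shape -- verify field names against the reference.
      models:
        - model: gpt-4o-mini
          aliases: [mini, cheap]
        - model: gpt-4o
          aliases: [smart]
```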
Escalation routing override
Route escalations to specific teams or users per provider or model.
pack:
name: config-providers-providers-7
version: 1.0.0
enabled: true
providers:
targets:
- id: healthcare-openai
provider: openai
model: gpt-4o
secret_key_ref:
env: OPENAI_API_KEY
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
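The example above omits the override itself. A sketch of what a per-target escalation override might look like — the `escalation` key and its field names are assumptions, shown only to illustrate the idea of routing escalations to a specific team:

```yaml
providers:
  targets:
    - id: healthcare-openai
      provider: openai
      model: gpt-4o
      secret_key_ref:
        env: OPENAI_API_KEY
      # Assumed shape -- verify field names against the reference.
      escalation:
        team: healthcare-compliance
        users:
          - alice@example.com
```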
Health probes
Active health probes run in the background and mark providers unhealthy on consecutive failures.
pack:
name: config-providers-providers-8
version: 1.0.0
enabled: true
providers:
targets:
- id: openai-prod
provider: openai
model: gpt-4o
secret_key_ref:
env: OPENAI_API_KEY
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
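The example above omits the probe configuration. The `health_probe` key is named elsewhere on this page; the sub-fields below are assumptions, shown here at the `providers` level (it may also be scoped per target):

```yaml
providers:
  health_probe:
    enabled: true
    interval_seconds: 30     # how often to probe each target (assumed field name)
    timeout_ms: 2000
    unhealthy_threshold: 3   # consecutive failures before marking unhealthy
    healthy_threshold: 1     # consecutive successes before reinstatement
```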
OAuth2 authentication
Some providers (Databricks, Snowflake Cortex, custom endpoints) require OAuth2 bearer tokens instead of static API keys.
pack:
name: config-providers-providers-9
version: 1.0.0
enabled: true
providers:
targets:
- id: databricks-prod
provider: databricks
model: databricks-meta-llama-3-70b-instruct
base_url: https://my-workspace.databricks.net/serving-endpoints
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
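The example above omits the OAuth2 block. The `oauth2` key is named elsewhere on this page; the sub-fields below follow a standard client-credentials flow but are assumptions to verify against the Declarative Config Reference:

```yaml
providers:
  targets:
    - id: databricks-prod
      provider: databricks
      model: databricks-meta-llama-3-70b-instruct
      base_url: https://my-workspace.databricks.net/serving-endpoints
      # Assumed client-credentials shape -- verify field names against the reference.
      oauth2:
        token_url: https://my-workspace.databricks.net/oidc/v1/token
        client_id_ref:
          env: DATABRICKS_CLIENT_ID
        client_secret_ref:
          env: DATABRICKS_CLIENT_SECRET
        scopes: [all-apis]
```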
Cloud-specific fields
- Azure OpenAI
- AWS Bedrock
- Google Vertex AI
- Anthropic
pack:
name: config-providers-providers-10
version: 1.0.0
enabled: true
providers:
targets:
- id: azure-gpt4
provider: azure-openai
model: gpt-4o
base_url: https://my-resource.openai.azure.com
secret_key_ref:
env: AZURE_OPENAI_KEY
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
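The Azure example above shows only the generic fields. Azure OpenAI itself routes by deployment name and api-version; the `azure` key and its field names below are assumptions:

```yaml
providers:
  targets:
    - id: azure-gpt4
      provider: azure-openai
      model: gpt-4o
      base_url: https://my-resource.openai.azure.com
      secret_key_ref:
        env: AZURE_OPENAI_KEY
      # Assumed shape -- verify field names against the reference.
      azure:
        deployment: my-gpt4o-deployment
        api_version: "2024-06-01"
```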
pack:
name: config-providers-providers-11
version: 1.0.0
enabled: true
providers:
targets:
- id: bedrock-claude
provider: aws-bedrock
model: anthropic.claude-3-sonnet-20240229-v1:0
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
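The Bedrock example above omits region selection. Bedrock authenticates with SigV4 (IAM role or environment credentials) and is region-scoped; the `aws` key and its field name below are assumptions:

```yaml
providers:
  targets:
    - id: bedrock-claude
      provider: aws-bedrock
      model: anthropic.claude-3-sonnet-20240229-v1:0
      # Assumed shape -- verify field names against the reference.
      aws:
        region: us-east-1
```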
pack:
name: config-providers-providers-12
version: 1.0.0
enabled: true
providers:
targets:
- id: vertex-gemini
provider: google-vertex
model: gemini-1.5-pro
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
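The Vertex example above shows only the generic fields. Vertex AI itself requires a GCP project and location; the `vertex` key and its field names below are assumptions:

```yaml
providers:
  targets:
    - id: vertex-gemini
      provider: google-vertex
      model: gemini-1.5-pro
      # Assumed shape -- verify field names against the reference.
      vertex:
        project: my-gcp-project
        location: us-central1
```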
pack:
name: config-providers-providers-13
version: 1.0.0
enabled: true
providers:
targets:
- id: anthropic-claude
provider: anthropic
model: claude-sonnet-4-20250514
base_url: https://api.anthropic.com
secret_key_ref:
env: ANTHROPIC_API_KEY
provider_type: anthropic
format: anthropic
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
Routing strategies
The routing.strategy field controls how the gateway selects a target for each request.
providers:
routing:
strategy: ordered
targets:
- id: openai-primary
provider: openai
model: gpt-4o-mini
secret_key_ref:
env: OPENAI_API_KEY
| Strategy | Description |
|---|---|
| ordered | Try targets in declaration order; first healthy target wins |
| round_robin | Cycle through targets evenly |
| weighted_round_robin | Cycle weighted by each target's weight field |
| random | Random selection |
| simple_shuffle | Shuffle then iterate |
| lowest_latency | Select target with lowest observed P50/P90/P99 latency |
| highest_throughput | Select target with highest observed throughput |
| least_connections | Select target with fewest in-flight requests |
| least_busy | Select target with lowest current load |
| usage_based | Select target with lowest cumulative usage |
| semantic | Match request embedding against target descriptions |
Latency / throughput preferences
providers:
routing:
strategy: lowest_latency
window_seconds: 300
min_sample_count: 10
exploration_ratio: 0.1
preferred_max_latency:
value: 500
percentile: p90
preferred_min_throughput:
value: 50
percentile: p50
targets:
- id: openai-primary
provider: openai
model: gpt-4o-mini
secret_key_ref:
env: OPENAI_API_KEY
Cost ceiling
providers:
routing:
strategy: ordered
max_price:
prompt: 0.01
completion: 0.05
request: 0.1
targets:
- id: openai-primary
provider: openai
model: gpt-4o-mini
secret_key_ref:
env: OPENAI_API_KEY
Region and quantization filtering
providers:
routing:
strategy: ordered
require_region: eu
require_quantizations:
- fp16
- int8
targets:
- id: openai-primary
provider: openai
model: gpt-4o-mini
secret_key_ref:
env: OPENAI_API_KEY
Target filtering
providers:
routing:
strategy: round_robin
only:
- openai-prod
- anthropic-prod
ignore:
- openai-dev
targets:
- id: openai-primary
provider: openai
model: gpt-4o-mini
secret_key_ref:
env: OPENAI_API_KEY
Fallback behavior
The fallback: section controls automatic retry on provider errors.
providers:
fallback:
enabled: true
triggers:
- rate_limit
- server_error
- timeout
- context_length_exceeded
- content_filter
- model_not_found
max_fallback_attempts: 3
targets:
- id: openai-primary
provider: openai
model: gpt-4o-mini
secret_key_ref:
env: OPENAI_API_KEY
Content filter fallback
When a provider's content filter rejects a response, switch to an alternative model.
providers:
fallback:
triggers:
- content_filter
content_policy:
replacement_model: gpt-4o-mini
custom_response_template: The content filter was triggered. Using fallback.
targets:
- id: openai-primary
provider: openai
model: gpt-4o-mini
secret_key_ref:
env: OPENAI_API_KEY
Context window overflow
providers:
fallback:
triggers:
- context_length_exceeded
context_window:
overflow_strategy: truncate
overflow_model: gpt-4o
targets:
- id: openai-primary
provider: openai
model: gpt-4o-mini
secret_key_ref:
env: OPENAI_API_KEY
Context compression
Reduce token usage by compressing long conversations before sending upstream.
providers:
context_compression:
enabled: true
strategy: "middle_out" # middle_out | oldest_first
preserve_system_message: true
preserve_first_n: 2
preserve_last_n: 4
max_messages: 20
message_compression_strategy: "halves"
Zero completion insurance
Automatically handle zero-token completions from providers.
providers:
zero_completion_insurance:
enabled: true
conditions: ["empty_content", "finish_reason_length"]
action: "retry" # suppress_billing | retry | log_only
retry_with_fallback: true
Model groups
Group targets into logical model groups with aliases and fallback chains.
providers:
model_groups:
- name: "fast-models"
aliases: ["fast", "quick"]
description: "Low-latency models for real-time use"
targets: ["groq-llama", "openai-4o-mini"]
fallback_group: "standard-models"
- name: "standard-models"
aliases: ["standard", "default"]
targets: ["openai-4o", "anthropic-claude"]
Provider pipelines
Orchestrate multiple provider targets behind one virtual model name. Use `mode: sequence` when one model's output should feed the next step, or `mode: fan_out` when you want multiple models to answer the same request and combine the successful outputs.
pack:
name: config-providers-providers-25
version: 1.0.0
enabled: true
providers:
targets:
- id: writer
provider: openai
model: gpt-4o-mini
secret_key_ref:
env: OPENAI_API_KEY
- id: reviewer
provider: anthropic
model: claude-sonnet-4-20250514
secret_key_ref:
env: ANTHROPIC_API_KEY
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
Pipeline rules:
- Pipelines are invoked through the normal request `model` field, just like `model_groups`.
- Each step must reference a target with an explicit `providers.targets[].model`. If you want to orchestrate different models from the same provider, declare separate targets for them.
- Sequence steps after the first can append the previous step output to the original request (`append`) or replace the original request with it (`replace`).
- `fan_out` runs every step against the original request and then either concatenates successful outputs in step order or returns the first successful output in config order.
- Pipelines currently support `POST /v1/chat/completions` and `POST /v1/responses`. They are not supported for embeddings, moderations, or the legacy completions endpoint.
- Direct request pinning with `X-Keeptrusts-Provider` or `X-Keeptrusts-Model` is intentionally rejected for pipeline models.
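The pack example earlier in this section declares the `writer` and `reviewer` targets but omits the pipeline block itself. A sketch of what a sequence pipeline might look like — `mode: sequence` and the `append`/`replace` values come from the rules above, while `pipelines`, `name`, `steps`, `target`, and `input` are assumed field names to verify against the Declarative Config Reference:

```yaml
providers:
  # Assumed pipeline shape -- verify field names against the reference.
  pipelines:
    - name: draft-then-review   # clients request model: draft-then-review
      mode: sequence
      steps:
        - target: writer        # drafts a response with gpt-4o-mini
        - target: reviewer      # receives the draft for review
          input: append         # append the writer's output to the original request
```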
Circuit breaker
Automatically remove unhealthy targets from the pool.
providers:
circuit_breaker:
enabled: true
consecutive_failure_threshold: 5
cooldown_seconds: 30
half_open_successes: 1
Retry policy
Configure retry behavior with backoff.
providers:
retry_policy:
max_retries: 3
per_trigger:
rate_limit: 5
timeout: 2
backoff:
strategy: "exponential" # exponential | fixed
base_ms: 500
Scope-based rate limits
Apply rate limits per API key, user, team, or globally on the provider level.
providers:
scope_rate_limits:
per_key:
rpm: 60
tpm: 100000
per_user:
rpm: 30
tpm: 50000
per_team:
rpm: 120
global:
rpm: 500
max_parallel_requests: 20
Traffic mirroring
Shadow traffic to a secondary provider for comparison without affecting the response.
providers:
traffic_mirror:
enabled: true
mirror_target: anthropic-staging
sample_rate: 0.1
log_mirror_response: true
timeout_ms: 5000
targets:
- id: openai-primary
provider: openai
model: gpt-4o-mini
secret_key_ref:
env: OPENAI_API_KEY
A/B testing
Split traffic between provider variants.
providers:
ab_test:
enabled: true
sticky_by: user_id
variants:
- provider_id: openai-4o
weight: 0.7
- provider_id: anthropic-claude
weight: 0.3
targets:
- id: openai-primary
provider: openai
model: gpt-4o-mini
secret_key_ref:
env: OPENAI_API_KEY
Logging controls
providers:
logging:
redact_message_bodies: true # strip request/response bodies from logs
redact_api_keys: true # default: true
Supported providers
The gateway ships with 100+ provider aliases. You only need provider and optionally secret_key_ref — the gateway infers base_url, api_key_header, and path_template from the alias.
| Provider | Aliases | Auth |
|---|---|---|
| OpenAI | openai | OPENAI_API_KEY |
| Anthropic | anthropic, claude | ANTHROPIC_API_KEY |
| Google AI Studio | google, gemini | GOOGLE_AI_STUDIO_KEY |
| Google Vertex AI | google-vertex, vertex | Service account / OAuth2 |
| Azure OpenAI | azure-openai, azure | AZURE_OPENAI_KEY |
| AWS Bedrock | aws-bedrock, bedrock | SigV4 (IAM role / env) |
| Groq | groq | GROQ_API_KEY |
| Mistral | mistral | MISTRAL_API_KEY |
| DeepSeek | deepseek | DEEPSEEK_API_KEY |
| Together AI | togetherai, together | TOGETHER_API_KEY |
| Fireworks AI | fireworks, fireworks-ai | FIREWORKS_API_KEY |
| Cerebras | cerebras | CEREBRAS_API_KEY |
| Perplexity | perplexity | PERPLEXITY_API_KEY |
| OpenRouter | openrouter | OPENROUTER_API_KEY |
| GitHub Models | github, github-models | GITHUB_TOKEN |
| Cohere | cohere | COHERE_API_KEY |
| HuggingFace | huggingface, hf | HF_TOKEN |
| Replicate | replicate | REPLICATE_API_TOKEN |
| Databricks | databricks | OAuth2 / DATABRICKS_TOKEN |
| Snowflake Cortex | snowflake-cortex | OAuth2 |
| Cloudflare AI | cloudflare-ai | CF_AI_TOKEN |
| Cloudflare AI Gateway | cloudflare-gateway | CF_AIG_TOKEN |
| Ollama | ollama | Optional (OLLAMA_API_KEY) |
| vLLM | vllm | Optional |
| LM Studio | lmstudio | Optional |
| llama.cpp | llama-cpp, llama | None |
| Alibaba / Qwen | alibaba, qwen, dashscope | DASHSCOPE_API_KEY |
| SambaNova | sambanova | SAMBANOVA_API_KEY |
| xAI (Grok) | xai, grok | XAI_API_KEY |
| Docker Model Runner | docker, docker-model-runner | Optional |
| Vercel AI Gateway | vercel, vercel-ai | VERCEL_AI_GATEWAY_API_KEY |
Wire format translation
The gateway translates between wire formats automatically. Set format explicitly when the gateway cannot auto-detect:
| Format | Providers |
|---|---|
| openai | OpenAI, Azure, Groq, Mistral, Together, Fireworks, most OpenAI-compatible |
| anthropic | Anthropic, Claude |
| cohere | Cohere |
| huggingface | HuggingFace Inference Endpoints |
| replicate | Replicate |
| watsonx | IBM watsonx |
| google-gemini | Google Gemini (AI Studio and Vertex) |
Complete multi-provider example
pack:
name: production-multi-provider
version: 1.0.0
enabled: true
providers:
targets:
- id: openai-primary
provider: openai
model: gpt-4o
secret_key_ref:
env: OPENAI_API_KEY
- id: anthropic-fallback
provider: anthropic
model: claude-sonnet-4-20250514
secret_key_ref:
env: ANTHROPIC_API_KEY
routing:
strategy: ordered
max_price:
request: 0.5
fallback:
triggers:
- rate_limit
- server_error
- timeout
max_fallback_attempts: 2
circuit_breaker:
enabled: true
consecutive_failure_threshold: 5
cooldown_seconds: 30
retry_policy:
max_retries: 3
backoff:
strategy: exponential
base_ms: 500
policies:
chain:
- prompt-injection
- pii-detector
- quality-scorer
policy:
prompt-injection:
response:
action: block
pii-detector:
action: redact
quality-scorer:
thresholds:
min_aggregate: 0.8
For AI systems
- Canonical terms: Keeptrusts, providers, targets, routing strategy, fallback, circuit breaker, data_policy, secret_key_ref, health_probe, auto_provider
- Config/command names: `providers:` section, `providers.targets[]`, `providers.routing.strategy`, `providers.fallback`, `data_policy.zero_data_retention`, `secret_key_ref.env`, `health_probe`, `oauth2`, `pricing`
- Best next pages: Data Routing Policy, Config Rate Limits, Routes and Consumer Groups, Declarative Config Reference
For engineers
- Prerequisites: A valid policy-config.yaml with a `pack:` section; at least one upstream provider API key stored in an environment variable.
- Validation: Run `kt policy lint --file policy-config.yaml` to validate provider targets. Start the gateway with `kt gateway run --policy-config policy-config.yaml` and verify with `curl http://localhost:8080/keeptrusts/config | jq .providers`.
- Key commands: `kt policy lint`, `kt gateway run`, `curl /keeptrusts/config`
For leaders
- Governance: Provider configuration determines which LLM vendors receive your organization's data. Review `data_policy` declarations to ensure they align with data processing agreements and contractual ZDR guarantees.
- Cost: Token pricing fields (`pricing.input_price_per_million`, `pricing.output_price_per_million`) drive spend tracking and cost assertions. Inaccurate pricing leads to incorrect budget reporting.
- Rollout: Start with a single primary provider and add fallbacks incrementally. Use `routing.strategy: ordered` for deterministic rollout before experimenting with weighted or auto strategies.
Next steps
- Declarative Config Reference — Document shapes and validation rules
- Data Routing Policy — Route by retention and training metadata
- Config Rate Limits — Request and token rate limiting
- Routes and Consumer Groups — Path-based routing and consumer overrides
- Config Runtime — Knowledge Base, history, and auto-provider settings