Providers Configuration

The providers: section of your policy config declares upstream LLM provider targets, routing strategy, fallback behavior, data handling policies, and advanced features like circuit breakers, A/B testing, and traffic mirroring.

Use this page when

  • You are configuring upstream LLM provider targets, routing strategies, or fallback behavior in policy-config.yaml.
  • You need to set up multi-provider routing, circuit breakers, A/B testing, or traffic mirroring.
  • You are declaring data handling policies, token pricing, or health probes for your provider fleet.

Primary audience

  • Primary: AI Agents, Technical Engineers
  • Secondary: Technical Leaders

Quick reference

pack:
  name: config-providers-providers-1
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: primary
      provider: openai
      model: gpt-4o
      secret_key_ref:
        env: OPENAI_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true

Provider targets

Each target describes a single upstream endpoint. The gateway evaluates targets in order when using the ordered routing strategy.

Minimal target

pack:
  name: config-providers-providers-2
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: openai-prod
      provider: openai
      model: gpt-4o
      secret_key_ref:
        env: OPENAI_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true

Only id and provider are required. When base_url and secret_key_ref are omitted, the gateway infers defaults from the provider alias (see Supported Providers).
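
Because defaults are inferred, a target for a well-known alias can be trimmed further. A sketch that relies entirely on alias defaults (the gateway reads OPENAI_API_KEY from the environment, per Supported Providers):

providers:
  targets:
    - id: openai-default
      provider: openai # base_url and API key env var inferred from the alias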

Full target reference

pack:
  name: config-providers-providers-3
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: openai-prod
      provider: openai
      model: gpt-4o
      base_url: https://api.openai.com
      secret_key_ref:
        env: OPENAI_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
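
Beyond base_url, this page references two more per-target fields: weight (consulted by the weighted_round_robin strategy) and format (see Wire format translation). A sketch with the placement of format on the target assumed:

providers:
  targets:
    - id: openai-prod
      provider: openai
      model: gpt-4o
      base_url: https://api.openai.com
      weight: 2 # used by weighted_round_robin
      format: openai # explicit wire format hint (placement assumed)
      secret_key_ref:
        env: OPENAI_API_KEY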

Data policies

Declare the data handling guarantees your provider contract includes. The data-routing-policy policy uses these to filter targets.

pack:
  name: config-providers-providers-4
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: openai-zdr
      provider: openai
      model: gpt-4o
      secret_key_ref:
        env: OPENAI_API_KEY
    - id: anthropic-standard
      provider: anthropic
      model: claude-sonnet-4-20250514
      secret_key_ref:
        env: ANTHROPIC_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
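
The example above declares the two targets but not the data handling declaration itself. A sketch using the data_policy.zero_data_retention field from this page's field list; the exact shape is assumed:

providers:
  targets:
    - id: openai-zdr
      provider: openai
      model: gpt-4o
      data_policy:
        zero_data_retention: true # contractual ZDR guarantee (shape assumed)
      secret_key_ref:
        env: OPENAI_API_KEY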

Token pricing

Declare pricing for spend tracking, cost assertions, and auto provider scoring.

pack:
  name: config-providers-providers-5
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: openai-4o
      provider: openai
      model: gpt-4o
      secret_key_ref:
        env: OPENAI_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
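
The pricing block itself is omitted above. A sketch using the pricing.input_price_per_million and pricing.output_price_per_million fields referenced under For leaders; the prices are illustrative:

providers:
  targets:
    - id: openai-4o
      provider: openai
      model: gpt-4o
      pricing:
        input_price_per_million: 2.50 # USD per million prompt tokens (illustrative)
        output_price_per_million: 10.00 # USD per million completion tokens (illustrative)
      secret_key_ref:
        env: OPENAI_API_KEY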

Nested models

A single target can serve multiple models. Each model can have its own aliases, pricing, and escalation routing.

pack:
  name: config-providers-providers-6
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: openai-prod
      provider: openai
      base_url: https://api.openai.com
      secret_key_ref:
        env: OPENAI_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
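
The nested models list is not shown above. A sketch of how per-model aliases and pricing might nest under a target; apart from the pricing fields, the key names here are assumed:

providers:
  targets:
    - id: openai-prod
      provider: openai
      base_url: https://api.openai.com
      secret_key_ref:
        env: OPENAI_API_KEY
      models: # nested model list (shape assumed)
        - name: gpt-4o
          aliases: ["smart"]
          pricing:
            input_price_per_million: 2.50
            output_price_per_million: 10.00
        - name: gpt-4o-mini
          aliases: ["cheap"]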

Escalation routing override

Route escalations to specific teams or users per provider or model.

pack:
  name: config-providers-providers-7
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: healthcare-openai
      provider: openai
      model: gpt-4o
      secret_key_ref:
        env: OPENAI_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
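
The override itself is not shown above. A purely illustrative sketch of per-target escalation routing; every field name here is hypothetical:

providers:
  targets:
    - id: healthcare-openai
      provider: openai
      model: gpt-4o
      escalation: # hypothetical shape, for illustration only
        team: healthcare-compliance
        users: ["oncall-reviewer"]
      secret_key_ref:
        env: OPENAI_API_KEY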

Health probes

Active health probes run in the background and mark providers unhealthy on consecutive failures.

pack:
  name: config-providers-providers-8
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: openai-prod
      provider: openai
      model: gpt-4o
      secret_key_ref:
        env: OPENAI_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
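
The probe configuration is omitted above. A sketch of the health_probe field from this page's field list; the sub-field names are assumed:

providers:
  targets:
    - id: openai-prod
      provider: openai
      model: gpt-4o
      health_probe: # sub-fields assumed
        interval_seconds: 30
        unhealthy_threshold: 3 # consecutive failures before the target is marked unhealthy
      secret_key_ref:
        env: OPENAI_API_KEY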

OAuth2 authentication

For providers that require OAuth2 bearer tokens, such as Databricks, Snowflake Cortex, and custom endpoints.

pack:
  name: config-providers-providers-9
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: databricks-prod
      provider: databricks
      model: databricks-meta-llama-3-70b-instruct
      base_url: https://my-workspace.databricks.net/serving-endpoints

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
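
The token acquisition block is omitted above. A sketch of the oauth2 field from this page's field list; the sub-field names mirror the secret_key_ref pattern and are assumed:

providers:
  targets:
    - id: databricks-prod
      provider: databricks
      model: databricks-meta-llama-3-70b-instruct
      base_url: https://my-workspace.databricks.net/serving-endpoints
      oauth2: # sub-fields assumed
        token_url: https://my-workspace.databricks.net/oidc/v1/token
        client_id_ref:
          env: DATABRICKS_CLIENT_ID
        client_secret_ref:
          env: DATABRICKS_CLIENT_SECRET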

Cloud-specific fields

pack:
  name: config-providers-providers-10
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: azure-gpt4
      provider: azure-openai
      model: gpt-4o
      base_url: https://my-resource.openai.azure.com
      secret_key_ref:
        env: AZURE_OPENAI_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
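
The Azure example above carries only base_url. Cloud deployments typically also need deployment, API version, or region parameters; a sketch with all of these field names assumed:

providers:
  targets:
    - id: azure-gpt4
      provider: azure-openai
      model: gpt-4o
      base_url: https://my-resource.openai.azure.com
      deployment: my-gpt4o-deployment # Azure deployment name (field name assumed)
      api_version: "2024-06-01" # (field name assumed)
      secret_key_ref:
        env: AZURE_OPENAI_KEY
    - id: bedrock-claude
      provider: aws-bedrock
      model: anthropic.claude-3-5-sonnet-20240620-v1:0
      region: us-east-1 # SigV4 signing region (field name assumed)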

Routing strategies

The routing.strategy field controls how the gateway selects a target for each request.

providers:
  routing:
    strategy: ordered
  targets:
    - id: openai-primary
      provider: openai
      model: gpt-4o-mini
      secret_key_ref:
        env: OPENAI_API_KEY

Strategy               Description
ordered                Try targets in declaration order; first healthy target wins
round_robin            Cycle through targets evenly
weighted_round_robin   Cycle weighted by each target's weight field
random                 Random selection
simple_shuffle         Shuffle then iterate
lowest_latency         Select target with lowest observed P50/P90/P99 latency
highest_throughput     Select target with highest observed throughput
least_connections      Select target with fewest in-flight requests
least_busy             Select target with lowest current load
usage_based            Select target with lowest cumulative usage
semantic               Match request embedding against target descriptions

Latency / throughput preferences

providers:
  routing:
    strategy: lowest_latency
    window_seconds: 300
    min_sample_count: 10
    exploration_ratio: 0.1
    preferred_max_latency:
      value: 500
      percentile: p90
    preferred_min_throughput:
      value: 50
      percentile: p50
  targets:
    - id: openai-primary
      provider: openai
      model: gpt-4o-mini
      secret_key_ref:
        env: OPENAI_API_KEY

Cost ceiling

providers:
  routing:
    strategy: ordered
    max_price:
      prompt: 0.01
      completion: 0.05
      request: 0.1
  targets:
    - id: openai-primary
      provider: openai
      model: gpt-4o-mini
      secret_key_ref:
        env: OPENAI_API_KEY

Region and quantization filtering

providers:
  routing:
    strategy: ordered
    require_region: eu
    require_quantizations:
      - fp16
      - int8
  targets:
    - id: openai-primary
      provider: openai
      model: gpt-4o-mini
      secret_key_ref:
        env: OPENAI_API_KEY

Target filtering

providers:
  routing:
    strategy: round_robin
    only:
      - openai-prod
      - anthropic-prod
    ignore:
      - openai-dev
  targets:
    - id: openai-primary
      provider: openai
      model: gpt-4o-mini
      secret_key_ref:
        env: OPENAI_API_KEY

Fallback behavior

The fallback: section controls automatic retry on provider errors.

providers:
  fallback:
    enabled: true
    triggers:
      - rate_limit
      - server_error
      - timeout
      - context_length_exceeded
      - content_filter
      - model_not_found
    max_fallback_attempts: 3
  targets:
    - id: openai-primary
      provider: openai
      model: gpt-4o-mini
      secret_key_ref:
        env: OPENAI_API_KEY

Content filter fallback

When a provider's content filter rejects a response, switch to an alternative model.

providers:
  fallback:
    triggers:
      - content_filter
    content_policy:
      replacement_model: gpt-4o-mini
      custom_response_template: "The content filter was triggered. Using fallback."
  targets:
    - id: openai-primary
      provider: openai
      model: gpt-4o-mini
      secret_key_ref:
        env: OPENAI_API_KEY

Context window overflow

providers:
  fallback:
    triggers:
      - context_length_exceeded
    context_window:
      overflow_strategy: truncate
      overflow_model: gpt-4o
  targets:
    - id: openai-primary
      provider: openai
      model: gpt-4o-mini
      secret_key_ref:
        env: OPENAI_API_KEY

Context compression

Reduce token usage by compressing long conversations before sending upstream.

providers:
  context_compression:
    enabled: true
    strategy: "middle_out" # middle_out | oldest_first
    preserve_system_message: true
    preserve_first_n: 2
    preserve_last_n: 4
    max_messages: 20
    message_compression_strategy: "halves"

Zero completion insurance

Automatically handle zero-token completions from providers.

providers:
  zero_completion_insurance:
    enabled: true
    conditions: ["empty_content", "finish_reason_length"]
    action: "retry" # suppress_billing | retry | log_only
    retry_with_fallback: true

Model groups

Group targets into logical model groups with aliases and fallback chains.

providers:
  model_groups:
    - name: "fast-models"
      aliases: ["fast", "quick"]
      description: "Low-latency models for real-time use"
      targets: ["groq-llama", "openai-4o-mini"]
      fallback_group: "standard-models"
    - name: "standard-models"
      aliases: ["standard", "default"]
      targets: ["openai-4o", "anthropic-claude"]

Provider pipelines

Orchestrate multiple provider targets behind one virtual model name. Use mode: sequence when one model's output should feed the next step, or mode: fan_out when you want multiple models to answer the same request and combine the successful outputs.

pack:
  name: config-providers-providers-25
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: writer
      provider: openai
      model: gpt-4o-mini
      secret_key_ref:
        env: OPENAI_API_KEY
    - id: reviewer
      provider: anthropic
      model: claude-sonnet-4-20250514
      secret_key_ref:
        env: ANTHROPIC_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
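
The pack above declares the writer and reviewer targets but not the pipeline wiring itself. A sketch of a two-step sequence pipeline; mode, append, and replace come from the rules below, while the block name and step keys are assumed:

providers:
  pipelines: # block name assumed
    - name: draft-then-review
      mode: sequence # sequence | fan_out
      steps:
        - target: writer
        - target: reviewer
          input: append # feed the previous step's output in; replace would substitute it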

Pipeline rules:

  • Pipelines are invoked through the normal request model field, just like model_groups.
  • Each step must reference a target with an explicit providers.targets[].model. If you want to orchestrate different models from the same provider, declare separate targets for them.
  • Sequence steps after the first can append the previous step output to the original request (append) or replace the original request with it (replace).
  • fan_out runs every step against the original request and then either concatenates successful outputs in step order or returns the first successful output in config order.
  • Pipelines currently support POST /v1/chat/completions and POST /v1/responses. They are not supported for embeddings, moderations, or the legacy completions endpoint.
  • Direct request pinning with X-Keeptrusts-Provider or X-Keeptrusts-Model is intentionally rejected for pipeline models.

Circuit breaker

Automatically remove unhealthy targets from the pool.

providers:
  circuit_breaker:
    enabled: true
    consecutive_failure_threshold: 5
    cooldown_seconds: 30
    half_open_successes: 1

Retry policy

Configure retry behavior with backoff.

providers:
  retry_policy:
    max_retries: 3
    per_trigger:
      rate_limit: 5
      timeout: 2
    backoff:
      strategy: "exponential" # exponential | fixed
      base_ms: 500

Scope-based rate limits

Apply rate limits per API key, user, or team, or globally at the provider level.

providers:
  scope_rate_limits:
    per_key:
      rpm: 60
      tpm: 100000
    per_user:
      rpm: 30
      tpm: 50000
    per_team:
      rpm: 120
    global:
      rpm: 500
      max_parallel_requests: 20

Traffic mirroring

Shadow traffic to a secondary provider for comparison without affecting the response.

providers:
  traffic_mirror:
    enabled: true
    mirror_target: anthropic-staging
    sample_rate: 0.1
    log_mirror_response: true
    timeout_ms: 5000
  targets:
    - id: openai-primary
      provider: openai
      model: gpt-4o-mini
      secret_key_ref:
        env: OPENAI_API_KEY

A/B testing

Split traffic between provider variants.

providers:
  ab_test:
    enabled: true
    sticky_by: user_id
    variants:
      - provider_id: openai-4o
        weight: 0.7
      - provider_id: anthropic-claude
        weight: 0.3
  targets:
    - id: openai-primary
      provider: openai
      model: gpt-4o-mini
      secret_key_ref:
        env: OPENAI_API_KEY

Logging controls

providers:
  logging:
    redact_message_bodies: true # strip request/response bodies from logs
    redact_api_keys: true # default: true

Supported providers

The gateway ships with 100+ provider aliases. A target usually needs only id and provider, plus secret_key_ref if the default environment variable isn't set — the gateway infers base_url, api_key_header, and path_template from the alias.

Provider                 Aliases                          Auth
OpenAI                   openai                           OPENAI_API_KEY
Anthropic                anthropic, claude                ANTHROPIC_API_KEY
Google AI Studio         google, gemini                   GOOGLE_AI_STUDIO_KEY
Google Vertex AI         google-vertex, vertex            Service account / OAuth2
Azure OpenAI             azure-openai, azure              AZURE_OPENAI_KEY
AWS Bedrock              aws-bedrock, bedrock             SigV4 (IAM role / env)
Groq                     groq                             GROQ_API_KEY
Mistral                  mistral                          MISTRAL_API_KEY
DeepSeek                 deepseek                         DEEPSEEK_API_KEY
Together AI              togetherai, together             TOGETHER_API_KEY
Fireworks AI             fireworks, fireworks-ai          FIREWORKS_API_KEY
Cerebras                 cerebras                         CEREBRAS_API_KEY
Perplexity               perplexity                       PERPLEXITY_API_KEY
OpenRouter               openrouter                       OPENROUTER_API_KEY
GitHub Models            github, github-models            GITHUB_TOKEN
Cohere                   cohere                           COHERE_API_KEY
HuggingFace              huggingface, hf                  HF_TOKEN
Replicate                replicate                        REPLICATE_API_TOKEN
Databricks               databricks                       OAuth2 / DATABRICKS_TOKEN
Snowflake Cortex         snowflake-cortex                 OAuth2
Cloudflare AI            cloudflare-ai                    CF_AI_TOKEN
Cloudflare AI Gateway    cloudflare-gateway               CF_AIG_TOKEN
Ollama                   ollama                           Optional (OLLAMA_API_KEY)
vLLM                     vllm                             Optional
LM Studio                lmstudio                         Optional
llama.cpp                llama-cpp, llama                 None
Alibaba / Qwen           alibaba, qwen, dashscope         DASHSCOPE_API_KEY
SambaNova                sambanova                        SAMBANOVA_API_KEY
xAI (Grok)               xai, grok                        XAI_API_KEY
Docker Model Runner      docker, docker-model-runner      Optional
Vercel AI Gateway        vercel, vercel-ai                VERCEL_AI_GATEWAY_API_KEY

Wire format translation

The gateway translates between wire formats automatically. Set format explicitly when the gateway cannot auto-detect:

Format           Providers
openai           OpenAI, Azure, Groq, Mistral, Together, Fireworks, most OpenAI-compatible
anthropic        Anthropic, Claude
cohere           Cohere
huggingface      HuggingFace Inference Endpoints
replicate        Replicate
watsonx          IBM watsonx
google-gemini    Google Gemini (AI Studio and Vertex)

Complete multi-provider example

pack:
  name: production-multi-provider
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: openai-primary
      provider: openai
      model: gpt-4o
      secret_key_ref:
        env: OPENAI_API_KEY
    - id: anthropic-fallback
      provider: anthropic
      model: claude-sonnet-4-20250514
      secret_key_ref:
        env: ANTHROPIC_API_KEY
  routing:
    strategy: ordered
    max_price:
      request: 0.5
  fallback:
    triggers:
      - rate_limit
      - server_error
      - timeout
    max_fallback_attempts: 2
  circuit_breaker:
    enabled: true
    consecutive_failure_threshold: 5
    cooldown_seconds: 30
  retry_policy:
    max_retries: 3
    backoff:
      strategy: exponential
      base_ms: 500

policies:
  chain:
    - prompt-injection
    - pii-detector
    - quality-scorer
  policy:
    prompt-injection:
      response:
        action: block
    pii-detector:
      action: redact
    quality-scorer:
      thresholds:
        min_aggregate: 0.8

For AI systems

  • Canonical terms: Keeptrusts, providers, targets, routing strategy, fallback, circuit breaker, data_policy, secret_key_ref, health_probe, auto_provider
  • Config/command names: providers: section, providers.targets[], providers.routing.strategy, providers.fallback, data_policy.zero_data_retention, secret_key_ref.env, health_probe, oauth2, pricing
  • Best next pages: Data Routing Policy, Config Rate Limits, Routes and Consumer Groups, Declarative Config Reference

For engineers

  • Prerequisites: A valid policy-config.yaml with a pack: section; at least one upstream provider API key stored in an environment variable.
  • Validation: Run kt policy lint --file policy-config.yaml to validate provider targets. Start the gateway with kt gateway run --policy-config policy-config.yaml and verify with curl http://localhost:8080/keeptrusts/config | jq .providers.
  • Key commands: kt policy lint, kt gateway run, curl /keeptrusts/config

For leaders

  • Governance: Provider configuration determines which LLM vendors receive your organization's data. Review data_policy declarations to ensure they align with data processing agreements and contractual ZDR guarantees.
  • Cost: Token pricing fields (pricing.input_price_per_million, pricing.output_price_per_million) drive spend tracking and cost assertions. Inaccurate pricing leads to incorrect budget reporting.
  • Rollout: Start with a single primary provider and add fallbacks incrementally. Use routing.strategy: ordered for deterministic rollout before experimenting with weighted or auto strategies.

Next steps