# OpenRouter
OpenRouter is a unified API gateway that aggregates 100+ language models from OpenAI, Anthropic, Google, Meta, Mistral, and many others under a single endpoint with cross-provider routing, automatic fallback, and transparent per-token pricing. Keeptrusts wraps OpenRouter with a policy enforcement layer, so you can apply consistent prompt injection detection, PII redaction, and audit logging regardless of which underlying provider handles a given request — and use `max_price` to prevent runaway spend when OpenRouter's load balancer routes to a more expensive variant.
## Use this page when
- You need the exact command, config, API, or integration details for OpenRouter.
- You are wiring automation or AI retrieval and need canonical names, examples, and constraints.
- If you want a guided rollout instead of a reference page, use the linked workflow pages in Next steps.
## Primary audience
- Primary: AI Agents, Technical Engineers
- Secondary: Technical Leaders
## Prerequisites

- An OpenRouter API key (`OPENROUTER_API_KEY`)
- `kt` CLI installed and authenticated (`kt auth login`)

Set your key before starting the gateway:

```shell
export OPENROUTER_API_KEY="sk-or-v1-..."
```
## Configuration

### Minimal — single model target
```yaml
pack:
  name: openrouter-providers-1
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: openrouter-gpt4o
      provider: openrouter:chat:openai/gpt-4o
      secret_key_ref:
        env: OPENROUTER_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
```
### Full governance config with cost controls
```yaml
pack:
  name: openrouter-governed
  version: 1.0.0
  enabled: true

policies:
  chain:
    - prompt-injection
    - pii-detector
    - content-filter
    - cost-guard
    - audit-logger
  policy:
    pii-detector:
      action: redact
      entities:
        - PERSON
        - EMAIL_ADDRESS
        - PHONE_NUMBER
        - CREDIT_CARD
        - US_SSN
    content-filter:
      categories:
        - hate_speech
        - harassment
        - self_harm
      action: block
    cost-guard:
      max_tokens_per_request: 8192
      max_cost_per_request_usd: 0.5
    audit-logger:
      destination: api
      include_request: true
      include_response: true
      include_policy_decisions: true

providers:
  targets:
    - id: openrouter-gpt4o
      provider: openrouter:chat:openai/gpt-4o
      secret_key_ref:
        env: OPENROUTER_API_KEY
    - id: openrouter-claude-opus
      provider: openrouter:chat:anthropic/claude-opus-4-5
      secret_key_ref:
        env: OPENROUTER_API_KEY
    - id: openrouter-llama
      provider: openrouter:chat:meta-llama/llama-3.3-70b-instruct
      secret_key_ref:
        env: OPENROUTER_API_KEY
    - id: openrouter-o3
      provider: openrouter:chat:openai/o3
      secret_key_ref:
        env: OPENROUTER_API_KEY
```
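To make the pii-detector's `action: redact` concrete, here is a simplified, regex-based sketch of what redaction does to a payload. This is purely illustrative: the actual Keeptrusts detector uses entity recognition, not regexes, and covers all the entity types listed above.

```python
import re

# Illustrative-only patterns; a production PII detector uses entity
# recognition and also handles PERSON, CREDIT_CARD, US_SSN, etc.
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE_NUMBER": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def redact(text: str) -> str:
    """Replace each detected entity with a typed placeholder."""
    for entity, pattern in PATTERNS.items():
        text = pattern.sub(f"<{entity}>", text)
    return text

print(redact("Contact jane@example.com or +1 415 555 0100."))
# Contact <EMAIL_ADDRESS> or <PHONE_NUMBER>.
```

With `action: redact`, the model never sees the original values; choose `action: block` instead if any PII should reject the request outright.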
## Provider Fields
| Field | Required | Description |
|---|---|---|
| `provider` | Yes | `"openrouter"` or `"openrouter:chat:{provider/model}"` |
| `secret_key_ref` | Yes | Environment variable holding the OpenRouter API key (e.g. `OPENROUTER_API_KEY`) |
| `base_url` | No | Defaults to `https://openrouter.ai/api/v1` — override only for proxied deployments |
| `model` | No | Full `provider/model` path when using the bare `"openrouter"` provider |
| `format` | No | `"openai"` (OpenRouter exposes a fully OpenAI-compatible endpoint) |
| `max_price.input` | No | Maximum acceptable input price in USD per 1M tokens; the request fails if the routed provider charges more |
| `max_price.output` | No | Maximum acceptable output price in USD per 1M tokens |
| `data_policy.training_opt_out` | No | `true` — instructs OpenRouter to pass the `X-OpenRouter-No-Prompt-Training` header to all upstream providers that support it |
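A target using the optional fields above might look like the following sketch. The field names come from the table; the price caps are illustrative values only — set bounds that match your own cost model.

```yaml
providers:
  targets:
    - id: openrouter-gpt4o-capped
      provider: openrouter:chat:openai/gpt-4o
      secret_key_ref:
        env: OPENROUTER_API_KEY
      max_price:
        input: 3.00    # USD per 1M input tokens; request fails if the routed variant charges more
        output: 12.00  # USD per 1M output tokens
      data_policy:
        training_opt_out: true
```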
## Supported Models
OpenRouter aggregates models from dozens of providers. Below are commonly used models with their typical routing costs. Prices shown are OpenRouter's published rates and may vary based on provider availability and load.
| Model Path | Context | Input (per 1M) | Output (per 1M) | Notes |
|---|---|---|---|---|
| `openai/gpt-4o` | 128k | $2.50 | $10.00 | Flagship OpenAI multimodal; strong reasoning |
| `openai/gpt-4o-mini` | 128k | $0.15 | $0.60 | Cost-efficient; recommended for high-volume |
| `openai/o3` | 200k | $10.00 | $40.00 | Advanced reasoning; billed per thinking token |
| `anthropic/claude-opus-4-5` | 200k | $15.00 | $75.00 | Anthropic flagship; strongest for analysis |
| `anthropic/claude-sonnet-4-5` | 200k | $3.00 | $15.00 | Balanced Anthropic model |
| `google/gemini-2.0-flash` | 1M | $0.10 | $0.40 | Very fast multimodal; massive context |
| `meta-llama/llama-3.3-70b-instruct` | 131k | $0.12 | $0.30 | Best open-weight value via OpenRouter |
| `mistralai/mistral-large` | 128k | $2.00 | $6.00 | EU-hosted; strong multilingual |
| `deepseek/deepseek-r1` | 64k | $0.55 | $2.19 | Strong reasoning; low cost |
| `cohere/command-r-plus` | 128k | $2.50 | $10.00 | Enterprise retrieval-optimised |
Visit openrouter.ai/models for the full live catalog with real-time pricing.
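The per-token rates above make request cost easy to estimate before sending. A minimal sketch — the rates are hard-coded from the table and may drift; in practice, read live pricing from openrouter.ai/models:

```python
# Published OpenRouter rates in USD per 1M tokens: (input, output).
# Snapshot of the table above; check openrouter.ai/models for live rates.
RATES = {
    "openai/gpt-4o": (2.50, 10.00),
    "openai/gpt-4o-mini": (0.15, 0.60),
    "meta-llama/llama-3.3-70b-instruct": (0.12, 0.30),
}

def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a single request from its token counts."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# 1,000 prompt tokens + 500 completion tokens on gpt-4o:
print(round(estimate_cost_usd("openai/gpt-4o", 1_000, 500), 4))  # 0.0075
```

Estimates like this are a useful sanity check when choosing `max_cost_per_request_usd` for the cost-guard policy.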
## Client Examples

Start the gateway:

```shell
export OPENROUTER_API_KEY="sk-or-v1-..."
kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml
```
**Python**

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:41002/v1",
    api_key="unused",  # auth handled by Keeptrusts
)

# GPT-4o via OpenRouter
response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[
        {"role": "system", "content": "You are an expert technical writer."},
        {"role": "user", "content": "Summarise the key differences between SOC 2 Type I and Type II."},
    ],
    max_tokens=1024,
    temperature=0.3,
)
print(response.choices[0].message.content)

# Compare Claude Opus on the same prompt
claude_response = client.chat.completions.create(
    model="anthropic/claude-opus-4-5",
    messages=[
        {"role": "system", "content": "You are an expert technical writer."},
        {"role": "user", "content": "Summarise the key differences between SOC 2 Type I and Type II."},
    ],
    max_tokens=1024,
    temperature=0.3,
)
print(claude_response.choices[0].message.content)
```
**Node.js**

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:41002/v1",
  apiKey: "unused", // auth handled by Keeptrusts
});

// GPT-4o via OpenRouter
const response = await client.chat.completions.create({
  model: "openai/gpt-4o",
  messages: [
    { role: "system", content: "You are an expert technical writer." },
    {
      role: "user",
      content: "Summarise the key differences between SOC 2 Type I and Type II.",
    },
  ],
  max_tokens: 1024,
  temperature: 0.3,
});
console.log(response.choices[0].message.content);

// Llama 70B for cost-efficient tasks
const llama = await client.chat.completions.create({
  model: "meta-llama/llama-3.3-70b-instruct",
  messages: [
    { role: "user", content: "List five common causes of slow SQL queries." },
  ],
  max_tokens: 512,
});
console.log(llama.choices[0].message.content);
```
**cURL**

```shell
# GPT-4o via OpenRouter
curl -s http://localhost:41002/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {"role": "system", "content": "You are an expert technical writer."},
      {"role": "user", "content": "Summarise the key differences between SOC 2 Type I and Type II."}
    ],
    "max_tokens": 1024,
    "temperature": 0.3
  }' | jq .choices[0].message.content

# Llama 70B for cost-efficient tasks
curl -s http://localhost:41002/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/llama-3.3-70b-instruct",
    "messages": [
      {"role": "user", "content": "List five common causes of slow SQL queries."}
    ],
    "max_tokens": 512
  }' | jq .choices[0].message.content
```
## Streaming
OpenRouter supports streaming for all models that support it upstream. Keeptrusts passes SSE streams through after enforcing request-level policies.
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:41002/v1", api_key="unused")

stream = client.chat.completions.create(
    model="anthropic/claude-opus-4-5",
    messages=[
        {
            "role": "user",
            "content": "Write a detailed technical design document for a multi-tenant event ingestion pipeline.",
        }
    ],
    max_tokens=4096,
    stream=True,
)
for chunk in stream:
    # Some chunks (e.g. the final usage frame) carry no choices.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```
Note that some OpenRouter providers do not support streaming; OpenRouter will return a non-streaming response in those cases. The Keeptrusts gateway normalises this transparently.
## Advanced Configuration

### Multi-model routing with cost tiers
OpenRouter's strength is its breadth. Use Keeptrusts's routing policy to assign different model tiers to different use cases — keeping costs predictable while giving power users access to frontier models:
```yaml
policies:
  chain:
    - prompt-injection
    - pii-detector
    - router
    - audit-logger
  policy:
    router:
      rules:
        - when_role: analyst
          target: openrouter-gpt4o
        - when_role: developer
          target: openrouter-llama
        - when_role: researcher
          target: openrouter-claude-opus
        - default:
            target: openrouter-gpt4o-mini

providers:
  targets:
    - id: openrouter-claude-opus
      provider: openrouter:chat:anthropic/claude-opus-4-5
      secret_key_ref:
        env: OPENROUTER_API_KEY
    - id: openrouter-gpt4o
      provider: openrouter:chat:openai/gpt-4o
      secret_key_ref:
        env: OPENROUTER_API_KEY
    - id: openrouter-gpt4o-mini
      provider: openrouter:chat:openai/gpt-4o-mini
      secret_key_ref:
        env: OPENROUTER_API_KEY
    - id: openrouter-llama
      provider: openrouter:chat:meta-llama/llama-3.3-70b-instruct
      secret_key_ref:
        env: OPENROUTER_API_KEY
```
### Training opt-out enforcement
OpenRouter passes X-OpenRouter-No-Prompt-Training to upstream providers that support the header. Set training_opt_out: true on each target to ensure your data is not used to fine-tune provider models:
```yaml
pack:
  name: openrouter-providers-4
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: openrouter-gpt4o
      provider: openrouter:chat:openai/gpt-4o
      secret_key_ref:
        env: OPENROUTER_API_KEY
      data_policy:
        training_opt_out: true

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
```
### Rate limiting and budget enforcement
OpenRouter accounts support credit limits, but Keeptrusts adds a per-user rate limiting layer so individual users cannot exhaust your monthly OpenRouter budget:
```yaml
policies:
  chain:
    - rate-limiter
    - prompt-injection
    - pii-detector
    - audit-logger
  policy:
    rate-limiter:
      window: "1h"
      per_user:
        max_requests: 100
        max_tokens: 200000
      per_organization:
        max_requests: 5000
        max_tokens: 10000000
      action_on_exceeded: "block"
```
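When `action_on_exceeded: "block"` trips, well-behaved clients should back off rather than retry immediately. A client-side sketch of capped exponential backoff with jitter — note the exception handling is an assumption: narrow it to whatever error type your SDK raises for the gateway's blocked-request response (e.g. `openai.RateLimitError` if the gateway returns HTTP 429; verify against your deployment):

```python
import random
import time

def with_backoff(call, max_attempts: int = 5, base: float = 1.0, cap: float = 60.0):
    """Retry `call` with capped exponential backoff and full jitter.

    `call` is a zero-argument callable; narrow the `except` clause to the
    rate-limit error type your client library actually raises.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:  # assumption: replace with your SDK's rate-limit error
            if attempt == max_attempts - 1:
                raise  # budget of attempts exhausted; surface the error
            delay = min(cap, base * 2 ** attempt)  # 1s, 2s, 4s, ... capped at 60s
            time.sleep(random.uniform(0, delay))   # full jitter avoids retry stampedes
```

Usage: `with_backoff(lambda: client.chat.completions.create(...))`.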
## Best Practices

- **Always set `max_price` on every OpenRouter target** — OpenRouter's load balancer can route to a more expensive provider variant when the primary is unavailable. Without a `max_price` cap, a single request can cost significantly more than expected. Set bounds that match your cost model for each target.
- **Prefer explicit `provider/model` paths over auto-routing** — OpenRouter supports `model: "auto"`, which picks the cheapest available model. While convenient for exploration, `"auto"` is not deterministic and breaks reproducibility requirements. Pin to explicit paths in production configs.
- **Enable `training_opt_out: true` on all targets** — Many OpenRouter upstreams respect the no-training header when present. This is a low-cost mitigation for data governance without requiring per-provider agreements.
- **Apply identical prompt injection detection regardless of model** — A policy bypass that works on one model may not work on another, but could be discovered via low-cost model probing through OpenRouter. Ensure `prompt-injection` is the first policy in the chain for every OpenRouter target.
- **Use Keeptrusts audit logs to track per-model cost** — OpenRouter provides usage in the response. Keeptrusts audit records capture this and let you build per-user, per-model cost attribution dashboards without depending on OpenRouter's reporting UI.
- **Test fallback chains with low-cost models first** — When configuring failover between targets, test your fallback logic with cheap models (`gpt-4o-mini`, `llama-3.3-70b`) before enabling failover to frontier models. A misconfigured chain that always falls back to `claude-opus-4-5` will be expensive to discover in production.
## For AI systems

- Canonical terms: Keeptrusts gateway, OpenRouter, multi-provider aggregation, model routing, provider target, policy-config.yaml, `provider: "openrouter"`, `OPENROUTER_API_KEY`.
- Config field names: `provider`, `model`, `base_url: "https://openrouter.ai/api/v1"`, `secret_key_ref.env: "OPENROUTER_API_KEY"`, `format: "openai"`, `pricing`, `weight`.
- Provider shorthand: `openrouter:chat:<model>` (e.g., `openrouter:chat:openai/gpt-4o`).
- Key behavior: OpenRouter aggregates models from multiple providers (OpenAI, Anthropic, Meta, etc.) under one API key. Keeptrusts adds policy enforcement on top.
- Best next pages: AIML API integration (alternative aggregator), Provider routing, Policy configuration.
## For engineers

- Prerequisites: OpenRouter API key (`OPENROUTER_API_KEY` env var from openrouter.ai/keys), `kt` CLI installed.
- Start command: `kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml`.
- Validate: `curl http://localhost:41002/v1/chat/completions -H 'Content-Type: application/json' -d '{"model":"openai/gpt-4o","messages":[{"role":"user","content":"hello"}]}'`.
- Model IDs use provider-prefixed format (e.g., `openai/gpt-4o`, `anthropic/claude-sonnet-4-20250514`, `meta-llama/llama-3.3-70b-instruct`).
- Test fallback chains with low-cost models first (`gpt-4o-mini`, `meta-llama/llama-3.3-70b-instruct`) before enabling failover to frontier models.
- OpenRouter uses an OpenAI-compatible API — standard OpenAI SDKs work without modification.
## For leaders

- OpenRouter provides single-API-key access to multiple providers — simplifies vendor management but creates aggregator dependency.
- Cost varies significantly by model and routing — populate `pricing` fields for accurate cost tracking per target.
- Switching between models requires only a model ID change — enables rapid cost/quality experimentation without contract changes.
- A misconfigured fallback chain that always routes to frontier models (e.g., Claude Opus) can be expensive — test with budget models first.
## Next steps

- AIML API integration — alternative model aggregation endpoint
- OpenAI integration — direct OpenAI access without aggregation layer
- Provider routing strategies — Keeptrusts-native multi-provider routing
- Policy configuration — prompt-injection and audit-logger reference
- Quickstart — install `kt` and run your first gateway