Rate Limits Configuration

Keeptrusts supports five independent rate limiting scopes, each configured as a top-level section in your policy config. All scopes can optionally use a distributed Redis/Valkey backend for multi-instance coordination.

Use this page when

  • You are configuring request, IP, user, or token rate limits for your Keeptrusts gateway.
  • You need distributed rate limiting across multiple gateway instances using Redis or Valkey.
  • You are tuning size limits for request bodies, headers, or response payloads.

Primary audience

  • Primary: AI Agents, Technical Engineers
  • Secondary: Technical Leaders

Quick reference

global_rate_limit:
  max_requests: 1000
  window_seconds: 60

ip_rate_limit:
  max_requests: 100
  window_seconds: 60

user_rate_limit:
  max_requests: 30
  window_seconds: 60
  header_names: ["x-user-id"]

token_rate_limit:
  max_tokens: 500000
  window_seconds: 3600
  scope: "global"

size_limits:
  max_body_bytes: 1048576
  max_response_bytes: 10485760

Global rate limit

A single counter for all requests to this gateway instance.

global_rate_limit:
  max_requests: 1000   # required
  window_seconds: 60   # required

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| max_requests | integer | yes | — | Maximum requests per window |
| window_seconds | integer | yes | — | Fixed window size in seconds |

Runtime behavior: Atomic counter with epoch-based fixed window reset. Returns HTTP 429 Too Many Requests with Retry-After header when exceeded.
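The fixed-window behavior described above can be sketched as a minimal in-memory model (for illustration only; the actual gateway implementation and its Redis-backed variant may differ):

```python
import time
from typing import Optional, Tuple


class FixedWindowLimiter:
    """Epoch-aligned fixed window: every process computing the same
    window index resets its counter at the same wall-clock boundary."""

    def __init__(self, max_requests: int, window_seconds: int):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.window_index = -1
        self.count = 0

    def allow(self, now: Optional[float] = None) -> Tuple[bool, int]:
        now = time.time() if now is None else now
        index = int(now) // self.window_seconds
        if index != self.window_index:
            # Crossed an epoch boundary: start a fresh window.
            self.window_index = index
            self.count = 0
        if self.count >= self.max_requests:
            # Over budget: the gateway answers 429 with this Retry-After.
            retry_after = (index + 1) * self.window_seconds - int(now)
            return False, retry_after
        self.count += 1
        return True, 0
```

Because windows are epoch-aligned rather than per-client, the counter resets at predictable boundaries (e.g. every minute on the minute for window_seconds: 60).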

Per-IP rate limit

Independent counters per client IP address.

ip_rate_limit:
  max_requests: 100      # required
  window_seconds: 60     # required
  trust_proxy_depth: 1   # optional, default: 0

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| max_requests | integer | yes | — | Maximum requests per IP per window |
| window_seconds | integer | yes | — | Window size in seconds |
| trust_proxy_depth | integer | no | 0 | Number of X-Forwarded-For hops to trust; 0 uses the direct connection IP |

Behind a reverse proxy

When running behind nginx or a load balancer, set trust_proxy_depth to the number of trusted proxies in the chain:

# Gateway behind one nginx reverse proxy
ip_rate_limit:
  max_requests: 50
  window_seconds: 60
  trust_proxy_depth: 1
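One plausible reading of trust_proxy_depth, sketched below for illustration: with depth N, the client IP is taken as the Nth-from-last X-Forwarded-For entry (the entry appended by the outermost trusted proxy), and depth 0 ignores the header entirely. The exact parsing rules of the gateway may differ.

```python
def client_ip(remote_addr: str, xff_header: str, trust_proxy_depth: int) -> str:
    """Resolve the client IP from X-Forwarded-For, trusting the last
    trust_proxy_depth hops; fall back to the direct connection address."""
    if trust_proxy_depth <= 0 or not xff_header:
        return remote_addr
    hops = [h.strip() for h in xff_header.split(",") if h.strip()]
    if len(hops) < trust_proxy_depth:
        # Header is shorter than the trusted chain: don't trust it.
        return remote_addr
    return hops[-trust_proxy_depth]
```

Setting the depth higher than the number of proxies you control lets clients spoof their IP by sending their own X-Forwarded-For values, so keep it exactly equal to your trusted chain length.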

Per-user rate limit

Independent counters per user identity extracted from request headers.

user_rate_limit:
  max_requests: 30     # required
  window_seconds: 60   # required
  header_names:        # optional
    - "x-user-id"
    - "x-consumer-id"

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| max_requests | integer | yes | — | Maximum requests per user per window |
| window_seconds | integer | yes | — | Window size in seconds |
| header_names | string[] | no | ["x-user-id"] | Headers to extract user identity from; first non-empty value wins |

Requests without a matching header are bucketed as unknown and share a single counter.
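The extraction rule ("first non-empty value wins, otherwise the unknown bucket") can be sketched like this; it is a simplified model, not the gateway's code:

```python
from typing import Mapping, Sequence


def user_bucket(headers: Mapping[str, str],
                header_names: Sequence[str] = ("x-user-id",)) -> str:
    """Return the rate-limit bucket key for a request: the first
    configured header with a non-empty value, else 'unknown'."""
    # Header names are case-insensitive in HTTP.
    lowered = {k.lower(): v for k, v in headers.items()}
    for name in header_names:
        value = lowered.get(name.lower(), "").strip()
        if value:
            return value
    return "unknown"
```

Note that the shared unknown bucket means anonymous traffic competes for a single counter; if most of your traffic lacks identity headers, the per-user limit effectively becomes a global one for those requests.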

Token rate limit

Sliding-window limit on total LLM token consumption (prompt + completion).

token_rate_limit:
  max_tokens: 500000     # required
  window_seconds: 3600   # required
  scope: "global"        # optional: global | per_key | per_ip

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| max_tokens | integer | yes | — | Maximum tokens consumed per window |
| window_seconds | integer | yes | — | Sliding window size in seconds |
| scope | string | no | "global" | Bucketing scope: global, per_key, or per_ip |

Runtime behavior: Uses a 6-sub-window sliding window. Tokens are recorded after the upstream response (from usage.total_tokens), and each subsequent request is checked against the remaining budget. When the budget is exhausted, the gateway returns HTTP 429.
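A 6-sub-window sliding window can be modeled as follows. This is an illustrative sketch of the general technique (usage is binned into six fixed sub-windows and the trailing sum is checked before each request); the gateway's actual bookkeeping may differ in detail:

```python
class SlidingTokenLimiter:
    """Sliding window approximated by 6 fixed sub-windows: token usage is
    recorded into the current sub-window, and the budget check sums every
    sub-window that still overlaps the trailing window."""

    SUB_WINDOWS = 6

    def __init__(self, max_tokens: int, window_seconds: int):
        self.max_tokens = max_tokens
        self.sub_size = window_seconds / self.SUB_WINDOWS
        self.buckets = {}  # sub-window index -> tokens recorded

    def _evict(self, now: float) -> None:
        # Drop sub-windows that have slid entirely out of the window.
        current = int(now / self.sub_size)
        for idx in list(self.buckets):
            if idx <= current - self.SUB_WINDOWS:
                del self.buckets[idx]

    def allow(self, now: float) -> bool:
        self._evict(now)
        return sum(self.buckets.values()) < self.max_tokens

    def record(self, tokens: int, now: float) -> None:
        # Called after the upstream response, with usage.total_tokens.
        idx = int(now / self.sub_size)
        self.buckets[idx] = self.buckets.get(idx, 0) + tokens
```

Because tokens are recorded only after the response arrives, a single request can overshoot the budget; the limit bounds the next request, not the in-flight one.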

Scope examples

# Global: single bucket for all traffic
token_rate_limit:
  max_tokens: 1000000
  window_seconds: 3600
  scope: "global"

# Per API key: each key gets its own budget
token_rate_limit:
  max_tokens: 100000
  window_seconds: 3600
  scope: "per_key"

# Per IP: each client IP gets its own budget
token_rate_limit:
  max_tokens: 50000
  window_seconds: 3600
  scope: "per_ip"

Size limits

Byte-level limits on request and response payloads.

size_limits:
  max_body_bytes: 1048576       # 1 MB request body
  max_header_bytes: 8192        # 8 KB headers
  max_url_bytes: 4096           # 4 KB URL
  max_response_bytes: 10485760  # 10 MB response

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| max_body_bytes | integer | no | unlimited | Maximum request body size |
| max_header_bytes | integer | no | unlimited | Maximum total header size |
| max_url_bytes | integer | no | unlimited | Maximum URL length |
| max_response_bytes | integer | no | unlimited | Maximum response body size |

Exceeding any limit returns HTTP 413 Payload Too Large. Limits are checked in order: body → headers → URL → response.
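The documented check order can be sketched as a simple first-match function (illustrative only; the real gateway checks request-side limits before forwarding and the response limit afterwards):

```python
from typing import Mapping, Optional


def first_exceeded(body_len: int, header_len: int, url_len: int,
                   response_len: int,
                   limits: Mapping[str, int]) -> Optional[str]:
    """Return the name of the first size limit exceeded, in the
    documented order body -> headers -> URL -> response, or None.
    A missing key means that limit is unconfigured (unlimited)."""
    checks = [
        ("max_body_bytes", body_len),
        ("max_header_bytes", header_len),
        ("max_url_bytes", url_len),
        ("max_response_bytes", response_len),
    ]
    for field, actual in checks:
        limit = limits.get(field)
        if limit is not None and actual > limit:
            return field  # the gateway would answer HTTP 413 here
    return None
```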

Consumer group overrides can relax or tighten these per-key:

consumer_groups:
  groups:
    - name: "enterprise"
      api_keys: ["sha256:abc123..."]
      # size overrides applied via runtime API (future)

Distributed rate limiting

By default, rate limit counters are per-process (in-memory). For multi-instance deployments, use a shared Redis or Valkey backend.

Inline configuration

distributed_rate_limit:
  backend: "redis"
  url_env: "REDIS_URL"

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| backend | string | yes | — | redis or valkey (both use the Redis wire protocol) |
| url_env | string | no | "REDIS_URL" | Environment variable containing the connection URL |

Nested under any rate limit section

You can also embed the distributed: block under any rate limit section:

global_rate_limit:
  max_requests: 1000
  window_seconds: 60
  distributed:
    backend: "valkey"
    url_env: "KEEPTRUSTS_REDIS_URL"

The gateway checks these JSON paths in priority order:

  1. /distributed_rate_limit (top-level)
  2. /global_rate_limit/distributed
  3. /ip_rate_limit/distributed
  4. /token_rate_limit/distributed

Hosted gateway automatic configuration

In hosted gateway mode, the distributed backend is auto-configured from KEEPTRUSTS_REDIS_URL or REDIS_URL environment variables without needing the YAML section.
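The environment-variable fallback amounts to a simple precedence chain. Assuming the product-specific variable takes precedence over the generic one, as the listing order suggests, it could be modeled as:

```python
import os
from typing import Mapping, Optional


def resolve_redis_url(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Pick the distributed-backend URL: KEEPTRUSTS_REDIS_URL first,
    then REDIS_URL; None means the distributed backend stays disabled."""
    return env.get("KEEPTRUSTS_REDIS_URL") or env.get("REDIS_URL") or None
```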

Complete rate limiting example

pack:
  name: "rate-limited-gateway"
  version: "1.0.0"
  enabled: true

# Global ceiling
global_rate_limit:
  max_requests: 5000
  window_seconds: 60

# Per-IP protection
ip_rate_limit:
  max_requests: 100
  window_seconds: 60
  trust_proxy_depth: 1

# Per-user fairness
user_rate_limit:
  max_requests: 30
  window_seconds: 60
  header_names: ["x-user-id", "x-consumer-id"]

# Token budget
token_rate_limit:
  max_tokens: 1000000
  window_seconds: 3600
  scope: "global"

# Request size protection
size_limits:
  max_body_bytes: 2097152       # 2 MB
  max_response_bytes: 20971520  # 20 MB

# Multi-instance coordination
distributed_rate_limit:
  backend: "redis"
  url_env: "REDIS_URL"

providers:
  targets:
    - id: "openai-prod"
      provider: "openai"
      model: "gpt-4o"
      secret_key_ref:
        env: "OPENAI_API_KEY"

policies:
  chain:
    - "audit-logger"

For AI systems

  • Canonical terms: Keeptrusts, global_rate_limit, ip_rate_limit, user_rate_limit, token_rate_limit, size_limits, distributed_rate_limit, window_seconds, max_requests, max_tokens
  • Config/command names: global_rate_limit, ip_rate_limit, user_rate_limit, token_rate_limit, size_limits, distributed_rate_limit, trust_proxy_depth, scope (global/per_key/per_ip)
  • Best next pages: Routes and Consumer Groups, Providers Configuration, Security and Network Configuration

For engineers

  • Prerequisites: A policy-config.yaml file. For distributed rate limiting, a Redis or Valkey instance accessible from all gateway instances.
  • Validation: Lint with kt policy lint --file policy-config.yaml. Send requests exceeding the configured limit and verify HTTP 429 responses with Retry-After headers. For distributed mode, confirm Redis connectivity at startup in gateway logs.
  • Key commands: kt policy lint, kt gateway run, curl -w '%{http_code}'

For leaders

  • Governance: Rate limits protect upstream provider budgets and prevent abuse. Token rate limits directly bound per-user spend. Set limits based on contracted API capacity and fair-use policies.
  • Cost: Without rate limits, a single misbehaving client can exhaust your entire provider quota. Token limits at per_key scope give each team a bounded budget.
  • Rollout: Start with generous global limits, monitor via Events, then tighten per-IP and per-user limits based on observed traffic patterns.

Next steps