
Cloudflare AI Gateway

Cloudflare AI Gateway is a caching, rate limiting, and observability layer for AI traffic that runs on Cloudflare's global edge network. It supports routing to OpenAI, Anthropic, Groq, Workers AI, Azure OpenAI, and many other providers through a single unified endpoint, and adds built-in analytics, request logging, and rate controls.

Use this page when

  • You need the exact command, config, API, or integration details for Cloudflare AI Gateway.
  • You are wiring automation or AI retrieval and need canonical names, examples, and constraints.
  • You want a guided rollout rather than a reference page — in that case, follow the linked workflow pages in Next steps.

Keeptrusts adds policy enforcement and compliance governance on top of Cloudflare AI Gateway. By placing Keeptrusts in front of the Cloudflare gateway, you get Keeptrusts's prompt-injection detection, PII redaction, content safety filters, and audit logging applied before requests reach Cloudflare — giving you a two-layer observability and governance stack.

Keeptrusts performs gateway-specific URL derivation: given your cloudflare_account_id, cloudflare_gateway_id, and the gateway provider sub-path (e.g. openai, workers-ai, anthropic), Keeptrusts derives the correct https://gateway.ai.cloudflare.com/v1/{account}/{gateway}/{provider} URL automatically.
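The derivation described above can be sketched as a simple string template. This is an illustrative helper, not the actual Keeptrusts implementation; the function name is hypothetical:

```python
def derive_gateway_url(account_id: str, gateway_id: str, provider: str) -> str:
    """Build a Cloudflare AI Gateway base URL from its three components.

    Mirrors the https://gateway.ai.cloudflare.com/v1/{account}/{gateway}/{provider}
    pattern documented above.
    """
    return f"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider}"

print(derive_gateway_url("acct-123", "my-gateway", "workers-ai"))
# → https://gateway.ai.cloudflare.com/v1/acct-123/my-gateway/workers-ai
```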

Primary audience

  • Primary: AI Agents, Technical Engineers
  • Secondary: Technical Leaders

Prerequisites

  1. Cloudflare account with AI Gateway enabled — access via the Cloudflare dashboard.
  2. Cloudflare API token with AI Gateway permissions, and your Account ID and Gateway ID from the Cloudflare dashboard.
  3. Keeptrusts CLI — install kt (quickstart guide).
  4. Export your credentials:
export CLOUDFLARE_ACCOUNT_ID="your-cloudflare-account-id"
export CLOUDFLARE_GATEWAY_ID="your-gateway-id"
export CLOUDFLARE_API_TOKEN="your-cloudflare-api-token"

When cloudflare_account_id_env and cloudflare_gateway_id_env are set, Keeptrusts derives the full gateway URL automatically. You do not need to set base_url manually unless you want to override it.
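To sanity-check what the derived URL will look like before starting the gateway, you can expand the same pattern from your exported variables (a sketch; openai stands in for whichever provider sub-path you use):

```shell
export CLOUDFLARE_ACCOUNT_ID="your-cloudflare-account-id"
export CLOUDFLARE_GATEWAY_ID="your-gateway-id"
# Preview the URL Keeptrusts will derive for the openai sub-path
echo "https://gateway.ai.cloudflare.com/v1/${CLOUDFLARE_ACCOUNT_ID}/${CLOUDFLARE_GATEWAY_ID}/openai"
```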

Configuration

A complete policy-config.yaml that routes traffic through Cloudflare AI Gateway (OpenAI backend) with prompt-injection, PII, and safety policies:

pack:
  name: cloudflare-via-gateway
  version: 1.0.0
  enabled: true

policies:
  chain:
    - prompt-injection
    - pii-detector
    - safety-filter
    - audit-logger
  policy:
    prompt-injection:
      threshold: 0.8
      action: block
    pii-detector:
      action: redact
    safety-filter:
      mode: strict
      action: block
    audit-logger:
      retention_days: 365

providers:
  strategy: single
  targets:
    - id: cf-gateway-openai
      provider: cloudflare-gateway:openai:gpt-4o
      secret_key_ref:
        env: CLOUDFLARE_API_TOKEN

Start the gateway:

kt gateway run \
  --listen 0.0.0.0:41002 \
  --policy-config policy-config.yaml

Provider Shorthand Syntax

The provider field encodes both the gateway provider sub-path and the model:

cloudflare-gateway:<gateway-provider>:<model>

Examples:

# OpenAI via Cloudflare AI Gateway
provider: "cloudflare-gateway:openai:gpt-4o"

# Anthropic via Cloudflare AI Gateway
provider: "cloudflare-gateway:anthropic:claude-3-5-sonnet-20241022"

# Cloudflare Workers AI
provider: "cloudflare-gateway:workers-ai:@cf/meta/llama-3.3-70b-instruct-fp8-fast"

# Groq via Cloudflare AI Gateway
provider: "cloudflare-gateway:groq:llama-3.3-70b-versatile"
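The shorthand splits cleanly on its first two colons, which keeps slash-containing Workers AI model identifiers intact. A parsing sketch (parse_provider is a hypothetical helper, not a Keeptrusts API):

```python
def parse_provider(shorthand: str) -> tuple[str, str, str]:
    """Split "cloudflare-gateway:<gateway-provider>:<model>" into its parts.

    maxsplit=2 treats only the first two colons as separators, so model IDs
    like @cf/meta/llama-3.3-70b-instruct-fp8-fast survive unchanged.
    """
    scheme, gateway_provider, model = shorthand.split(":", 2)
    return scheme, gateway_provider, model

print(parse_provider("cloudflare-gateway:workers-ai:@cf/meta/llama-3.3-70b-instruct-fp8-fast"))
# → ('cloudflare-gateway', 'workers-ai', '@cf/meta/llama-3.3-70b-instruct-fp8-fast')
```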

Provider Fields

All fields available on a providers.targets[] entry for Cloudflare AI Gateway:

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| id | string | required | Unique identifier for this target. Used in logs, the console dashboard, and routing decisions. |
| provider | string | required | Provider ID in the form "cloudflare-gateway:<gateway-provider>:<model>". |
| cloudflare_account_id | string | none | Cloudflare account ID (literal value). Use cloudflare_account_id_env to reference an env var instead. Alias: accountId. |
| cloudflare_account_id_env | string | none | Environment variable holding the Cloudflare account ID. Alias: accountIdEnvar. |
| cloudflare_gateway_id | string | none | Cloudflare gateway ID (literal value). Use cloudflare_gateway_id_env to reference an env var instead. Alias: gatewayId. |
| cloudflare_gateway_id_env | string | none | Environment variable holding the Cloudflare gateway ID. Alias: gatewayIdEnvar. |
| secret_key_ref | object | none | Object reference to the environment variable holding the Cloudflare API token. Auth is optional for public or upstream-authenticated gateways. |
| base_url | string | auto-derived | Explicit base URL override. When set, takes precedence over the derived URL. Format: https://gateway.ai.cloudflare.com/v1/{account}/{gateway}/{provider}. |
| format | string | "openai" | Wire format. Cloudflare AI Gateway uses the OpenAI-compatible format for most backends. |
| timeout_seconds | integer | 60 | Maximum wall-clock time for non-streaming requests before the gateway returns a timeout error. |
| stream_timeout_seconds | integer | inherits timeout_seconds | Maximum wall-clock time for streaming requests. |
| description | string | none | Human-readable label shown in the console dashboard and health-check output. |
| weight | float | 1.0 | Routing weight used by the weighted_round_robin strategy. |
| health_probe | object | none | Active health probe configuration. Sub-fields: enabled (bool), interval_seconds (int), timeout_seconds (int). |
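Pulling the ID fields together, a target entry that references env vars and lets the URL be derived might look like this (a sketch; the id and description values are illustrative):

```yaml
- id: cf-gateway-openai
  provider: cloudflare-gateway:openai:gpt-4o
  cloudflare_account_id_env: CLOUDFLARE_ACCOUNT_ID
  cloudflare_gateway_id_env: CLOUDFLARE_GATEWAY_ID
  secret_key_ref:
    env: CLOUDFLARE_API_TOKEN
  description: OpenAI via Cloudflare AI Gateway
  # base_url omitted: derived automatically from the account and gateway IDs above
```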

Azure OpenAI via Cloudflare additionally supports:

| Field | Type | Description |
| --- | --- | --- |
| resource_name / resourceName | string | Azure OpenAI resource name for Cloudflare Azure gateway path derivation. |
| deployment_name / deploymentName | string | Azure OpenAI deployment name. |

Supported Models

Cloudflare AI Gateway supports models from multiple providers. The model identifier you use must match what the underlying backend expects:

| Gateway Provider | Example Models |
| --- | --- |
| openai | gpt-4o, gpt-4o-mini, o1-preview |
| anthropic | claude-3-5-sonnet-20241022, claude-3-haiku-20240307 |
| groq | llama-3.3-70b-versatile, mixtral-8x7b-32768 |
| mistral | mistral-large-latest, mistral-small-latest |
| workers-ai | @cf/meta/llama-3.3-70b-instruct-fp8-fast, @cf/mistral/mistral-7b-instruct-v0.1, @cf/google/gemma-7b-it |
| perplexity-ai | llama-3.1-sonar-large-128k-online |
| cohere | command-r-plus |
| google-ai-studio | gemini-2.0-flash, gemini-1.5-pro |

See the Cloudflare AI Gateway documentation for the full list of supported providers and model identifiers.

Client Examples

Once the gateway is running, point your client SDK at http://localhost:41002 (the --listen port from the start command) instead of the Cloudflare gateway URL. The standard OpenAI SDK works directly for all backends that use the OpenAI wire format.

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:41002/v1",
    api_key="unused",  # auth is handled by Keeptrusts via CLOUDFLARE_API_TOKEN
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain how Cloudflare AI Gateway caching works."},
    ],
    temperature=0.7,
    max_tokens=512,
)

print(response.choices[0].message.content)

Streaming

Keeptrusts fully supports streaming for all Cloudflare AI Gateway backends. Set stream: true in your request — the gateway applies policies to each chunk in real time.

pack:
  name: cloudflare-gateway-providers-3
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: cf-gateway-streaming
      provider: cloudflare-gateway:openai:gpt-4o
      secret_key_ref:
        env: CLOUDFLARE_API_TOKEN

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true

The corresponding streaming client:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:41002/v1", api_key="unused")

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize Cloudflare's approach to AI safety."}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Advanced Configuration

Workers AI

Route to Cloudflare Workers AI models hosted at Cloudflare's edge. For workers-ai, Keeptrusts derives the URL as .../workers-ai/<model> and passes the full @cf/... model identifier:

pack:
  name: cloudflare-gateway-providers-4
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: cf-workers-ai-llama
      provider: cloudflare-gateway:workers-ai:@cf/meta/llama-3.3-70b-instruct-fp8-fast
      secret_key_ref:
        env: CLOUDFLARE_API_TOKEN

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true

Azure OpenAI via Cloudflare

For Azure OpenAI backends, set resource_name and deployment_name so Keeptrusts can derive the Azure-specific Cloudflare gateway path:

pack:
  name: cloudflare-gateway-providers-5
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: cf-gateway-azure
      provider: cloudflare-gateway:azure-openai:gpt-4o
      resource_name: your-azure-resource-name
      deployment_name: your-gpt-4o-deployment
      secret_key_ref:
        env: CLOUDFLARE_API_TOKEN

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true

Multi-Backend Fallback

Fall back from Cloudflare Gateway to a direct provider if the gateway is unavailable:

pack:
  name: cloudflare-gateway-providers-6
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: cf-gateway-primary
      provider: cloudflare-gateway:openai:gpt-4o
      secret_key_ref:
        env: CLOUDFLARE_API_TOKEN
    - id: openai-direct-fallback
      provider: openai:chat:gpt-4o
      secret_key_ref:
        env: OPENAI_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true

Best Practices

  • Use env vars for account and gateway IDs — set cloudflare_account_id_env and cloudflare_gateway_id_env rather than hardcoding values in policy-config.yaml. This keeps credentials out of source control.
  • Let Keeptrusts derive the URL — omit base_url and let the gateway construct the correct gateway URL from account and gateway IDs. Only set base_url explicitly if you need to override the standard Cloudflare gateway hostname.
  • Layer Keeptrusts policies with Cloudflare analytics — Cloudflare AI Gateway provides request logging and analytics at the edge; Keeptrusts adds policy enforcement and compliance audit trails at the application layer. Using both gives you defense in depth.
  • Configure Workers AI model IDs precisely — Workers AI model identifiers use the @cf/ prefix format (e.g. @cf/meta/llama-3.3-70b-instruct-fp8-fast). The exact identifier must match what Cloudflare's Workers AI API expects.
  • Set stream_timeout_seconds for large models — Workers AI models and large upstream models accessed via the Cloudflare gateway can have higher first-token latencies at peak load. Set stream_timeout_seconds to at least 180.
  • Test backend availability independently — Cloudflare AI Gateway's health is independent of the underlying model providers it routes to. Use health_probe and a fallback strategy to handle upstream model unavailability gracefully.
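The last two practices can be combined on a single target entry. A sketch, with illustrative probe intervals:

```yaml
- id: cf-gateway-primary
  provider: cloudflare-gateway:openai:gpt-4o
  stream_timeout_seconds: 180   # headroom for slow first tokens at peak load
  health_probe:
    enabled: true
    interval_seconds: 30
    timeout_seconds: 5
  secret_key_ref:
    env: CLOUDFLARE_API_TOKEN
```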

For AI systems

  • Canonical terms: Keeptrusts gateway, Cloudflare AI Gateway, Workers AI, edge inference, provider target, policy-config.yaml.
  • Config field names: provider, model, base_url, secret_key_ref.env, format: "openai", cloudflare_account_id, cloudflare_gateway_id, stream_timeout_seconds.
  • Key behavior: Keeptrusts sits in front of Cloudflare AI Gateway, adding policy enforcement to Cloudflare's caching, rate limiting, and analytics.
  • Best next pages: OpenRouter integration (alternative aggregator), Provider routing, Policy configuration.

For engineers

  • Prerequisites: Cloudflare account with AI Gateway enabled, account ID and gateway ID, API token with AI Gateway permissions, kt CLI installed.
  • Start command: kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml.
  • Set stream_timeout_seconds to at least 180 — Workers AI and upstream models via Cloudflare can have variable first-token latencies at peak load.
  • Cloudflare AI Gateway health is independent of upstream model providers — use health_probe and fallback strategy to handle upstream model unavailability.
  • Validate: curl http://localhost:41002/v1/chat/completions -H 'Content-Type: application/json' -d '{"model":"@cf/meta/llama-3-8b-instruct","messages":[{"role":"user","content":"hello"}]}'.

For leaders

  • Cloudflare AI Gateway provides edge caching and built-in rate limiting — Keeptrusts adds policy enforcement and audit logging on top.
  • Running Keeptrusts in front of Cloudflare gives you vendor-independent policy controls that persist even if you switch edge providers.
  • Cloudflare's global edge network reduces latency for geographically distributed users; Keeptrusts policies execute before traffic reaches Cloudflare.
  • Monitor both Cloudflare analytics and Keeptrusts events dashboard to get complete visibility into request flow and policy decisions.

Next steps