QuiverAI
QuiverAI provides an OpenAI-compatible LLM gateway, enabling organizations to route requests through a managed AI infrastructure layer. Keeptrusts can sit in front of any QuiverAI endpoint and apply its full policy engine — prompt-injection detection, PII redaction, content safety filters, and audit logging — on every request and response.
Use this page when
- You need the exact command, config, API, or integration details for QuiverAI.
- You are wiring automation or AI retrieval and need canonical names, examples, and constraints.
- For a guided rollout instead of a reference page, use the linked workflow pages in Next steps.
Because QuiverAI exposes an OpenAI-compatible API, no format translation is required. Any OpenAI SDK client pointed at the Keeptrusts gateway will work without code changes.
Primary audience
- Primary: AI Agents, Technical Engineers
- Secondary: Technical Leaders
Prerequisites
- QuiverAI API key and endpoint URL — obtain these from your QuiverAI account or your organization's QuiverAI administrator.
- Keeptrusts CLI — install `kt` (see the quickstart guide).
- Export your API key so the gateway can read it at startup:

```bash
export QUIVERAI_API_KEY="your-quiverai-api-key"
```
The `base_url` field must point at your QuiverAI endpoint. If your deployment uses the default QuiverAI cloud, set `base_url` to your organization's assigned gateway URL.
Configuration
A complete `policy-config.yaml` that routes traffic through QuiverAI with prompt-injection, PII, and safety policies:
```yaml
pack:
  name: quiverai-gateway
  version: 1.0.0
  enabled: true

policies:
  chain:
    - prompt-injection
    - pii-detector
    - safety-filter
    - audit-logger
  policy:
    prompt-injection:
      threshold: 0.8
      action: block
    pii-detector:
      action: redact
    safety-filter:
      mode: strict
      action: block
    audit-logger:
      retention_days: 365

providers:
  strategy: single
  targets:
    - id: quiverai-primary
      provider: quiverai
      model: gpt-4o
      base_url: https://your-quiverai-endpoint/v1
      secret_key_ref:
        env: QUIVERAI_API_KEY
```
Start the gateway, listening on the same port the client examples below use:

```bash
kt gateway run \
  --listen 0.0.0.0:8080 \
  --policy-config policy-config.yaml
```
Compact Provider Shorthand
You can encode the model directly in the `provider` field. The two forms below are equivalent:

```yaml
# Shorthand — model embedded in the provider string
- id: "quiverai-primary"
  provider: "quiverai:chat:gpt-4o"
  base_url: "https://your-quiverai-endpoint/v1"

# Explicit — separate provider and model fields
- id: "quiverai-primary"
  provider: "quiverai"
  model: "gpt-4o"
  base_url: "https://your-quiverai-endpoint/v1"
```
Provider Fields
All fields available on a `providers.targets[]` entry for QuiverAI:
| Field | Type | Default | Description |
|---|---|---|---|
| `id` | string | required | Unique identifier for this target. Used in logs, the console dashboard, and routing decisions. |
| `provider` | string | required | Provider ID. Use `"quiverai"` or the shorthand `"quiverai:chat:<model>"`. |
| `model` | string | required unless embedded in the `provider` shorthand | Model name as supported by your QuiverAI deployment, e.g. `"gpt-4o"`. Passed through to the upstream as-is. |
| `base_url` | string | required | URL to your QuiverAI gateway endpoint, e.g. `https://your-quiverai-endpoint/v1`. |
| `secret_key_ref` | object | `env: QUIVERAI_API_KEY` | Reference to the environment variable holding the QuiverAI API key. |
| `format` | string | `"openai"` | Wire format. QuiverAI exposes an OpenAI-compatible API. |
| `timeout_seconds` | integer | `60` | Maximum wall-clock time for non-streaming requests before the gateway returns a timeout error. |
| `stream_timeout_seconds` | integer | inherits `timeout_seconds` | Maximum wall-clock time for streaming requests. Set higher for long-running streamed generations. |
| `max_context_tokens` | integer | none | Maximum token budget for the request. When set, the gateway rejects requests that exceed this limit before forwarding upstream. |
| `description` | string | none | Human-readable label shown in the console dashboard and health-check output. |
| `weight` | float | `1.0` | Routing weight used by the `weighted_round_robin` strategy. |
| `health_probe` | object | none | Active health probe configuration. Sub-fields: `enabled` (bool), `interval_seconds` (int), `timeout_seconds` (int). |
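As an illustration, a target that sets the optional tuning fields might look like the sketch below. The field names come from the table above; the specific values are placeholders, not recommendations.

```yaml
- id: quiverai-primary
  provider: quiverai
  model: gpt-4o
  base_url: https://your-quiverai-endpoint/v1
  secret_key_ref:
    env: QUIVERAI_API_KEY
  format: openai               # default; shown for completeness
  timeout_seconds: 60          # non-streaming request budget
  stream_timeout_seconds: 300  # placeholder: longer budget for streamed generations
  max_context_tokens: 128000   # placeholder: reject oversized requests before forwarding
  description: "Primary QuiverAI target"
  weight: 1.0                  # only used by the weighted_round_robin strategy
  health_probe:
    enabled: true
    interval_seconds: 30       # placeholder probe cadence
    timeout_seconds: 5         # placeholder probe timeout
```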
Supported Models
The models available depend on your QuiverAI deployment. Because QuiverAI acts as a gateway, it exposes whichever models its configured upstream providers offer — common examples include OpenAI models (`gpt-4o`, `gpt-4o-mini`), Anthropic models, and open-weight models, depending on your plan.
Contact your QuiverAI administrator or check your QuiverAI dashboard for the exact model identifiers available in your deployment.
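If your deployment implements the standard OpenAI-compatible model list endpoint and accepts bearer authentication — both assumptions to confirm with your administrator — you can enumerate the exact identifiers directly:

```bash
# Assumes your QuiverAI gateway supports the OpenAI-compatible /v1/models
# endpoint and standard bearer auth; adjust for your deployment.
curl https://your-quiverai-endpoint/v1/models \
  -H "Authorization: Bearer $QUIVERAI_API_KEY"
```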
The gateway passes the `model` field through to the upstream endpoint as-is. Use the exact model identifier string that your QuiverAI deployment expects.
Client Examples
Once the gateway is running, point your client SDK at `http://localhost:8080` instead of your QuiverAI endpoint URL. The standard OpenAI SDK works directly.
Python

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="unused",  # auth is handled by Keeptrusts via QUIVERAI_API_KEY
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What are the main pillars of a strong AI governance framework?"},
    ],
    temperature=0.7,
    max_tokens=512,
)

print(response.choices[0].message.content)
```
Node.js

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:8080/v1",
  apiKey: "unused", // auth handled by Keeptrusts via QUIVERAI_API_KEY
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What are the main pillars of a strong AI governance framework?" },
  ],
  temperature: 0.7,
  max_tokens: 512,
});

console.log(response.choices[0].message.content);
```
cURL

```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What are the main pillars of a strong AI governance framework?"}
    ],
    "temperature": 0.7,
    "max_tokens": 512
  }'
```
Streaming
Keeptrusts fully supports streaming for QuiverAI. Set `stream: true` in your request — the gateway applies policies to each chunk in real time, including content filtering and PII redaction on partial tokens.
```yaml
pack:
  name: quiverai-providers-3
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: quiverai-streaming
      provider: quiverai
      model: gpt-4o
      base_url: https://your-quiverai-endpoint/v1

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
```
Python

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Describe best practices for responsible AI deployment."}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```
Node.js

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:8080/v1",
  apiKey: "unused",
});

const stream = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Describe best practices for responsible AI deployment." }],
  stream: true,
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
```
cURL

```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Describe best practices for responsible AI deployment."}],
    "stream": true
  }'
```
Advanced Configuration
Fallback to Direct Provider
Route to a direct upstream provider if the QuiverAI gateway is unavailable:
```yaml
pack:
  name: quiverai-providers-4
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: quiverai-primary
      provider: quiverai:chat:gpt-4o
      base_url: https://your-quiverai-endpoint/v1
      secret_key_ref:
        env: QUIVERAI_API_KEY
    - id: openai-fallback
      provider: openai:chat:gpt-4o
      secret_key_ref:
        env: OPENAI_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
```
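For failover to trigger automatically, the primary target typically needs an active health probe so the gateway can detect that QuiverAI is unreachable. A minimal sketch of the primary target, using the `health_probe` sub-fields from the Provider Fields table (interval and timeout values are illustrative):

```yaml
- id: quiverai-primary
  provider: quiverai:chat:gpt-4o
  base_url: https://your-quiverai-endpoint/v1
  secret_key_ref:
    env: QUIVERAI_API_KEY
  health_probe:
    enabled: true
    interval_seconds: 30   # illustrative probe cadence
    timeout_seconds: 5     # illustrative probe timeout
```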
Multiple QuiverAI Environments
If your organization has multiple QuiverAI environments (e.g. staging and production), define a separate target for each. The staging entry below is illustrative — substitute your own IDs, URLs, and key variables:
```yaml
pack:
  name: quiverai-providers-5
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: quiverai-production
      provider: quiverai:chat:gpt-4o
      base_url: https://prod.your-quiverai-endpoint/v1
      secret_key_ref:
        env: QUIVERAI_PROD_API_KEY
    - id: quiverai-staging   # illustrative second target mirroring production
      provider: quiverai:chat:gpt-4o
      base_url: https://staging.your-quiverai-endpoint/v1
      secret_key_ref:
        env: QUIVERAI_STAGING_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
```
Best Practices
- Always specify `base_url` — QuiverAI endpoints are organization-specific. There is no global default; always configure the full endpoint URL for your deployment.
- Use environment variables for credentials — keep API keys out of `policy-config.yaml`. Set `secret_key_ref` to a variable name and manage secrets via your environment or secrets manager.
- Enable `health_probe` — if your QuiverAI gateway has planned maintenance windows, health probing lets Keeptrusts detect unavailability and activate fallback routing automatically.
- Layer PII redaction — even when routing through a managed gateway like QuiverAI, enabling the `pii-detector` policy ensures sensitive data is scrubbed before it leaves your application layer.
- Match model identifiers exactly — QuiverAI may expose models under different identifiers than their native names. Confirm the exact model ID strings with your QuiverAI administrator before configuring the `model` field.
For AI systems
- Canonical terms: Keeptrusts gateway, QuiverAI, managed AI gateway, enterprise deployment, provider target, policy-config.yaml.
- Config field names: `provider`, `model`, `base_url`, `secret_key_ref.env`, `format: "openai"`, `timeout_seconds`, `health_probe`.
- Key behavior: Keeptrusts routes to QuiverAI's managed gateway, adding policy enforcement on top of QuiverAI's model routing.
- Best next pages: OpenRouter integration, Cloudflare AI Gateway integration, Policy configuration.
For engineers
- Prerequisites: QuiverAI account with API credentials, model endpoint configured, `kt` CLI installed.
- Start command: `kt gateway run --listen 0.0.0.0:8080 --policy-config policy-config.yaml`.
- Validate: `curl http://localhost:8080/v1/chat/completions -H 'Content-Type: application/json' -d '{"model":"your-quiverai-model","messages":[{"role":"user","content":"hello"}]}'`.
- Match model identifiers exactly — QuiverAI may expose models under different IDs than their native names. Confirm with your QuiverAI administrator.
- Layer the `pii-detector` policy even when routing through a managed gateway — defense-in-depth redaction before data leaves your application layer.
For leaders
- QuiverAI provides managed model routing — Keeptrusts adds an independent policy enforcement layer you control.
- Vendor-independent governance: Keeptrusts policies persist even if you switch from QuiverAI to another gateway or direct provider.
- PII redaction before requests reach QuiverAI ensures sensitive data is controlled at your boundary, not the vendor's.
- Dual-gateway architecture (Keeptrusts + QuiverAI) provides separation of concerns: governance vs routing/serving.
Next steps
- OpenRouter integration — alternative multi-provider aggregation
- Cloudflare AI Gateway integration — alternative managed gateway with edge caching
- Provider routing strategies — Keeptrusts-native routing without a second gateway
- Policy configuration — PII redaction and prompt-injection reference
- Quickstart — install `kt` and run your first gateway