Chat API Integration Guide

Beyond the interactive Chat Workbench, Keeptrusts provides API endpoints for programmatic chat interactions. This enables you to embed governed AI chat directly into your applications, automate testing workflows, and build custom chat interfaces — all with the same policy enforcement as the Chat Workbench.

Use this page when

  • You are embedding governed AI chat into a custom application using the Keeptrusts API.
  • You need to manage conversations programmatically (create, list, retrieve history).
  • You are building automated testing workflows against the chat completions endpoint.
  • You want to configure webhooks for real-time chat event notifications.

Primary audience

  • Primary: AI Engineers integrating chat into applications, Backend Developers building custom chat UIs
  • Secondary: QA Engineers automating chat test scenarios, Platform Engineers configuring webhooks

Authentication for Chat API

Gateway Keys

Programmatic chat uses gateway keys (kt_gk_...) for authentication. These keys route requests through the Keeptrusts gateway with full policy enforcement.

Creating a Gateway Key

  1. Navigate to Settings → Gateway Keys in the console.
  2. Click Create Gateway Key.
  3. Configure the key scope (user, team, or organization).
  4. Set an expiration period.
  5. Copy the generated key — it is shown only once.

Using a Gateway Key

Include the gateway key in your API requests:

curl -X POST "$GATEWAY_URL/v1/chat/completions" \
  -H "Authorization: Bearer kt_gk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "What are our data retention policies?"}
    ]
  }'

API Bearer Tokens

For management operations (creating conversations, listing history), use standard API bearer tokens:

curl "$API_URL/v1/chat/conversations" \
  -H "Authorization: Bearer $API_TOKEN"

Chat Completions Endpoint

The primary chat endpoint follows the OpenAI-compatible format, making it easy to integrate with existing tools and libraries.

Request Format

POST $GATEWAY_URL/v1/chat/completions

{
  "model": "gpt-4o",
  "messages": [
    {"role": "system", "content": "You are a helpful compliance assistant."},
    {"role": "user", "content": "Summarize the latest audit findings."}
  ],
  "temperature": 0.7,
  "max_tokens": 1000,
  "stream": false
}

Response Format

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1745395200,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Based on the latest audit report..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 45,
    "completion_tokens": 280,
    "total_tokens": 325
  }
}

Streaming Responses

For streaming responses, set "stream": true:

curl -X POST "$GATEWAY_URL/v1/chat/completions" \
  -H "Authorization: Bearer kt_gk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Explain our security policy."}],
    "stream": true
  }'

Streaming returns Server-Sent Events (SSE) with incremental content chunks.
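The incremental chunks can be stitched back together on the client. A minimal sketch, assuming OpenAI-style `data:` event lines carrying content deltas and a terminating `data: [DONE]` sentinel; the helper name and sample chunks below are illustrative, not part of the product:

```python
import json

def accumulate_sse_content(lines):
    """Collect assistant text from OpenAI-style SSE chunk lines.

    Each event line looks like 'data: {...}' with incremental text in
    choices[0].delta.content; the stream ends with 'data: [DONE]'.
    """
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data: "):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)

# Hypothetical chunks, shaped like those the gateway might emit:
stream = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Our security "}}]}',
    'data: {"choices": [{"delta": {"content": "policy requires..."}}]}',
    "data: [DONE]",
]
print(accumulate_sse_content(stream))  # → Our security policy requires...
```

In a real client you would read these lines from the HTTP response body as they arrive rather than from a list.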

Conversation Management

Creating a Conversation

POST $API_URL/v1/chat/conversations

{
  "title": "Compliance Review Q2 2026",
  "model": "gpt-4o"
}

Listing Conversations

GET $API_URL/v1/chat/conversations

Returns a paginated list of conversations accessible to the authenticated user.

Retrieving Conversation History

GET $API_URL/v1/chat/conversations/{conversation_id}/messages

Returns all messages in the conversation, including metadata about policy evaluations and citations.

Deleting a Conversation

DELETE $API_URL/v1/chat/conversations/{conversation_id}

Removes the conversation from the user's history. Decision events in the audit trail are retained according to the configured retention policy.
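The four endpoints above can be wrapped in a thin client. This is a sketch assuming standard bearer-token auth and JSON bodies; the class name, base URL, and the injectable `opener` hook (used here so the class can be exercised without a network) are illustrative, not part of an official SDK:

```python
import json
import urllib.request

class ConversationClient:
    """Minimal wrapper over the conversation-management endpoints."""

    def __init__(self, api_url, api_token, opener=urllib.request.urlopen):
        self.api_url = api_url.rstrip("/")
        self.token = api_token
        self._open = opener  # injectable for offline testing

    def _request(self, method, path, body=None):
        data = json.dumps(body).encode() if body is not None else None
        req = urllib.request.Request(
            self.api_url + path,
            data=data,
            method=method,
            headers={
                "Authorization": f"Bearer {self.token}",
                "Content-Type": "application/json",
            },
        )
        with self._open(req) as resp:
            return json.load(resp)

    def create(self, title, model):
        return self._request("POST", "/v1/chat/conversations",
                             {"title": title, "model": model})

    def list(self):
        return self._request("GET", "/v1/chat/conversations")

    def history(self, conversation_id):
        return self._request(
            "GET", f"/v1/chat/conversations/{conversation_id}/messages")

    def delete(self, conversation_id):
        return self._request(
            "DELETE", f"/v1/chat/conversations/{conversation_id}")
```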

Embedding Chat in Applications

JavaScript / TypeScript

Use the OpenAI-compatible SDK with the Keeptrusts gateway URL:

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'kt_gk_your_key_here',
  baseURL: 'https://gateway.example.com/v1',
});

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [
    { role: 'user', content: 'What are the compliance requirements for Q2?' },
  ],
});

console.log(response.choices[0].message.content);

Python

from openai import OpenAI

client = OpenAI(
    api_key="kt_gk_your_key_here",
    base_url="https://gateway.example.com/v1",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Summarize the latest risk assessment."}
    ],
)

print(response.choices[0].message.content)

cURL

curl -X POST "https://gateway.example.com/v1/chat/completions" \
  -H "Authorization: Bearer kt_gk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "List our active policies."}]
  }'

Policy Enforcement in API Calls

API chat requests receive the same governance as the Chat Workbench:

  • Input policies evaluate the prompt before forwarding to the LLM.
  • Output policies evaluate the response before returning it to the caller.
  • Blocked requests return HTTP 409 with a policy explanation.
  • Escalated requests may be held pending moderator review.

Handling Policy Blocks

When a request is blocked, the API returns:

{
  "error": {
    "code": "policy_blocked",
    "message": "Request blocked by policy 'pii-detection': Prompt contains personally identifiable information.",
    "policy": "pii-detection"
  }
}

Handle this in your application by catching the 409 status code:

try {
  const response = await client.chat.completions.create({ /* ... */ });
} catch (error) {
  if (error.status === 409) {
    console.log('Blocked by policy:', error.message);
    // Show a user-friendly message or retry with a modified prompt
  }
}
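The same 409 body can be inspected in Python. A small helper, assuming the `policy_blocked` error shape shown above; the function name is illustrative:

```python
import json

def parse_policy_error(body: str):
    """Return (code, policy, message) for a policy_blocked error payload,
    or None when the body is not a policy block."""
    err = json.loads(body).get("error", {})
    if err.get("code") != "policy_blocked":
        return None
    return err["code"], err.get("policy"), err.get("message", "")

blocked = ('{"error": {"code": "policy_blocked", '
           '"message": "Request blocked by policy \'pii-detection\'.", '
           '"policy": "pii-detection"}}')
print(parse_policy_error(blocked)[1])  # → pii-detection
```

Because policy blocks are deterministic, do not feed the same prompt back into a retry loop; surface the policy message to the user instead.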

Rate Limiting and Quotas

API chat requests respect the same rate limits and wallet budgets as interactive chat:

  • Token budgets: Requests that would exceed the wallet balance are rejected.
  • Rate limits: Per-user and per-team rate limits apply.
  • Concurrent requests: The gateway limits concurrent requests per key.

Check rate limit headers in the response:

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1745395260
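A retry loop for rate-limited requests is typically exponential backoff with jitter. A minimal sketch; the parameter defaults are illustrative, not gateway-mandated:

```python
import random

def backoff_delays(max_retries=5, base=1.0, cap=30.0, jitter=None):
    """Yield wait times (seconds) for retrying rate-limited requests:
    exponential growth up to a cap, plus optional random jitter."""
    jitter = jitter if jitter is not None else random.random
    for attempt in range(max_retries):
        delay = min(cap, base * (2 ** attempt))
        yield delay + jitter() * delay * 0.1  # up to 10% jitter

# Deterministic illustration (jitter disabled):
print(list(backoff_delays(jitter=lambda: 0.0)))  # → [1.0, 2.0, 4.0, 8.0, 16.0]
```

When the `X-RateLimit-Reset` header is present, sleeping until that epoch timestamp is usually better than a blind backoff.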

Webhook Notifications

Configure webhooks to receive notifications about chat events:

  1. Navigate to Settings → Webhooks in the console.
  2. Create a webhook endpoint for chat events.
  3. Select event types: chat.message, chat.blocked, chat.escalated.
  4. Provide your webhook URL and secret.

Webhook payloads include the same metadata as decision events.
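Webhook receivers should verify the signature before trusting a payload. A common pattern is an HMAC-SHA256 hex digest over the raw request body using the configured secret; the exact header name and signing scheme here are assumptions, so confirm the format in the webhook settings before relying on this:

```python
import hashlib
import hmac

def verify_webhook(secret: str, payload: bytes, signature: str) -> bool:
    """Check an HMAC-SHA256 hex signature over the raw request body.

    Assumption: the signature is hex-encoded HMAC-SHA256 of the body;
    verify the actual scheme against your webhook configuration.
    """
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids timing side channels.
    return hmac.compare_digest(expected, signature)

body = b'{"event": "chat.blocked", "conversation_id": "conv_123"}'
sig = hmac.new(b"whsec_demo", body, hashlib.sha256).hexdigest()
print(verify_webhook("whsec_demo", body, sig))  # → True
```

Always compute the digest over the raw bytes as received; re-serializing the JSON can change whitespace and break verification.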

Best Practices

| Practice | Why It Matters |
| --- | --- |
| Use gateway keys, not bearer tokens, for chat | Gateway keys enforce policy; bearer tokens bypass the gateway |
| Handle 409 blocks gracefully | Provides a good user experience when policies intervene |
| Implement exponential backoff | Handles rate limiting and provider errors gracefully |
| Rotate gateway keys regularly | Limits exposure if a key is compromised |
| Log conversation IDs | Enables correlation between your app and the Keeptrusts audit trail |
| Use streaming for interactive UIs | Improves perceived responsiveness |

Next steps

For AI systems

  • Canonical terms: chat API, gateway key, chat completions endpoint, conversation management, webhook events, OpenAI-compatible format.
  • Endpoints: POST $GATEWAY_URL/v1/chat/completions, GET /v1/chat/conversations, POST /v1/chat/conversations. Webhook events: chat.message, chat.blocked, chat.escalated.
  • Key prefix: kt_gk_... for gateway keys. Authentication: Bearer token in Authorization header.
  • Best next pages: Customizing the Chat Experience, Advanced Chat Patterns, Chat Export.

For engineers

  • Use gateway keys (kt_gk_...) for chat traffic and standard API bearer tokens for management operations (list/create conversations).
  • The chat completions endpoint follows the OpenAI format — any OpenAI-compatible SDK works with a base_url change.
  • Handle 409 responses gracefully — policy blocks are deterministic; do not retry the same input.
  • Implement exponential backoff for 429 rate limit responses.
  • Configure webhooks in Settings → Webhooks to receive real-time notifications for chat.blocked and chat.escalated events.
  • Rotate gateway keys every 30-90 days and scope them to the minimum required models.

For leaders

  • Programmatic chat integration brings the same governance guarantees as the interactive Chat Workbench to custom applications.
  • Gateway keys provide granular control over which models and policies each application can access.
  • Webhook integration enables real-time alerting on policy violations without polling.
  • API-based chat creates the same audit trail as interactive chat — no governance gap between channels.