Chat API Integration Guide
Beyond the interactive Chat Workbench, Keeptrusts provides API endpoints for programmatic chat interactions. This enables you to embed governed AI chat directly into your applications, automate testing workflows, and build custom chat interfaces — all with the same policy enforcement as the Chat Workbench.
Use this page when
- You are embedding governed AI chat into a custom application using the Keeptrusts API.
- You need to manage conversations programmatically (create, list, retrieve history).
- You are building automated testing workflows against the chat completions endpoint.
- You want to configure webhooks for real-time chat event notifications.
Primary audience
- Primary: AI Engineers integrating chat into applications, Backend Developers building custom chat UIs
- Secondary: QA Engineers automating chat test scenarios, Platform Engineers configuring webhooks
Authentication for Chat API
Gateway Keys
Programmatic chat uses gateway keys (kt_gk_...) for authentication. These keys route requests through the Keeptrusts gateway with full policy enforcement.
Creating a Gateway Key
- Navigate to Settings → Gateway Keys in the console.
- Click Create Gateway Key.
- Configure the key scope (user, team, or organization).
- Set an expiration period.
- Copy the generated key — it is shown only once.
Using a Gateway Key
Include the gateway key in your API requests:
curl -X POST "$GATEWAY_URL/v1/chat/completions" \
-H "Authorization: Bearer kt_gk_your_key_here" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"messages": [
{"role": "user", "content": "What are our data retention policies?"}
]
}'
API Bearer Tokens
For management operations (creating conversations, listing history), use standard API bearer tokens:
curl "$API_URL/v1/chat/conversations" \
-H "Authorization: Bearer $API_TOKEN"
Chat Completions Endpoint
The primary chat endpoint follows the OpenAI-compatible format, making it easy to integrate with existing tools and libraries.
Request Format
POST $GATEWAY_URL/v1/chat/completions
{
"model": "gpt-4o",
"messages": [
{"role": "system", "content": "You are a helpful compliance assistant."},
{"role": "user", "content": "Summarize the latest audit findings."}
],
"temperature": 0.7,
"max_tokens": 1000,
"stream": false
}
Response Format
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1745395200,
"model": "gpt-4o",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Based on the latest audit report..."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 45,
"completion_tokens": 280,
"total_tokens": 325
}
}
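A minimal sketch of extracting the fields most applications need from this response body, shown in Python. The structure follows the OpenAI-compatible schema above; the sample payload is the one from this section.

```python
# Extract the assistant reply and token usage from an OpenAI-compatible
# chat completion response, parsed from a raw JSON string.
import json

def parse_completion(raw: str) -> tuple[str, int]:
    """Return (assistant_content, total_tokens) from a completion body."""
    body = json.loads(raw)
    content = body["choices"][0]["message"]["content"]
    total_tokens = body["usage"]["total_tokens"]
    return content, total_tokens

# Sample payload matching the response format documented above.
sample = json.dumps({
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant",
                    "content": "Based on the latest audit report..."},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 45, "completion_tokens": 280,
              "total_tokens": 325},
})
content, total = parse_completion(sample)
```

Tracking `total_tokens` per request is also useful for reconciling consumption against wallet budgets (see Rate Limiting and Quotas below).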
Streaming Responses
For streaming responses, set "stream": true:
curl -X POST "$GATEWAY_URL/v1/chat/completions" \
-H "Authorization: Bearer kt_gk_your_key_here" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"messages": [{"role": "user", "content": "Explain our security policy."}],
"stream": true
}'
Streaming returns Server-Sent Events (SSE) with incremental content chunks.
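A sketch of consuming that SSE stream, assuming the standard OpenAI-compatible chunk shape (incremental text arrives in `choices[0].delta.content`, and a literal `data: [DONE]` line terminates the stream):

```python
# Parse Server-Sent Events lines from a streaming chat completion into
# incremental content fragments. Assumes OpenAI-compatible chunks: each
# "data:" line carries a JSON object with choices[0].delta.
import json

def iter_content(sse_lines):
    """Yield content fragments from an SSE stream of completion chunks."""
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # sentinel marking the end of the stream
            break
        delta = json.loads(payload)["choices"][0].get("delta", {})
        if "content" in delta:
            yield delta["content"]

# Hypothetical stream fragment for illustration:
stream = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    'data: {"choices":[{"delta":{"content":"Our security"}}]}',
    'data: {"choices":[{"delta":{"content":" policy requires..."}}]}',
    "data: [DONE]",
]
text = "".join(iter_content(stream))
```

In practice, OpenAI-compatible SDKs do this parsing for you when you pass `stream=True`; the sketch shows what arrives on the wire.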
Conversation Management
Creating a Conversation
POST $API_URL/v1/chat/conversations
{
"title": "Compliance Review Q2 2026",
"model": "gpt-4o"
}
Listing Conversations
GET $API_URL/v1/chat/conversations
Returns a paginated list of conversations accessible to the authenticated user.
Retrieving Conversation History
GET $API_URL/v1/chat/conversations/{conversation_id}/messages
Returns all messages in the conversation, including metadata about policy evaluations and citations.
Deleting a Conversation
DELETE $API_URL/v1/chat/conversations/{conversation_id}
Removes the conversation from the user's history. Decision events in the audit trail are retained according to the configured retention policy.
Embedding Chat in Applications
JavaScript / TypeScript
Use the OpenAI-compatible SDK with the Keeptrusts gateway URL:
import OpenAI from 'openai';
const client = new OpenAI({
apiKey: 'kt_gk_your_key_here',
baseURL: 'https://gateway.example.com/v1',
});
const response = await client.chat.completions.create({
model: 'gpt-4o',
messages: [
{ role: 'user', content: 'What are the compliance requirements for Q2?' },
],
});
console.log(response.choices[0].message.content);
Python
from openai import OpenAI
client = OpenAI(
api_key="kt_gk_your_key_here",
base_url="https://gateway.example.com/v1",
)
response = client.chat.completions.create(
model="gpt-4o",
messages=[
{"role": "user", "content": "Summarize the latest risk assessment."}
],
)
print(response.choices[0].message.content)
cURL
curl -X POST "https://gateway.example.com/v1/chat/completions" \
-H "Authorization: Bearer kt_gk_your_key_here" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"messages": [{"role": "user", "content": "List our active policies."}]
}'
Policy Enforcement in API Calls
API chat requests receive the same governance as the Chat Workbench:
- Input policies evaluate the prompt before forwarding to the LLM.
- Output policies evaluate the response before returning it to the caller.
- Blocked requests return HTTP 409 with a policy explanation.
- Escalated requests may be held pending moderator review.
Handling Policy Blocks
When a request is blocked, the API returns:
{
"error": {
"code": "policy_blocked",
"message": "Request blocked by policy 'pii-detection': Prompt contains personal identifiable information.",
"policy": "pii-detection"
}
}
Handle this in your application by catching the 409 status code:
try {
const response = await client.chat.completions.create({ /* ... */ });
} catch (error) {
if (error.status === 409) {
console.log('Blocked by policy:', error.message);
// Show user-friendly message or retry with modified prompt
}
}
Rate Limiting and Quotas
API chat requests respect the same rate limits and wallet budgets as interactive chat:
- Token budgets: Requests that would exceed the wallet balance are rejected.
- Rate limits: Per-user and per-team rate limits apply.
- Concurrent requests: The gateway limits concurrent requests per key.
Check rate limit headers in the response:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1745395260
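When `X-RateLimit-Remaining` reaches zero or a request returns a rate-limit error, back off before retrying. A sketch of exponential backoff with full jitter (a common pattern; the base and cap values here are illustrative, not Keeptrusts defaults):

```python
# Exponential backoff with full jitter for retrying rate-limited requests.
# The delay doubles each attempt up to a cap; random jitter spreads
# retries from many clients so they do not synchronize.
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0,
                  jitter: bool = True) -> float:
    """Seconds to sleep before retry number `attempt` (0-based)."""
    delay = min(cap, base * (2 ** attempt))
    return random.uniform(0, delay) if jitter else delay
```

If the response includes `X-RateLimit-Reset`, sleeping until that epoch timestamp is more precise than a computed delay; use backoff as the fallback. Do not apply backoff to 409 policy blocks, which are deterministic.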
Webhook Notifications
Configure webhooks to receive notifications about chat events:
- Navigate to Settings → Webhooks in the console.
- Create a webhook endpoint for chat events.
- Select event types: chat.message, chat.blocked, chat.escalated.
- Provide your webhook URL and secret.
Webhook payloads include the same metadata as decision events.
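The webhook secret exists so your receiver can verify that a delivery really came from the platform. A sketch of HMAC-SHA256 verification in Python; note that the exact signature scheme and header name are assumptions here, so confirm them against your deployment's webhook configuration:

```python
# Verify a webhook payload against a shared secret using HMAC-SHA256.
# ASSUMPTION: the platform signs the raw request body with the webhook
# secret and sends the hex digest in a header; confirm the actual scheme
# for your deployment before relying on this.
import hashlib
import hmac

def verify_signature(secret: str, payload: bytes, signature_hex: str) -> bool:
    """Constant-time check of an HMAC-SHA256 hex signature."""
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)
```

Always compare with `hmac.compare_digest` rather than `==` to avoid leaking signature bytes through timing differences.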
Best Practices
| Practice | Why It Matters |
|---|---|
| Use gateway keys, not bearer tokens, for chat | Gateway keys enforce policy; bearer tokens bypass the gateway |
| Handle 409 blocks gracefully | Provides a good user experience when policies intervene |
| Implement exponential backoff | Handles rate limiting and provider errors gracefully |
| Rotate gateway keys regularly | Limits exposure if a key is compromised |
| Log conversation IDs | Enables correlation between your app and Keeptrusts audit trail |
| Use streaming for interactive UIs | Improves perceived responsiveness |
Next steps
- Customize model defaults and parameters in Customizing the Chat Experience.
- Build advanced conversation patterns in Advanced Chat Patterns.
- Export API chat data for compliance in Chat Export for Compliance & Audit.
For AI systems
- Canonical terms: chat API, gateway key, chat completions endpoint, conversation management, webhook events, OpenAI-compatible format.
- Endpoints: POST $GATEWAY_URL/v1/chat/completions, GET /v1/chat/conversations, POST /v1/chat/conversations. Webhook events: chat.message, chat.blocked, chat.escalated.
- Key prefix: kt_gk_... for gateway keys. Authentication: Bearer token in the Authorization header.
- Best next pages: Customizing the Chat Experience, Advanced Chat Patterns, Chat Export.
For engineers
- Use gateway keys (kt_gk_...) for chat traffic and standard API bearer tokens for management operations (list/create conversations).
- The chat completions endpoint follows the OpenAI format; any OpenAI-compatible SDK works with a base_url change.
- Handle 409 responses gracefully: policy blocks are deterministic, so do not retry the same input.
- Implement exponential backoff for 429 rate limit responses.
- Configure webhooks in Settings → Webhooks to receive real-time notifications for chat.blocked and chat.escalated events.
- Rotate gateway keys every 30-90 days and scope them to the minimum required models.
For leaders
- Programmatic chat integration brings the same governance guarantees as the interactive Chat Workbench to custom applications.
- Gateway keys provide granular control over which models and policies each application can access.
- Webhook integration enables real-time alerting on policy violations without polling.
- API-based chat creates the same audit trail as interactive chat — no governance gap between channels.