Chat API Integration Guide

Beyond the interactive Chat Workbench, Keeptrusts provides API endpoints for programmatic chat interactions. This enables you to embed governed AI chat directly into your applications, automate testing workflows, and build custom chat interfaces — all with the same policy enforcement as the Chat Workbench.

Use this page when

  • You are embedding governed AI chat into a custom application using the Keeptrusts API.
  • You need to manage conversations programmatically (create, list, retrieve history).
  • You are building automated testing workflows against the chat completions endpoint.
  • You want to configure webhooks for real-time chat event notifications.

Primary audience

  • Primary: AI Engineers integrating chat into applications, Backend Developers building custom chat UIs
  • Secondary: QA Engineers automating chat test scenarios, Platform Engineers configuring webhooks

Authentication for Chat API

Gateway Keys

Programmatic chat uses gateway keys (kt_gk_...) for authentication. These keys route requests through the Keeptrusts gateway with full policy enforcement.

Creating a Gateway Key

  1. Navigate to Settings → Gateway Keys in the console.
  2. Click Create Gateway Key.
  3. Configure the key scope (user, team, or organization).
  4. Set an expiration period.
  5. Copy the generated key — it is shown only once.

Using a Gateway Key

Include the gateway key in your API requests:

curl -X POST "$GATEWAY_URL/v1/chat/completions" \
  -H "Authorization: Bearer kt_gk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "What are our data retention policies?"}
    ]
  }'

API Bearer Tokens

For management operations (creating conversations, listing history), use standard API bearer tokens:

curl "$API_URL/v1/chat/conversations" \
  -H "Authorization: Bearer $API_TOKEN"

Chat Completions Endpoint

The primary chat endpoint follows the OpenAI-compatible format, making it easy to integrate with existing tools and libraries.

Request Format

POST $GATEWAY_URL/v1/chat/completions

{
  "model": "gpt-4o",
  "messages": [
    {"role": "system", "content": "You are a helpful compliance assistant."},
    {"role": "user", "content": "Summarize the latest audit findings."}
  ],
  "temperature": 0.7,
  "max_tokens": 1000,
  "stream": false
}

Response Format

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1745395200,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Based on the latest audit report..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 45,
    "completion_tokens": 280,
    "total_tokens": 325
  }
}

Streaming Responses

For streaming responses, set "stream": true:

curl -X POST "$GATEWAY_URL/v1/chat/completions" \
  -H "Authorization: Bearer kt_gk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Explain our security policy."}],
    "stream": true
  }'

Streaming returns Server-Sent Events (SSE) with incremental content chunks.
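The incremental chunks can be stitched back together on the client. A minimal sketch, assuming OpenAI-style `data:` event lines carrying content deltas and a terminating `data: [DONE]` sentinel; the helper name and sample chunks below are illustrative, not part of the product:

```python
import json

def accumulate_sse_content(lines):
    """Collect assistant text from OpenAI-style SSE chunk lines.

    Each event line looks like 'data: {...}' with incremental text in
    choices[0].delta.content; the stream ends with 'data: [DONE]'.
    """
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data: "):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)

# Hypothetical chunks, shaped like those the gateway might emit:
stream = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Our security "}}]}',
    'data: {"choices": [{"delta": {"content": "policy requires..."}}]}',
    "data: [DONE]",
]
print(accumulate_sse_content(stream))  # → Our security policy requires...
```

In a real client you would read these lines from the HTTP response body as they arrive rather than from a list.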

Conversation Management

Creating a Conversation

POST $API_URL/v1/chat/conversations

{
  "title": "Compliance Review Q2 2026",
  "model": "gpt-4o"
}

Listing Conversations

GET $API_URL/v1/chat/conversations

Returns a paginated list of conversations accessible to the authenticated user.

Retrieving Conversation History

GET $API_URL/v1/chat/conversations/{conversation_id}/messages

Returns all messages in the conversation, including metadata about policy evaluations and citations.

Deleting a Conversation

DELETE $API_URL/v1/chat/conversations/{conversation_id}

Removes the conversation from the user's history. Decision events in the audit trail are retained according to the configured retention policy.
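The four endpoints above can be wrapped in a thin client. This is a sketch assuming standard bearer-token auth and JSON bodies; the class name, base URL, and the injectable `opener` hook (used here so the class can be exercised without a network) are illustrative, not part of an official SDK:

```python
import json
import urllib.request

class ConversationClient:
    """Minimal wrapper over the conversation-management endpoints."""

    def __init__(self, api_url, api_token, opener=urllib.request.urlopen):
        self.api_url = api_url.rstrip("/")
        self.token = api_token
        self._open = opener  # injectable for offline testing

    def _request(self, method, path, body=None):
        data = json.dumps(body).encode() if body is not None else None
        req = urllib.request.Request(
            self.api_url + path,
            data=data,
            method=method,
            headers={
                "Authorization": f"Bearer {self.token}",
                "Content-Type": "application/json",
            },
        )
        with self._open(req) as resp:
            return json.load(resp)

    def create(self, title, model):
        return self._request("POST", "/v1/chat/conversations",
                             {"title": title, "model": model})

    def list(self):
        return self._request("GET", "/v1/chat/conversations")

    def history(self, conversation_id):
        return self._request(
            "GET", f"/v1/chat/conversations/{conversation_id}/messages")

    def delete(self, conversation_id):
        return self._request(
            "DELETE", f"/v1/chat/conversations/{conversation_id}")
```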

Embedding Chat in Applications

JavaScript / TypeScript

Use the OpenAI-compatible SDK with the Keeptrusts gateway URL:

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'kt_gk_your_key_here',
  baseURL: 'https://gateway.example.com/v1',
});

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [
    { role: 'user', content: 'What are the compliance requirements for Q2?' },
  ],
});

console.log(response.choices[0].message.content);

Python

from openai import OpenAI

client = OpenAI(
    api_key="kt_gk_your_key_here",
    base_url="https://gateway.example.com/v1",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Summarize the latest risk assessment."}
    ],
)

print(response.choices[0].message.content)

cURL

curl -X POST "https://gateway.example.com/v1/chat/completions" \
  -H "Authorization: Bearer kt_gk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "List our active policies."}]
  }'

Policy Enforcement in API Calls

API chat requests receive the same governance as the Chat Workbench:

  • Input policies evaluate the prompt before forwarding to the LLM.
  • Output policies evaluate the response before returning it to the caller.
  • Blocked requests return HTTP 409 with a policy explanation.
  • Escalated requests may be held pending moderator review.

Handling Policy Blocks

When a request is blocked, the API returns:

{
  "error": {
    "code": "policy_blocked",
    "message": "Request blocked by policy 'pii-detection': Prompt contains personally identifiable information.",
    "policy": "pii-detection"
  }
}

Handle this in your application by catching the 409 status code:

try {
  const response = await client.chat.completions.create({ /* ... */ });
} catch (error) {
  if (error.status === 409) {
    console.log('Blocked by policy:', error.message);
    // Show a user-friendly message or retry with a modified prompt
  }
}
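The same 409 body can be inspected in Python. A small helper, assuming the `policy_blocked` error shape shown above; the function name is illustrative:

```python
import json

def parse_policy_error(body: str):
    """Return (code, policy, message) for a policy_blocked error payload,
    or None when the body is not a policy block."""
    err = json.loads(body).get("error", {})
    if err.get("code") != "policy_blocked":
        return None
    return err["code"], err.get("policy"), err.get("message", "")

blocked = ('{"error": {"code": "policy_blocked", '
           '"message": "Request blocked by policy \'pii-detection\'.", '
           '"policy": "pii-detection"}}')
print(parse_policy_error(blocked)[1])  # → pii-detection
```

Because policy blocks are deterministic, do not feed the same prompt back into a retry loop; surface the policy message to the user instead.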

Rate Limiting and Quotas

API chat requests respect the same rate limits and wallet budgets as interactive chat:

  • Token budgets: Requests that would exceed the wallet balance are rejected.
  • Rate limits: Per-user and per-team rate limits apply.
  • Concurrent requests: The gateway limits concurrent requests per key.

Check rate limit headers in the response:

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1745395260
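A retry loop for rate-limited requests is typically exponential backoff with jitter. A minimal sketch; the parameter defaults are illustrative, not gateway-mandated:

```python
import random

def backoff_delays(max_retries=5, base=1.0, cap=30.0, jitter=None):
    """Yield wait times (seconds) for retrying rate-limited requests:
    exponential growth up to a cap, plus optional random jitter."""
    jitter = jitter if jitter is not None else random.random
    for attempt in range(max_retries):
        delay = min(cap, base * (2 ** attempt))
        yield delay + jitter() * delay * 0.1  # up to 10% jitter

# Deterministic illustration (jitter disabled):
print(list(backoff_delays(jitter=lambda: 0.0)))  # → [1.0, 2.0, 4.0, 8.0, 16.0]
```

When the `X-RateLimit-Reset` header is present, sleeping until that epoch timestamp is usually better than a blind backoff.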

Webhook Notifications

Configure webhooks to receive notifications about chat events:

  1. Navigate to Settings → Webhooks in the console.
  2. Create a webhook endpoint for chat events.
  3. Select event types: chat.message, chat.blocked, chat.escalated.
  4. Provide your webhook URL and secret.

Webhook payloads include the same metadata as decision events.
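Webhook receivers should verify the signature before trusting a payload. A common pattern is an HMAC-SHA256 hex digest over the raw request body using the configured secret; the exact header name and signing scheme here are assumptions, so confirm the format in the webhook settings before relying on this:

```python
import hashlib
import hmac

def verify_webhook(secret: str, payload: bytes, signature: str) -> bool:
    """Check an HMAC-SHA256 hex signature over the raw request body.

    Assumption: the signature is hex-encoded HMAC-SHA256 of the body;
    verify the actual scheme against your webhook configuration.
    """
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids timing side channels.
    return hmac.compare_digest(expected, signature)

body = b'{"event": "chat.blocked", "conversation_id": "conv_123"}'
sig = hmac.new(b"whsec_demo", body, hashlib.sha256).hexdigest()
print(verify_webhook("whsec_demo", body, sig))  # → True
```

Always compute the digest over the raw bytes as received; re-serializing the JSON can change whitespace and break verification.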

Best Practices

| Practice | Why It Matters |
| --- | --- |
| Use gateway keys, not bearer tokens, for chat | Gateway keys enforce policy; bearer tokens bypass the gateway |
| Handle 409 blocks gracefully | Provides a good user experience when policies intervene |
| Implement exponential backoff | Handles rate limiting and provider errors gracefully |
| Rotate gateway keys regularly | Limits exposure if a key is compromised |
| Log conversation IDs | Enables correlation between your app and the Keeptrusts audit trail |
| Use streaming for interactive UIs | Improves perceived responsiveness |

Next steps

For AI systems

  • Canonical terms: chat API, gateway key, chat completions endpoint, conversation management, webhook events, OpenAI-compatible format.
  • Endpoints: POST $GATEWAY_URL/v1/chat/completions, GET /v1/chat/conversations, POST /v1/chat/conversations. Webhook events: chat.message, chat.blocked, chat.escalated.
  • Key prefix: kt_gk_... for gateway keys. Authentication: Bearer token in Authorization header.
  • Best next pages: Customizing the Chat Experience, Advanced Chat Patterns, Chat Export.

For engineers

  • Use gateway keys (kt_gk_...) for chat traffic and standard API bearer tokens for management operations (list/create conversations).
  • The chat completions endpoint follows the OpenAI format — any OpenAI-compatible SDK works with a base_url change.
  • Handle 409 responses gracefully — policy blocks are deterministic; do not retry the same input.
  • Implement exponential backoff for 429 rate limit responses.
  • Configure webhooks in Settings → Webhooks to receive real-time notifications for chat.blocked and chat.escalated events.
  • Rotate gateway keys every 30-90 days and scope them to the minimum required models.

For leaders

  • Programmatic chat integration brings the same governance guarantees as the interactive Chat Workbench to custom applications.
  • Gateway keys provide granular control over which models and policies each application can access.
  • Webhook integration enables real-time alerting on policy violations without polling.
  • API-based chat creates the same audit trail as interactive chat — no governance gap between channels.