
OpenAI SDK Drop-In Compatibility

The Keeptrusts unified gateway exposes an OpenAI-compatible POST /v1/chat/completions endpoint. Any standard OpenAI SDK can connect by changing two values: base_url and api_key. No wrapper library or custom client is required.

Use this page when

  • You want to route existing OpenAI SDK calls through Keeptrusts for governance, cost control, or multi-provider routing.
  • You are migrating from direct OpenAI access to a governed gateway.
  • You need to verify which features work with which providers through the unified endpoint.

Primary audience

  • Primary: Technical Engineers
  • Secondary: AI Agents, Technical Leaders

Quick setup

Every OpenAI SDK accepts a base URL override and an API key. Set both to the values for your Keeptrusts gateway:

| Parameter | Value |
| --- | --- |
| base_url | https://<your-gateway>/v1, or http://localhost:8080/v1 for local |
| api_key | Your Keeptrusts API key (Access Key or Gateway Key) |

The gateway transparently resolves the requested model, enforces policy, and routes to the configured upstream provider.

Python

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="kt_your_keeptrusts_api_key",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello from Keeptrusts!"}],
)
print(response.choices[0].message.content)

Streaming

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain governance in one paragraph."}],
    stream=True,
)
for chunk in stream:
    content = chunk.choices[0].delta.content or ""
    print(content, end="", flush=True)

Function calling

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in London?"}],
    tools=tools,
)
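
The request above only asks the model to propose a tool call; the application still has to run the function and send the result back in a follow-up request. The following is a minimal sketch of that step, assuming the gateway returns the standard OpenAI `tool_calls` shape. `get_weather` is a hypothetical stub, and `TOOL_REGISTRY` and `run_tool_calls` are illustrative names, not part of any SDK.

```python
import json

def get_weather(city: str) -> str:
    # Hypothetical stub; a real implementation would call a weather service.
    return f"Sunny in {city}"

# Map tool names (as declared in `tools`) to local callables.
TOOL_REGISTRY = {"get_weather": get_weather}

def run_tool_calls(message: dict) -> list:
    """Execute each tool call in an assistant message and build the
    role="tool" messages to append to the follow-up request."""
    results = []
    for call in message.get("tool_calls") or []:
        fn = TOOL_REGISTRY[call["function"]["name"]]
        # Arguments arrive as a JSON-encoded string, not a dict.
        args = json.loads(call["function"]["arguments"])
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": fn(**args),
        })
    return results

# Example assistant message in the OpenAI chat-completions shape.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "London"}'},
    }],
}
tool_messages = run_tool_calls(assistant_message)
print(tool_messages)
```

With the Python SDK, `response.choices[0].message.model_dump()` should produce a dict of this shape. Append the assistant message and the returned tool messages to `messages`, then call `create` again to get the final answer.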

Node.js / TypeScript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:8080/v1",
  apiKey: "kt_your_keeptrusts_api_key",
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello from Keeptrusts!" }],
});
console.log(response.choices[0].message.content);

Streaming

const stream = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Explain governance." }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}

Go

package main

import (
    "context"
    "fmt"

    "github.com/openai/openai-go"
    "github.com/openai/openai-go/option"
)

func main() {
    client := openai.NewClient(
        option.WithBaseURL("http://localhost:8080/v1"),
        option.WithAPIKey("kt_your_keeptrusts_api_key"),
    )

    resp, err := client.Chat.Completions.New(context.Background(),
        openai.ChatCompletionNewParams{
            Model: "gpt-4o",
            Messages: []openai.ChatCompletionMessageParamUnion{
                openai.UserMessage("Hello from Keeptrusts!"),
            },
        },
    )
    if err != nil {
        panic(err)
    }
    fmt.Println(resp.Choices[0].Message.Content)
}

Rust

use reqwest::Client;
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::new();
    let resp = client
        .post("http://localhost:8080/v1/chat/completions")
        .header("Authorization", "Bearer kt_your_keeptrusts_api_key")
        .json(&json!({
            "model": "gpt-4o",
            "messages": [{"role": "user", "content": "Hello from Keeptrusts!"}]
        }))
        .send()
        .await?;

    let body: serde_json::Value = resp.json().await?;
    println!("{}", body["choices"][0]["message"]["content"]);
    Ok(())
}

Supported features matrix

| Feature | Status | Notes |
| --- | --- | --- |
| Chat completions | Supported | All providers via unified endpoint |
| Streaming (SSE) | Supported | Normalized to OpenAI chunk format |
| Function / tool calls | Supported | Requires model support (tool_calls capability) |
| JSON mode | Supported | response_format: { type: "json_object" } |
| JSON schema mode | Supported | response_format: { type: "json_schema", ... } |
| Vision (image input) | Supported | Provider must support image input modality |
| Logprobs | Partial | Passed through when provider supports it |
| Seed / determinism | Partial | Passed through; not all providers honor it |
| Embeddings | Not yet | Planned for a future phase |
| Audio (TTS / STT) | Not yet | Planned for a future phase |
| Fine-tuning API | Not yet | Use provider directly for fine-tuning |
| Assistants / Threads | Not yet | Use provider directly |
| Batch API | Not yet | Use provider directly |
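
The JSON mode and JSON schema mode rows differ only in the `response_format` value sent with the request. Here is a minimal sketch of both request bodies, assuming the gateway accepts the standard OpenAI `response_format` shapes unchanged. `build_request` and the `city_info` schema are hypothetical names used for illustration.

```python
import json

def build_request(model, prompt, response_format=None):
    """Build a chat-completions request body, optionally with response_format."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if response_format is not None:
        body["response_format"] = response_format
    return body

# JSON mode: the model is asked to return any valid JSON object.
json_mode = build_request(
    "gpt-4o",
    "Return the capital of France as JSON.",
    {"type": "json_object"},
)

# JSON schema mode: the output is constrained to a named schema.
# `city_info` and its fields are illustrative, not defined by Keeptrusts.
schema_mode = build_request(
    "gpt-4o",
    "Return the capital of France.",
    {
        "type": "json_schema",
        "json_schema": {
            "name": "city_info",
            "schema": {
                "type": "object",
                "properties": {"capital": {"type": "string"}},
                "required": ["capital"],
            },
        },
    },
)

print(json.dumps(schema_mode, indent=2))
```

Either dict can be passed to the SDK as keyword arguments, for example `client.chat.completions.create(**json_mode)`.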

Provider compatibility matrix

The unified endpoint normalizes provider-specific formats transparently. Each provider has specific capabilities and limitations:

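| Provider | Chat | Streaming | Tools | JSON mode | Vision | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| OpenAI | Yes | Yes | Yes | Yes | Yes | Full compatibility |
| Anthropic | Yes | Yes | Yes | Yes | Yes | max_completion_tokens mapped to max_tokens |
| Google Gemini | Yes | Yes | Yes | Yes | Yes | n mapped to candidate_count |
| AWS Bedrock | Yes | Yes | Yes | Partial | Model-dependent | Uses SigV4 auth internally |
| Azure OpenAI | Yes | Yes | Yes | Yes | Yes | Same as OpenAI format |
| Together AI | Yes | Yes | Yes | Yes | No | OpenAI-compatible format |
| Groq | Yes | Yes | Yes | Yes | No | OpenAI-compatible format |
| Mistral | Yes | Yes | Yes | Yes | No | OpenAI-compatible format |
| Cohere | Yes | Yes | Yes | Partial | No | Response format translated |
| Fireworks AI | Yes | Yes | Yes | Yes | No | OpenAI-compatible format |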

Multi-provider routing

The unified endpoint supports transparent provider routing. Specify the model and optionally pin a provider with a prefix:

# Auto-route to the cheapest provider for this model
client.chat.completions.create(model="gpt-4o", ...)

# Pin to a specific provider
client.chat.completions.create(model="openai/gpt-4o", ...)
client.chat.completions.create(model="anthropic/claude-4-sonnet", ...)
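
On the client side it can help to build and inspect these model strings in one place, for example to log whether a call was pinned. The sketch below assumes the provider prefix is everything before the first `/`; model names that themselves contain a slash would be misread under that assumption. `split_model_id` and `pin_provider` are hypothetical helpers, not gateway or SDK functions.

```python
from typing import Optional, Tuple

def split_model_id(model: str) -> Tuple[Optional[str], str]:
    """Split "provider/model" into (provider, model).

    Returns (None, model) when no provider prefix is present, which
    corresponds to letting the gateway auto-route.
    """
    provider, sep, name = model.partition("/")
    if not sep:
        return None, model
    return provider, name

def pin_provider(provider: str, model: str) -> str:
    """Build the pinned form of a model string."""
    return f"{provider}/{model}"

for model_id in ("gpt-4o", "openai/gpt-4o", "anthropic/claude-4-sonnet"):
    provider, name = split_model_id(model_id)
    mode = f"pinned to {provider}" if provider else "auto-routed"
    print(f"{name}: {mode}")
```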

Model aliasing

Family aliases resolve to the latest stable version:

| Alias | Resolves to |
| --- | --- |
| claude-sonnet | claude-4-sonnet |
| claude-opus | claude-4-opus |
| gemini-pro | gemini-2.5-pro |
| gemini-flash | gemini-2.5-flash |

Deprecated models are automatically routed to their successor. The response includes Deprecation: true and X-Model-Deprecated-Successor headers.
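
A client can surface these headers as a warning so that deprecated model names get updated in code. This is a minimal sketch that works on any header mapping; it assumes `Deprecation` carries the literal value `true` as described above. `deprecation_info` is a hypothetical helper, and the successor value in the example is a placeholder.

```python
from typing import Mapping, Optional, Tuple

def deprecation_info(headers: Mapping[str, str]) -> Tuple[bool, Optional[str]]:
    """Return (is_deprecated, successor_model) from response headers."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    deprecated = lowered.get("deprecation", "").strip().lower() == "true"
    successor = lowered.get("x-model-deprecated-successor") if deprecated else None
    return deprecated, successor

# Illustrative header values; the successor name here is a placeholder.
example_headers = {"Deprecation": "true", "X-Model-Deprecated-Successor": "gpt-4o"}
deprecated, successor = deprecation_info(example_headers)
if deprecated:
    print(f"Requested model is deprecated; gateway routed to {successor}")
```

With the Python SDK, response headers should be reachable through `client.chat.completions.with_raw_response.create(...)`, whose result exposes `.headers` alongside `.parse()` for the usual response object.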

Response metadata

Every response includes an x_keeptrusts object with gateway metadata:

{
  "x_keeptrusts": {
    "routing_decision_id": "kt_decision_req-abc123",
    "provider_used": "openai",
    "cache_status": null,
    "region": "us-east-1"
  }
}
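
Reading this object from a raw response body could look like the sketch below. It assumes `x_keeptrusts` sits at the top level of the chat-completion JSON, next to `choices`; the surrounding fields in `raw_body` are illustrative, and `gateway_metadata` is a hypothetical helper.

```python
import json

# Illustrative response body; only the x_keeptrusts values come from the
# example above, the other fields are placeholders.
raw_body = """
{
  "id": "chatcmpl-example",
  "choices": [{"index": 0, "message": {"role": "assistant", "content": "Hi"}}],
  "x_keeptrusts": {
    "routing_decision_id": "kt_decision_req-abc123",
    "provider_used": "openai",
    "cache_status": null,
    "region": "us-east-1"
  }
}
"""

def gateway_metadata(body: dict) -> dict:
    """Return the x_keeptrusts object, or an empty dict if it is absent."""
    return body.get("x_keeptrusts") or {}

meta = gateway_metadata(json.loads(raw_body))
print(meta.get("provider_used"), meta.get("region"))
```

When using the Python SDK instead of raw JSON, unknown top-level fields such as this one should be available through `response.model_extra`, assuming the SDK version in use retains extra fields.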

Rate-limit headers

The gateway injects rate-limit headers on every response:

| Header | Description |
| --- | --- |
| X-RateLimit-Limit-Requests | Maximum requests per minute |
| X-RateLimit-Remaining-Requests | Remaining requests in current window |
| X-RateLimit-Limit-Tokens | Maximum tokens per minute |
| X-RateLimit-Remaining-Tokens | Remaining tokens in current window |
| X-RateLimit-Reset | Seconds until the window resets |
| X-RateLimit-Scope | org or key |

When budget utilization exceeds 80%, the response also includes X-Budget-Remaining-USD and X-Budget-Limit-USD headers.
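
These headers make client-side pacing possible without waiting for a rejected request. The sketch below assumes every value is a plain number encoded as a string (for example `"12"` for the reset, in seconds); the exact wire format is not specified here, so treat the parsing as an assumption. `seconds_to_wait` and `budget_used_fraction` are hypothetical helpers.

```python
from typing import Mapping, Optional

def _number(headers: Mapping[str, str], name: str) -> Optional[float]:
    """Read a numeric header, returning None if missing or malformed."""
    try:
        return float(headers[name])
    except (KeyError, ValueError):
        return None

def _lowered(headers: Mapping[str, str]) -> dict:
    # HTTP header names are case-insensitive, so normalize before lookup.
    return {name.lower(): value for name, value in headers.items()}

def seconds_to_wait(headers: Mapping[str, str]) -> float:
    """Return how long to pause before the next request (0.0 if no pause is needed)."""
    h = _lowered(headers)
    remaining = [
        _number(h, "x-ratelimit-remaining-requests"),
        _number(h, "x-ratelimit-remaining-tokens"),
    ]
    exhausted = any(r is not None and r <= 0 for r in remaining)
    reset = _number(h, "x-ratelimit-reset")
    return reset if exhausted and reset is not None else 0.0

def budget_used_fraction(headers: Mapping[str, str]) -> Optional[float]:
    """Return the fraction of budget consumed, or None if the budget headers are absent."""
    h = _lowered(headers)
    remaining = _number(h, "x-budget-remaining-usd")
    limit = _number(h, "x-budget-limit-usd")
    if remaining is None or limit is None or limit <= 0:
        return None
    return 1.0 - remaining / limit

# Illustrative header values.
example_headers = {
    "X-RateLimit-Remaining-Requests": "0",
    "X-RateLimit-Remaining-Tokens": "5000",
    "X-RateLimit-Reset": "12",
    "X-Budget-Remaining-USD": "15.00",
    "X-Budget-Limit-USD": "100.00",
}
print(seconds_to_wait(example_headers), budget_used_fraction(example_headers))
```

Because the budget headers only appear above 80% utilization, a `None` result from `budget_used_fraction` can be read as "below the reporting threshold" rather than as an error.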

Error format

Errors follow the OpenAI error shape regardless of upstream provider:

{
  "error": {
    "message": "Model 'nonexistent' not found in catalog",
    "type": "invalid_request_error",
    "param": "model",
    "code": null
  }
}
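
Because the shape is uniform, one parser can handle failures from any upstream provider. The following is a minimal sketch for clients that call the endpoint over raw HTTP. `GatewayError` and `parse_error` are hypothetical names, and the 404 status in the example is an assumption, since the status code for this error is not stated above.

```python
import json

class GatewayError(Exception):
    """Error decoded from an OpenAI-shaped error body."""

    def __init__(self, status, message, type_=None, param=None, code=None):
        super().__init__(message)
        self.status = status
        self.type = type_
        self.param = param
        self.code = code

def parse_error(status: int, body: str) -> GatewayError:
    """Turn an HTTP status and response body into a GatewayError.

    Falls back to the raw body as the message when the body is not
    JSON or does not contain an "error" object.
    """
    err = {}
    try:
        parsed = json.loads(body)
        if isinstance(parsed, dict) and isinstance(parsed.get("error"), dict):
            err = parsed["error"]
    except ValueError:
        pass
    return GatewayError(
        status,
        err.get("message", body),
        err.get("type"),
        err.get("param"),
        err.get("code"),
    )

raw_error = """
{
  "error": {
    "message": "Model 'nonexistent' not found in catalog",
    "type": "invalid_request_error",
    "param": "model",
    "code": null
  }
}
"""
# The status code here is illustrative.
error = parse_error(404, raw_error)
print(f"{error.status} {error.type}: {error} (param={error.param})")
```

When going through the Python SDK, the same body should surface as a subclass of `openai.APIStatusError`, so this parser is mainly useful for raw HTTP clients such as the Rust example.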

What's next