OpenAI SDK Drop-In Compatibility

The Keeptrusts unified gateway exposes an OpenAI-compatible POST /v1/chat/completions endpoint. Any standard OpenAI SDK can connect by changing two values: base_url and api_key. No wrapper library or custom client is required.

Use this page when

You want to route existing OpenAI SDK calls through Keeptrusts for governance, cost control, or multi-provider routing.
You are migrating from direct OpenAI access to a governed gateway.
You need to verify which features work with which providers through the unified endpoint.

Primary audience

Primary: Technical Engineers
Secondary: AI Agents, Technical Leaders

Quick setup

Every OpenAI SDK accepts a base URL override and an API key. Point both at your Keeptrusts gateway:

Parameter	Value
`base_url`	`https://<your-gateway>/v1` or `http://localhost:8080/v1` for local
`api_key`	Your Keeptrusts API key (Access Key or Gateway Key)

The gateway transparently resolves the requested model, enforces policy, and routes to the configured upstream provider.
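In practice the two values are usually read from the environment rather than hard-coded. The following is a minimal sketch, assuming the hypothetical variable names `KEEPTRUSTS_BASE_URL` and `KEEPTRUSTS_API_KEY`; Keeptrusts does not define these names, so substitute whatever fits your deployment.

```python
import os

# Local gateway address from the table above, used when no override is set
DEFAULT_BASE_URL = "http://localhost:8080/v1"


def gateway_settings(env=None):
    """Return keyword arguments suitable for OpenAI(**settings).

    KEEPTRUSTS_BASE_URL and KEEPTRUSTS_API_KEY are hypothetical names
    chosen for this example only.
    """
    env = os.environ if env is None else env
    api_key = env.get("KEEPTRUSTS_API_KEY")
    if not api_key:
        raise RuntimeError("KEEPTRUSTS_API_KEY is not set")
    # Strip a trailing slash so the SDK does not build paths like /v1//chat
    base_url = env.get("KEEPTRUSTS_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    return {"base_url": base_url, "api_key": api_key}
```

The result can be passed straight to the SDK constructor, for example `OpenAI(**gateway_settings())`.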

Python

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="kt_your_keeptrusts_api_key",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello from Keeptrusts!"}],
)
print(response.choices[0].message.content)

Streaming

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain governance in one paragraph."}],
    stream=True,
)
for chunk in stream:
    # Some chunks (for example a final usage chunk) carry no choices
    if not chunk.choices:
        continue
    content = chunk.choices[0].delta.content or ""
    print(content, end="", flush=True)

Function calling

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"],
            },
        },
    }
]

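response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in London?"}],
    tools=tools,
)

# The model may answer directly, in which case tool_calls is None
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)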

Node.js / TypeScript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:8080/v1",
  apiKey: "kt_your_keeptrusts_api_key",
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello from Keeptrusts!" }],
});
console.log(response.choices[0].message.content);

Streaming

const stream = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Explain governance." }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}

Go

package main

import (
    "context"
    "fmt"
    "github.com/openai/openai-go"
    "github.com/openai/openai-go/option"
)

func main() {
    client := openai.NewClient(
        option.WithBaseURL("http://localhost:8080/v1"),
        option.WithAPIKey("kt_your_keeptrusts_api_key"),
    )

    resp, err := client.Chat.Completions.New(context.Background(),
        openai.ChatCompletionNewParams{
            Model: "gpt-4o",
            Messages: []openai.ChatCompletionMessageParamUnion{
                openai.UserMessage("Hello from Keeptrusts!"),
            },
        },
    )
    if err != nil {
        panic(err)
    }
    fmt.Println(resp.Choices[0].Message.Content)
}

Rust

use reqwest::Client;
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::new();
    let resp = client
        .post("http://localhost:8080/v1/chat/completions")
        .header("Authorization", "Bearer kt_your_keeptrusts_api_key")
        .json(&json!({
            "model": "gpt-4o",
            "messages": [{"role": "user", "content": "Hello from Keeptrusts!"}]
        }))
        .send()
        .await?;

    let body: serde_json::Value = resp.json().await?;
    // as_str() prints the text without the surrounding JSON quotes
    println!("{}", body["choices"][0]["message"]["content"].as_str().unwrap_or_default());
    Ok(())
}

Supported features matrix

Feature	Status	Notes
Chat completions	Supported	All providers via unified endpoint
Streaming (SSE)	Supported	Normalized to OpenAI chunk format
Function / tool calls	Supported	Requires model support (`tool_calls` capability)
JSON mode	Supported	`response_format: { type: "json_object" }`
JSON schema mode	Supported	`response_format: { type: "json_schema", ... }`
Vision (image input)	Supported	Provider must support image input modality
Logprobs	Partial	Passed through when provider supports it
Seed / determinism	Partial	Passed through; not all providers honor it
Embeddings	Not yet	Planned for a future phase
Audio (TTS / STT)	Not yet	Planned for a future phase
Fine-tuning API	Not yet	Use provider directly for fine-tuning
Assistants / Threads	Not yet	Use provider directly
Batch API	Not yet	Use provider directly
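As an illustration of the JSON mode row above, the sketch below builds a request body with `response_format` set to `json_object` and decodes the reply. The helper names `json_mode_request` and `parse_json_reply` are made up for this example, and the system prompt reflects OpenAI's requirement that the messages ask for JSON; other providers may behave differently.

```python
import json


def json_mode_request(model, prompt):
    """Build a chat-completions request body that asks for a JSON object."""
    return {
        "model": model,
        "messages": [
            # OpenAI's JSON mode expects the messages to ask for JSON explicitly
            {"role": "system", "content": "Reply with a single JSON object."},
            {"role": "user", "content": prompt},
        ],
        "response_format": {"type": "json_object"},
    }


def parse_json_reply(response_body):
    """Decode the assistant message of an already-parsed response body."""
    content = response_body["choices"][0]["message"]["content"]
    return json.loads(content)
```

With the Python SDK, the same body can be sent as `client.chat.completions.create(**json_mode_request("gpt-4o", "List two colors."))`.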

Provider compatibility matrix

The unified endpoint normalizes provider-specific formats transparently. Each provider has specific capabilities and limitations:

Provider	Chat	Streaming	Tools	JSON mode	Vision	Notes
OpenAI	Yes	Yes	Yes	Yes	Yes	Full compatibility
Anthropic	Yes	Yes	Yes	Yes	Yes	`max_completion_tokens` mapped to `max_tokens`
Google Gemini	Yes	Yes	Yes	Yes	Yes	`n` mapped to `candidate_count`
AWS Bedrock	Yes	Yes	Yes	Partial	Model-dependent	Uses SigV4 auth internally
Azure OpenAI	Yes	Yes	Yes	Yes	Yes	Same as OpenAI format
Together AI	Yes	Yes	Yes	Yes	No	OpenAI-compatible format
Groq	Yes	Yes	Yes	Yes	No	OpenAI-compatible format
Mistral	Yes	Yes	Yes	Yes	No	OpenAI-compatible format
Cohere	Yes	Yes	Yes	Partial	No	Response format translated
Fireworks AI	Yes	Yes	Yes	Yes	No	OpenAI-compatible format

Multi-provider routing

The unified endpoint supports transparent provider routing. Specify the model and optionally pin a provider with a prefix:

messages = [{"role": "user", "content": "Hello from Keeptrusts!"}]

# Auto-route to the cheapest provider for this model
client.chat.completions.create(model="gpt-4o", messages=messages)

# Pin to a specific provider
client.chat.completions.create(model="openai/gpt-4o", messages=messages)
client.chat.completions.create(model="anthropic/claude-4-sonnet", messages=messages)
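The prefix convention can be mirrored on the client side, for example to log which provider a call was pinned to. The sketch below is an assumption about how such a prefix might be split, not the gateway's own parser; `KNOWN_PROVIDERS` is an illustrative subset, kept so that model IDs that themselves contain a slash are not misread as pinned.

```python
# Illustrative subset only; the gateway's real provider list is not shown here
KNOWN_PROVIDERS = {"openai", "anthropic"}


def split_model(model):
    """Return (provider, model_name); provider is None when nothing is pinned."""
    prefix, sep, rest = model.partition("/")
    if sep and prefix in KNOWN_PROVIDERS:
        return prefix, rest
    return None, model
```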

Model aliasing

Family aliases resolve to the latest stable version:

Alias	Resolves to
`claude-sonnet`	`claude-4-sonnet`
`claude-opus`	`claude-4-opus`
`gemini-pro`	`gemini-2.5-pro`
`gemini-flash`	`gemini-2.5-flash`

Deprecated models are automatically routed to their successor. The response includes Deprecation: true and X-Model-Deprecated-Successor headers.
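A client can mirror the alias table and react to the deprecation headers, for instance to warn when a pinned model has been superseded. This is a minimal sketch: the function names are hypothetical, and it assumes that `X-Model-Deprecated-Successor` carries the successor's model name as its value.

```python
# Copied from the alias table above
ALIASES = {
    "claude-sonnet": "claude-4-sonnet",
    "claude-opus": "claude-4-opus",
    "gemini-pro": "gemini-2.5-pro",
    "gemini-flash": "gemini-2.5-flash",
}


def resolve_alias(model):
    """Return the concrete model for a family alias, or the input unchanged."""
    return ALIASES.get(model, model)


def deprecated_successor(headers):
    """Return the successor model if the response flags a deprecation, else None."""
    # HTTP header names are case-insensitive, so normalize before lookup
    lowered = {name.lower(): value for name, value in headers.items()}
    if lowered.get("deprecation", "").strip().lower() != "true":
        return None
    return lowered.get("x-model-deprecated-successor")
```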

Response metadata

Every response includes an x_keeptrusts object with gateway metadata:

{
  "x_keeptrusts": {
    "routing_decision_id": "kt_decision_req-abc123",
    "provider_used": "openai",
    "cache_status": null,
    "region": "us-east-1"
  }
}
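The metadata can be pulled out of the raw response body for logging or tracing. The sketch below works on the JSON text directly and uses only the four fields shown above; the function name `gateway_metadata` is hypothetical, and missing fields are returned as `None` rather than raising.

```python
import json

METADATA_FIELDS = ("routing_decision_id", "provider_used", "cache_status", "region")


def gateway_metadata(raw_body):
    """Extract the x_keeptrusts fields from a raw JSON response body."""
    body = json.loads(raw_body)
    meta = body.get("x_keeptrusts") or {}
    return {field: meta.get(field) for field in METADATA_FIELDS}
```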

Rate-limit headers

The gateway injects rate-limit headers on every response:

Header	Description
`X-RateLimit-Limit-Requests`	Maximum requests per minute
`X-RateLimit-Remaining-Requests`	Remaining requests in current window
`X-RateLimit-Limit-Tokens`	Maximum tokens per minute
`X-RateLimit-Remaining-Tokens`	Remaining tokens in current window
`X-RateLimit-Reset`	Seconds until the window resets
`X-RateLimit-Scope`	`org` or `key`

When budget utilization exceeds 80%, the response also includes X-Budget-Remaining-USD and X-Budget-Limit-USD headers.
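These headers make simple client-side throttling possible. The sketch below parses them and decides how long to wait before the next call; it assumes the remaining counts are plain integers and that `X-RateLimit-Reset` is a number of seconds, as the table suggests. `RateLimitState`, `parse_rate_limit`, and `seconds_to_wait` are names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class RateLimitState:
    remaining_requests: int
    remaining_tokens: int
    reset_seconds: float
    scope: str


def parse_rate_limit(headers):
    """Build a RateLimitState from response headers (names are case-insensitive)."""
    h = {name.lower(): value for name, value in headers.items()}
    return RateLimitState(
        remaining_requests=int(h["x-ratelimit-remaining-requests"]),
        remaining_tokens=int(h["x-ratelimit-remaining-tokens"]),
        reset_seconds=float(h["x-ratelimit-reset"]),
        scope=h["x-ratelimit-scope"],
    )


def seconds_to_wait(state, tokens_needed):
    """Wait for the window to reset only when the next call would not fit."""
    if state.remaining_requests < 1 or state.remaining_tokens < tokens_needed:
        return state.reset_seconds
    return 0.0
```

With the Python SDK, response headers are available through `client.chat.completions.with_raw_response.create(...)`, whose result exposes a `headers` mapping alongside `parse()`.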

Error format

Errors follow the OpenAI error shape regardless of upstream provider:

{
  "error": {
    "message": "Model 'nonexistent' not found in catalog",
    "type": "invalid_request_error",
    "param": "model",
    "code": null
  }
}
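Because the shape is the same for every provider, one small parser covers all error responses. The following sketch turns a raw error body into an exception; `GatewayError` and `error_from_body` are hypothetical names, and the official SDKs already raise their own exception types, so this is mainly useful for raw HTTP clients.

```python
import json


class GatewayError(Exception):
    """Carries the fields of an OpenAI-shaped error object."""

    def __init__(self, message, error_type, param, code):
        super().__init__(message)
        self.error_type = error_type
        self.param = param
        self.code = code


def error_from_body(raw_body):
    """Build a GatewayError from a raw JSON error response body."""
    err = json.loads(raw_body).get("error") or {}
    return GatewayError(
        err.get("message", "unknown error"),
        err.get("type"),
        err.get("param"),
        err.get("code"),
    )
```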

What's next

Multi-provider setup — configure credentials for multiple providers
Error handling — handle gateway errors
Streaming patterns — advanced streaming usage
Cost optimization — leverage routing policies to reduce spend
