Google AI Studio (Gemini)
Keeptrusts proxies Google AI Studio's Gemini API with full policy enforcement, audit logging, and automatic format translation. Clients can send requests in standard OpenAI format — Keeptrusts translates them to Google's native Gemini wire format on the fly and translates responses back. Direct Gemini-format requests are also supported natively. Both the chat completions and embeddings endpoints are proxied.
Use this page when
- You need the exact command, config, API, or integration details for Google AI Studio (Gemini).
- You are wiring automation or AI retrieval and need canonical names, examples, and constraints.
- You want a guided rollout instead of a reference page — in that case, use the linked workflow pages in Next steps.
Primary audience
- Primary: AI Agents, Technical Engineers
- Secondary: Technical Leaders
Prerequisites
- Google AI Studio API key — obtain one from Google AI Studio.
- Keeptrusts CLI — install `kt` (see the quickstart guide).
- Export your API key:

export GOOGLE_API_KEY="AIzaSy..."
Keeptrusts auto-detects GOOGLE_API_KEY when provider is set to "google-ai-studio". The correct query-parameter auth and Gemini base URL are applied automatically.
Configuration
Create a policy-config.yaml with your provider targets:
pack:
  name: gemini-gateway
  version: 1.0.0
  enabled: true

policies:
  chain:
    - prompt-injection
    - pii-detector
    - safety-filter
    - audit-logger
  policy:
    prompt-injection:
      threshold: 0.8
      action: block
    pii-detector:
      action: redact
    safety-filter:
      mode: strict
      action: block
    audit-logger:
      retention_days: 365

providers:
  strategy: single
  targets:
    - id: gemini-25-pro
      provider: google-ai-studio
      model: gemini-2.5-pro
      base_url: https://generativelanguage.googleapis.com
      secret_key_ref:
        env: GOOGLE_API_KEY
Start the gateway:
kt gateway run \
  --listen 0.0.0.0:8080 \
  --policy-config policy-config.yaml
Provider Fields
All fields available on a providers.targets[] entry for Google AI Studio:
| Field | Type | Default | Description |
|---|---|---|---|
| id | string | required | Unique identifier for this target |
| provider | string | required | Provider ID: "google-ai-studio" or "google-ai-studio:chat:gemini-2.5-pro" |
| model | string | required | Model name, e.g. "gemini-2.5-pro", "gemini-2.5-flash" |
| base_url | string | https://generativelanguage.googleapis.com | API base URL (auto-detected for google-ai-studio) |
| secret_key_ref | object | GOOGLE_API_KEY | Object reference to the environment variable holding the API key |
| timeout_seconds | integer | 60 | Maximum time for non-streaming requests |
| stream_timeout_seconds | integer | none | Maximum time for streaming requests; falls back to timeout_seconds |
| max_context_tokens | integer | none | Maximum tokens in the context window (used for context compression) |
| headers | map | {} | Additional HTTP headers sent with each request |
| format | string | "google-gemini" | Wire format: "google-gemini" (auto-translates to/from OpenAI) |
| provider_type | string | "google-ai-studio" | Explicit provider type; overrides URL heuristic detection |
| description | string | none | Human-readable description for dashboards and logs |
| weight | float | 1.0 | Routing weight for weighted_round_robin strategy |
| data_policy | object | none | Data handling policy (zero_data_retention, training_opt_out, retention_days) |
| pricing | object | none | Token pricing in USD per 1M tokens (prompt, completion) |
| health_probe | object | none | Active health probe configuration |
Authentication
Google AI Studio uses API-key authentication passed as a query parameter. Keeptrusts auto-detects this when provider is "google-ai-studio":
# These are the defaults — you only need to set secret_key_ref
secret_key_ref:
  env: "GOOGLE_API_KEY"
Unlike most providers, Google AI Studio does not use Authorization: Bearer headers. The key is appended as ?key=<value> in the request URL. Keeptrusts handles this automatically — you never need to construct the query parameter yourself.

For Google Cloud Vertex AI (which uses OAuth2/service account authentication instead of API keys), see the Google Vertex AI integration guide.
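For illustration only (Keeptrusts builds this URL for you), the sketch below shows the query-parameter scheme against the public Gemini generateContent endpoint; the helper name `gemini_request_url` is hypothetical, not part of the Keeptrusts API:

```python
from urllib.parse import urlencode

def gemini_request_url(base_url: str, model: str, api_key: str) -> str:
    """Build an upstream Gemini generateContent URL with ?key= auth.

    Illustrative sketch: Keeptrusts constructs this internally.
    """
    path = f"/v1beta/models/{model}:generateContent"
    return f"{base_url}{path}?{urlencode({'key': api_key})}"

url = gemini_request_url(
    "https://generativelanguage.googleapis.com", "gemini-2.5-pro", "AIzaSy..."
)
print(url)
```

The key travels in the URL, so treat gateway access logs upstream of Keeptrusts as sensitive.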
Supported Models
| Model | Context Window | Notes |
|---|---|---|
| gemini-2.5-pro | 1M | Most capable reasoning model, hybrid thinking |
| gemini-2.5-flash | 1M | Fast, cost-effective with thinking capabilities |
| gemini-2.0-flash | 1M | Previous-gen fast model |
| gemini-2.0-flash-lite | 1M | Lightweight, lowest cost |
| gemini-1.5-pro | 2M | Legacy, largest context window |
| gemini-1.5-flash | 1M | Legacy fast model |
| gemini-1.5-flash-8b | 1M | Legacy, smallest model |
Any model available on the Google AI Studio API can be used — set the model field to the model ID string. Keeptrusts passes the model identifier through to the upstream without validation.
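Because the gateway does not validate model IDs, a typo surfaces only as an upstream error. You can check an ID yourself against Google's public ListModels endpoint; the helper name below is a hypothetical sketch, not part of Keeptrusts:

```python
from urllib.parse import urlencode

def list_models_url(api_key: str) -> str:
    """URL for Google AI Studio's public ListModels endpoint (query-param auth)."""
    return ("https://generativelanguage.googleapis.com/v1beta/models?"
            + urlencode({"key": api_key}))

url = list_models_url("AIzaSy...")
# To actually fetch (needs a valid key and network access):
#   import json, urllib.request
#   with urllib.request.urlopen(url) as resp:
#       names = [m["name"] for m in json.load(resp)["models"]]
print(url)
```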
Client Examples
Once the gateway is running, point your client to http://localhost:8080 instead of https://generativelanguage.googleapis.com. Clients send requests in OpenAI format — Keeptrusts translates to Gemini wire format automatically.
- Python
- Node.js
- cURL
from openai import OpenAI

# Use the OpenAI SDK — Keeptrusts translates to Gemini format automatically
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="unused",  # auth is handled by Keeptrusts via GOOGLE_API_KEY
)

response = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain the theory of relativity simply."},
    ],
    temperature=0.7,
    max_tokens=512,
)
print(response.choices[0].message.content)
import OpenAI from "openai";

// OpenAI SDK works — Keeptrusts handles format translation
const client = new OpenAI({
  baseURL: "http://localhost:8080/v1",
  apiKey: "unused", // auth handled by Keeptrusts via GOOGLE_API_KEY
});

const response = await client.chat.completions.create({
  model: "gemini-2.5-pro",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Explain the theory of relativity simply." },
  ],
  temperature: 0.7,
  max_tokens: 512,
});
console.log(response.choices[0].message.content);
# OpenAI-compatible format — Keeptrusts translates to Gemini wire format
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.5-pro",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain the theory of relativity simply."}
    ],
    "temperature": 0.7,
    "max_tokens": 512
  }'
Streaming
Keeptrusts fully supports Gemini's streaming mode. Set stream: true in your request — the gateway applies policies to each chunk in real time and translates streaming events between Gemini SSE and OpenAI SSE formats.
Configure a separate streaming timeout to accommodate long-running Gemini generations (especially with thinking models):
pack:
  name: google-ai-studio-providers-3
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: gemini-streaming
      provider: google-ai-studio
      model: gemini-2.5-pro
      stream_timeout_seconds: 600

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
- Python
- cURL
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")

stream = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[{"role": "user", "content": "Write a short story about AI."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "gemini-2.5-flash",
    "messages": [{"role": "user", "content": "Write a short story about AI."}],
    "stream": true
  }'
Advanced Configuration
Multi-Model Fallback
Automatically fail over from Gemini 2.5 Pro to Flash when the primary is unavailable:
pack:
  name: google-ai-studio-providers-4
  version: 1.0.0
  enabled: true

providers:
  strategy: fallback
  targets:
    - id: gemini-pro-primary
      provider: google-ai-studio
      model: gemini-2.5-pro
      secret_key_ref:
        env: GOOGLE_API_KEY
    - id: gemini-flash-fallback
      provider: google-ai-studio
      model: gemini-2.5-flash
      secret_key_ref:
        env: GOOGLE_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
Cross-Provider Fallback
Use Gemini as primary with OpenAI as fallback — format translation is handled automatically for both:
pack:
  name: google-ai-studio-providers-5
  version: 1.0.0
  enabled: true

providers:
  strategy: fallback
  targets:
    - id: gemini-primary
      provider: google-ai-studio
      model: gemini-2.5-pro
      secret_key_ref:
        env: GOOGLE_API_KEY
    - id: openai-fallback
      provider: openai
      model: gpt-4o
      secret_key_ref:
        env: OPENAI_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
Format Translation
Keeptrusts automatically translates between OpenAI and Gemini wire formats. Set format: "google-gemini" on the target — clients send standard OpenAI /v1/chat/completions requests and receive OpenAI-shaped responses:
pack:
  name: google-ai-studio-providers-6
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: gemini-translated
      provider: google-ai-studio
      model: gemini-2.5-pro
      format: google-gemini
      secret_key_ref:
        env: GOOGLE_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
This means you can swap between OpenAI, Anthropic, and Gemini providers without changing your client code — only the config target changes.
| OpenAI Concept | Gemini Equivalent |
|---|---|
| messages | contents with parts |
| system message | systemInstruction |
| tools | tools with functionDeclarations |
| max_tokens | maxOutputTokens |
| temperature | temperature |
| choices[0].message | candidates[0].content |
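The request-side half of the mapping table can be sketched in a few lines. This is an illustrative reimplementation of the translation Keeptrusts performs internally, not its actual code; the function name is hypothetical:

```python
def openai_to_gemini(messages, max_tokens=None):
    """Sketch of the OpenAI→Gemini request mapping shown above."""
    system_parts, contents = [], []
    for msg in messages:
        if msg["role"] == "system":
            # system messages become systemInstruction parts
            system_parts.append({"text": msg["content"]})
        else:
            # OpenAI "assistant" maps to Gemini's "model" role
            role = "model" if msg["role"] == "assistant" else "user"
            contents.append({"role": role, "parts": [{"text": msg["content"]}]})
    body = {"contents": contents}
    if system_parts:
        body["systemInstruction"] = {"parts": system_parts}
    if max_tokens is not None:
        body["generationConfig"] = {"maxOutputTokens": max_tokens}
    return body

body = openai_to_gemini(
    [{"role": "system", "content": "Be brief."},
     {"role": "user", "content": "Hi"}],
    max_tokens=512,
)
```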
Latency-Based Routing
Route each request to the provider target with the lowest observed latency:
pack:
  name: google-ai-studio-providers-7
  version: 1.0.0
  enabled: true

providers:
  strategy: latency
  targets:
    - id: gemini-pro
      provider: google-ai-studio
      model: gemini-2.5-pro
      secret_key_ref:
        env: GOOGLE_API_KEY
    - id: gemini-flash
      provider: google-ai-studio
      model: gemini-2.5-flash
      secret_key_ref:
        env: GOOGLE_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
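Conceptually, latency-based routing reduces to picking the target with the lowest observed latency. A toy sketch (not Keeptrusts' implementation, which may use smoothed or windowed statistics):

```python
from statistics import fmean

def pick_lowest_latency(latencies: dict) -> str:
    """Pick the target id with the lowest mean observed latency (seconds)."""
    return min(latencies, key=lambda tid: fmean(latencies[tid]))

observed = {
    "gemini-pro": [1.9, 2.3, 2.1],    # slower, more capable
    "gemini-flash": [0.6, 0.8, 0.7],  # faster
}
print(pick_lowest_latency(observed))  # gemini-flash
```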
Circuit Breaker
Temporarily remove unhealthy targets from the rotation:
pack:
  name: google-ai-studio-providers-8
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: gemini-main
      provider: google-ai-studio
      model: gemini-2.5-pro
      secret_key_ref:
        env: GOOGLE_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
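The circuit-breaker pattern itself is simple: after a run of consecutive failures the target is taken out of rotation, and after a cooldown one probe request is allowed through. A conceptual sketch with assumed threshold and cooldown values, not Keeptrusts' internals:

```python
import time

class CircuitBreaker:
    """Toy circuit breaker: open after N consecutive failures, probe after cooldown."""

    def __init__(self, failure_threshold=5, cooldown_seconds=30.0):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow_request(self) -> bool:
        if self.opened_at is None:
            return True
        if time.monotonic() - self.opened_at >= self.cooldown_seconds:
            # half-open: let one request probe the target again
            self.opened_at = None
            self.failures = 0
            return True
        return False

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = time.monotonic()
```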
Retry Policy
Retry transient failures automatically:
pack:
  name: google-ai-studio-providers-9
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: gemini-pro
      provider: google-ai-studio
      model: gemini-2.5-pro
      secret_key_ref:
        env: GOOGLE_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
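The usual retry shape is exponential backoff with jitter, retrying only transient errors. A conceptual sketch (parameter names and defaults are assumptions for illustration, not Keeptrusts config keys):

```python
import random
import time

def retry_with_backoff(call, max_attempts=3, base_delay=0.5,
                       retriable=(TimeoutError, ConnectionError)):
    """Retry transient failures with exponential backoff plus jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retriable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error
            # 0.5s, 1s, 2s, ... scaled by random jitter in [1, 2)
            delay = base_delay * (2 ** (attempt - 1)) * (1 + random.random())
            time.sleep(delay)
```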
Context Compression
Automatically truncate conversation history to fit within the model's context window:
pack:
  name: google-ai-studio-providers-10
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: gemini-pro
      provider: google-ai-studio
      model: gemini-2.5-pro
      max_context_tokens: 900000
      secret_key_ref:
        env: GOOGLE_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
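In essence, compression drops the oldest non-system turns until the estimated token count fits the budget. A conceptual sketch; the chars/4 token estimate is a crude stand-in, not Keeptrusts' tokenizer:

```python
def truncate_history(messages, max_context_tokens,
                     count_tokens=lambda m: len(m["content"]) // 4):
    """Drop oldest non-system messages until the estimated total fits.

    Illustrative sketch of context compression, not Keeptrusts' algorithm.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    def total(msgs):
        return sum(count_tokens(m) for m in msgs)

    while rest and total(system + rest) > max_context_tokens:
        rest.pop(0)  # oldest conversational turn goes first
    return system + rest
```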
Zero Data Retention
Enforce that no prompt or completion data is stored by the provider:
pack:
  name: google-ai-studio-providers-11
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: gemini-zdr
      provider: google-ai-studio
      model: gemini-2.5-pro
      data_policy:
        zero_data_retention: true
        training_opt_out: true
      secret_key_ref:
        env: GOOGLE_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
A/B Testing Between Models
Split traffic across models with weighted routing:
pack:
  name: google-ai-studio-providers-12
  version: 1.0.0
  enabled: true

providers:
  strategy: weighted_round_robin
  targets:
    - id: variant-pro
      provider: google-ai-studio
      model: gemini-2.5-pro
      weight: 0.8
      secret_key_ref:
        env: GOOGLE_API_KEY
    - id: variant-flash
      provider: google-ai-studio
      model: gemini-2.5-flash
      weight: 0.2
      secret_key_ref:
        env: GOOGLE_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
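Weighted splitting amounts to sampling a target in proportion to its weight. A toy sketch of the idea (Keeptrusts' weighted_round_robin may be deterministic rather than random; the function below is hypothetical):

```python
import random

def pick_weighted(targets: dict, rng=random) -> str:
    """Pick a target id with probability proportional to its weight."""
    ids = list(targets)
    return rng.choices(ids, weights=[targets[t] for t in ids], k=1)[0]

split = {"variant-pro": 0.8, "variant-flash": 0.2}
picks = [pick_weighted(split) for _ in range(1000)]
# Roughly 80% of picks land on variant-pro over many requests.
```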
Rate Limiting
Enforce per-provider request rate limits:
pack:
  name: google-ai-studio-providers-13
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: gemini-pro
      provider: google-ai-studio
      model: gemini-2.5-pro
      secret_key_ref:
        env: GOOGLE_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
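Rate limiting is commonly implemented as a token bucket: requests spend tokens, tokens refill at a fixed rate, and bursts up to the bucket capacity are allowed. A conceptual sketch, not Keeptrusts' implementation:

```python
import time

class TokenBucket:
    """Toy token-bucket rate limiter (requests per second with burst capacity)."""

    def __init__(self, rate_per_second: float, burst: int):
        self.rate = rate_per_second
        self.capacity = float(burst)
        self.tokens = float(burst)   # start full: allow an initial burst
        self.updated = time.monotonic()

    def try_acquire(self) -> bool:
        now = time.monotonic()
        # refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # caller should reject or queue the request
```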
Token Cost Tracking
Declare pricing for cost dashboards and budget alerts:
pack:
  name: google-ai-studio-providers-14
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: gemini-pro
      provider: google-ai-studio
      model: gemini-2.5-pro
      pricing:
        prompt: 1.25       # USD per 1M prompt tokens (example rate)
        completion: 10.00  # USD per 1M completion tokens (example rate)
      secret_key_ref:
        env: GOOGLE_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
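The arithmetic behind the pricing object is per-1M-token pricing applied to each request's usage. A sketch with illustrative (not official) rates:

```python
def request_cost_usd(prompt_tokens: int, completion_tokens: int,
                     prompt_price: float, completion_price: float) -> float:
    """Cost of one request given per-1M-token USD prices.

    Mirrors the shape of the pricing field (prompt, completion).
    """
    return (prompt_tokens * prompt_price
            + completion_tokens * completion_price) / 1_000_000

# 12,000 prompt tokens at $1.25/1M plus 800 completion tokens at $10.00/1M
cost = request_cost_usd(12_000, 800, prompt_price=1.25, completion_price=10.00)
print(f"${cost:.4f}")  # $0.0230
```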
Google AI Studio vs Vertex AI
| Feature | AI Studio | Vertex AI |
|---|---|---|
| Auth | API key | OAuth2 / Service Account |
| Data residency | Google-managed | GCP project region |
| Enterprise features | Limited | Full (VPC-SC, CMEK, etc.) |
| Pricing | Free tier available | Pay-per-use |
| Rate limits | Lower | Higher (adjustable) |
Use Google AI Studio for development and prototyping. Use Vertex AI for production deployments that require data residency, VPC controls, or higher rate limits.
Best Practices
- Format translation is automatic — use OpenAI SDKs against Gemini endpoints without code changes; only the config target changes.
- Use `stream_timeout_seconds` for streaming — Gemini thinking models (2.5 Pro, 2.5 Flash) can take significantly longer than non-thinking models.
- Set `max_context_tokens` below the actual model limit to leave headroom for the response. Gemini 2.5 Pro supports 1M tokens, so set `max_context_tokens` to ~900000.
- Enable health probes on production targets so routing strategies can react to Google API outages.
- Use `data_policy` to document and enforce your organization's data handling requirements — especially important for enterprise Gemini usage.
- Prefer the `fallback` strategy for critical workloads; use `latency` or `weighted_round_robin` for cost/performance optimization.
- Separate API keys per environment — use distinct `secret_key_ref` values for dev, staging, and production.
- Declare `pricing` even if approximate — it enables cost dashboards and per-request budget enforcement.
- Consider Vertex AI for enterprise deployments that require OAuth2/service account auth, VPC-SC, or CMEK encryption — see the Google Vertex AI guide.
For AI systems
- Canonical terms: Keeptrusts gateway, Google AI Studio, Gemini, Gemini API, provider target, policy-config.yaml, `provider: "google-ai-studio"`, `GOOGLE_API_KEY`.
- Config field names: `provider`, `model`, `base_url`, `secret_key_ref.env: "GOOGLE_API_KEY"`, `format`, `provider_type: "google-ai-studio"`, `pricing`.
- Provider shorthand: `google-ai-studio:chat:<model>` (e.g., `google-ai-studio:chat:gemini-2.0-flash`).
- Key behavior: Keeptrusts translates between OpenAI format and Google's Gemini API, handling API-key auth.
- Best next pages: Google Vertex AI integration (enterprise tier), OpenAI integration, Policy configuration.
For engineers
- Prerequisites: Google AI Studio API key (`GOOGLE_API_KEY` from aistudio.google.com), `kt` CLI installed.
- Start command: `kt gateway run --listen 0.0.0.0:8080 --policy-config policy-config.yaml`.
- Validate: `curl http://localhost:8080/v1/chat/completions -H 'Content-Type: application/json' -d '{"model":"gemini-2.0-flash","messages":[{"role":"user","content":"hello"}]}'`.
- Google AI Studio uses API-key auth (simpler than Vertex AI's OAuth2/service account flow).
- For enterprise features (VPC-SC, CMEK, service accounts), use Google Vertex AI instead.
- Declare `pricing` fields for cost dashboard accuracy even if approximate.
For leaders
- Google AI Studio is the consumer/developer tier — faster to set up but lacks enterprise controls (VPC-SC, CMEK, IAM) available in Vertex AI.
- Suitable for prototyping, development, and non-regulated workloads where API key auth is acceptable.
- Gemini models offer competitive pricing and multimodal capabilities (text, image, video, audio).
- For regulated or production workloads, evaluate Google Vertex AI for its enterprise security controls.
Next steps
- Google Vertex AI integration — enterprise GCP deployment with OAuth2 and VPC-SC
- OpenAI integration — compare with GPT-4o
- Anthropic integration — compare with Claude models
- Policy configuration — prompt-injection, PII, and safety policy reference
- Quickstart — install `kt` and run your first gateway