Google AI Studio (Gemini)
Keeptrusts proxies Google AI Studio's Gemini API with full policy enforcement, audit logging, and automatic format translation. Clients can send requests in standard OpenAI format — Keeptrusts translates them to Google's native Gemini wire format on the fly and translates responses back. Direct Gemini-format requests are also supported natively. Both the chat completions and embeddings endpoints are proxied.
Use this page when
- You need the exact command, config, API, or integration details for Google AI Studio (Gemini).
- You are wiring automation or AI retrieval and need canonical names, examples, and constraints.
- You want a guided rollout instead of a reference page — in that case, use the linked workflow pages in Next steps.
Primary audience
- Primary: AI Agents, Technical Engineers
- Secondary: Technical Leaders
Prerequisites
- Google AI Studio API key — obtain one from Google AI Studio.
- Keeptrusts CLI — install `kt` (see the quickstart guide).
- Export your API key:

export GOOGLE_API_KEY="AIzaSy..."
Keeptrusts auto-detects GOOGLE_API_KEY when provider is set to "google-ai-studio". The correct query-parameter auth and Gemini base URL are applied automatically.
Configuration
Create a policy-config.yaml with your provider targets:
pack:
  name: gemini-gateway
  version: 1.0.0
  enabled: true

policies:
  chain:
    - prompt-injection
    - pii-detector
    - safety-filter
    - audit-logger
  policy:
    prompt-injection:
      threshold: 0.8
      action: block
    pii-detector:
      action: redact
    safety-filter:
      mode: strict
      action: block
    audit-logger:
      retention_days: 365

providers:
  strategy: single
  targets:
    - id: gemini-25-pro
      provider: google-ai-studio
      model: gemini-2.5-pro
      base_url: https://generativelanguage.googleapis.com
      secret_key_ref:
        env: GOOGLE_API_KEY
Start the gateway:
kt gateway run \
  --listen 0.0.0.0:8080 \
  --policy-config policy-config.yaml
Provider Fields
All fields available on a providers.targets[] entry for Google AI Studio:
| Field | Type | Default | Description |
|---|---|---|---|
| id | string | required | Unique identifier for this target |
| provider | string | required | Provider ID: "google-ai-studio" or "google-ai-studio:chat:gemini-2.5-pro" |
| model | string | required | Model name, e.g. "gemini-2.5-pro", "gemini-2.5-flash" |
| base_url | string | https://generativelanguage.googleapis.com | API base URL (auto-detected for google-ai-studio) |
| secret_key_ref | object | GOOGLE_API_KEY | Object reference to the environment variable holding the API key |
| timeout_seconds | integer | 60 | Maximum time for non-streaming requests |
| stream_timeout_seconds | integer | none | Maximum time for streaming requests; falls back to timeout_seconds |
| max_context_tokens | integer | none | Maximum tokens in the context window (used for context compression) |
| headers | map | {} | Additional HTTP headers sent with each request |
| format | string | "google-gemini" | Wire format: "google-gemini" (auto-translates to/from OpenAI) |
| provider_type | string | "google-ai-studio" | Explicit provider type; overrides URL heuristic detection |
| description | string | none | Human-readable description for dashboards and logs |
| weight | float | 1.0 | Routing weight for weighted_round_robin strategy |
| data_policy | object | none | Data handling policy (zero_data_retention, training_opt_out, retention_days) |
| pricing | object | none | Token pricing in USD per 1M tokens (prompt, completion) |
| health_probe | object | none | Active health probe configuration |
Authentication
Google AI Studio uses API-key authentication passed as a query parameter. Keeptrusts auto-detects this when provider is "google-ai-studio":
# These are the defaults — you only need to set secret_key_ref
secret_key_ref:
  env: "GOOGLE_API_KEY"
Unlike most providers, Google AI Studio does not use Authorization: Bearer headers. The key is appended as ?key=<value> in the request URL. Keeptrusts handles this automatically — you never need to construct the query parameter yourself.

For Google Cloud Vertex AI (which uses OAuth2/service account authentication instead of API keys), see the Google Vertex AI integration guide.
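For illustration only (Keeptrusts builds this URL for you), the sketch below shows the query-parameter scheme against the public Gemini generateContent endpoint; the helper name `gemini_request_url` is hypothetical, not part of the Keeptrusts API:

```python
from urllib.parse import urlencode

def gemini_request_url(base_url: str, model: str, api_key: str) -> str:
    """Build an upstream Gemini generateContent URL with ?key= auth.

    Illustrative sketch: Keeptrusts constructs this internally.
    """
    path = f"/v1beta/models/{model}:generateContent"
    return f"{base_url}{path}?{urlencode({'key': api_key})}"

url = gemini_request_url(
    "https://generativelanguage.googleapis.com", "gemini-2.5-pro", "AIzaSy..."
)
print(url)
```

The key travels in the URL, so treat gateway access logs upstream of Keeptrusts as sensitive.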
Supported Models
| Model | Context Window | Notes |
|---|---|---|
| gemini-2.5-pro | 1M | Most capable reasoning model, hybrid thinking |
| gemini-2.5-flash | 1M | Fast, cost-effective with thinking capabilities |
| gemini-2.0-flash | 1M | Previous-gen fast model |
| gemini-2.0-flash-lite | 1M | Lightweight, lowest cost |
| gemini-1.5-pro | 2M | Legacy, largest context window |
| gemini-1.5-flash | 1M | Legacy fast model |
| gemini-1.5-flash-8b | 1M | Legacy, smallest model |
Any model available on the Google AI Studio API can be used — set the model field to the model ID string. Keeptrusts passes the model identifier through to the upstream without validation.
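Because the gateway does not validate model IDs, a typo surfaces only as an upstream error. You can check an ID yourself against Google's public ListModels endpoint; the helper name below is a hypothetical sketch, not part of Keeptrusts:

```python
from urllib.parse import urlencode

def list_models_url(api_key: str) -> str:
    """URL for Google AI Studio's public ListModels endpoint (query-param auth)."""
    return ("https://generativelanguage.googleapis.com/v1beta/models?"
            + urlencode({"key": api_key}))

url = list_models_url("AIzaSy...")
# To actually fetch (needs a valid key and network access):
#   import json, urllib.request
#   with urllib.request.urlopen(url) as resp:
#       names = [m["name"] for m in json.load(resp)["models"]]
print(url)
```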
Client Examples
Once the gateway is running, point your client to http://localhost:8080 instead of https://generativelanguage.googleapis.com. Clients send requests in OpenAI format — Keeptrusts translates to Gemini wire format automatically.
- Python
- Node.js
- cURL
from openai import OpenAI

# Use the OpenAI SDK — Keeptrusts translates to Gemini format automatically
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="unused",  # auth is handled by Keeptrusts via GOOGLE_API_KEY
)

response = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain the theory of relativity simply."},
    ],
    temperature=0.7,
    max_tokens=512,
)
print(response.choices[0].message.content)
import OpenAI from "openai";

// OpenAI SDK works — Keeptrusts handles format translation
const client = new OpenAI({
  baseURL: "http://localhost:8080/v1",
  apiKey: "unused", // auth handled by Keeptrusts via GOOGLE_API_KEY
});

const response = await client.chat.completions.create({
  model: "gemini-2.5-pro",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Explain the theory of relativity simply." },
  ],
  temperature: 0.7,
  max_tokens: 512,
});
console.log(response.choices[0].message.content);
# OpenAI-compatible format — Keeptrusts translates to Gemini wire format
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.5-pro",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain the theory of relativity simply."}
    ],
    "temperature": 0.7,
    "max_tokens": 512
  }'
Streaming
Keeptrusts fully supports Gemini's streaming mode. Set stream: true in your request — the gateway applies policies to each chunk in real time and translates streaming events between Gemini SSE and OpenAI SSE formats.
Configure a separate streaming timeout to accommodate long-running Gemini generations (especially with thinking models):
pack:
  name: google-ai-studio-providers-3
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: gemini-streaming
      provider: google-ai-studio
      model: gemini-2.5-pro
      stream_timeout_seconds: 600

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
- Python
- cURL
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")

stream = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[{"role": "user", "content": "Write a short story about AI."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "gemini-2.5-flash",
    "messages": [{"role": "user", "content": "Write a short story about AI."}],
    "stream": true
  }'
Advanced Configuration
Multi-Model Fallback
Automatically fail over from Gemini 2.5 Pro to Flash when the primary is unavailable:
pack:
  name: google-ai-studio-providers-4
  version: 1.0.0
  enabled: true

providers:
  strategy: fallback
  targets:
    - id: gemini-pro-primary
      provider: google-ai-studio
      model: gemini-2.5-pro
      secret_key_ref:
        env: GOOGLE_API_KEY
    - id: gemini-flash-fallback
      provider: google-ai-studio
      model: gemini-2.5-flash
      secret_key_ref:
        env: GOOGLE_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
Cross-Provider Fallback
Use Gemini as primary with OpenAI as fallback — format translation is handled automatically for both:
pack:
  name: google-ai-studio-providers-5
  version: 1.0.0
  enabled: true

providers:
  strategy: fallback
  targets:
    - id: gemini-primary
      provider: google-ai-studio
      model: gemini-2.5-pro
      secret_key_ref:
        env: GOOGLE_API_KEY
    - id: openai-fallback
      provider: openai
      model: gpt-4o
      secret_key_ref:
        env: OPENAI_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
Format Translation
Keeptrusts automatically translates between OpenAI and Gemini wire formats. Set format: "google-gemini" on the target — clients send standard OpenAI /v1/chat/completions requests and receive OpenAI-shaped responses:
pack:
  name: google-ai-studio-providers-6
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: gemini-translated
      provider: google-ai-studio
      model: gemini-2.5-pro
      format: google-gemini
      secret_key_ref:
        env: GOOGLE_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
This means you can swap between OpenAI, Anthropic, and Gemini providers without changing your client code — only the config target changes.
| OpenAI Concept | Gemini Equivalent |
|---|---|
| messages | contents with parts |
| system message | systemInstruction |
| tools | tools with functionDeclarations |
| max_tokens | maxOutputTokens |
| temperature | temperature |
| choices[0].message | candidates[0].content |
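The request-side half of the mapping table can be sketched in a few lines. This is an illustrative reimplementation of the translation Keeptrusts performs internally, not its actual code; the function name is hypothetical:

```python
def openai_to_gemini(messages, max_tokens=None):
    """Sketch of the OpenAI→Gemini request mapping shown above."""
    system_parts, contents = [], []
    for msg in messages:
        if msg["role"] == "system":
            # system messages become systemInstruction parts
            system_parts.append({"text": msg["content"]})
        else:
            # OpenAI "assistant" maps to Gemini's "model" role
            role = "model" if msg["role"] == "assistant" else "user"
            contents.append({"role": role, "parts": [{"text": msg["content"]}]})
    body = {"contents": contents}
    if system_parts:
        body["systemInstruction"] = {"parts": system_parts}
    if max_tokens is not None:
        body["generationConfig"] = {"maxOutputTokens": max_tokens}
    return body

body = openai_to_gemini(
    [{"role": "system", "content": "Be brief."},
     {"role": "user", "content": "Hi"}],
    max_tokens=512,
)
```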
Latency-Based Routing
Route each request to the provider target with the lowest observed latency:
pack:
  name: google-ai-studio-providers-7
  version: 1.0.0
  enabled: true

providers:
  strategy: latency
  targets:
    - id: gemini-pro
      provider: google-ai-studio
      model: gemini-2.5-pro
      secret_key_ref:
        env: GOOGLE_API_KEY
    - id: gemini-flash
      provider: google-ai-studio
      model: gemini-2.5-flash
      secret_key_ref:
        env: GOOGLE_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
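Conceptually, latency-based routing reduces to picking the target with the lowest observed latency. A toy sketch (not Keeptrusts' implementation, which may use smoothed or windowed statistics):

```python
from statistics import fmean

def pick_lowest_latency(latencies: dict) -> str:
    """Pick the target id with the lowest mean observed latency (seconds)."""
    return min(latencies, key=lambda tid: fmean(latencies[tid]))

observed = {
    "gemini-pro": [1.9, 2.3, 2.1],    # slower, more capable
    "gemini-flash": [0.6, 0.8, 0.7],  # faster
}
print(pick_lowest_latency(observed))  # gemini-flash
```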
Circuit Breaker
Temporarily remove unhealthy targets from the rotation:
pack:
  name: google-ai-studio-providers-8
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: gemini-main
      provider: google-ai-studio
      model: gemini-2.5-pro
      secret_key_ref:
        env: GOOGLE_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
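The circuit-breaker pattern itself is simple: after a run of consecutive failures the target is taken out of rotation, and after a cooldown one probe request is allowed through. A conceptual sketch with assumed threshold and cooldown values, not Keeptrusts' internals:

```python
import time

class CircuitBreaker:
    """Toy circuit breaker: open after N consecutive failures, probe after cooldown."""

    def __init__(self, failure_threshold=5, cooldown_seconds=30.0):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow_request(self) -> bool:
        if self.opened_at is None:
            return True
        if time.monotonic() - self.opened_at >= self.cooldown_seconds:
            # half-open: let one request probe the target again
            self.opened_at = None
            self.failures = 0
            return True
        return False

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = time.monotonic()
```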
Retry Policy
Retry transient failures automatically:
pack:
  name: google-ai-studio-providers-9
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: gemini-pro
      provider: google-ai-studio
      model: gemini-2.5-pro
      secret_key_ref:
        env: GOOGLE_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
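The usual retry shape is exponential backoff with jitter, retrying only transient errors. A conceptual sketch (parameter names and defaults are assumptions for illustration, not Keeptrusts config keys):

```python
import random
import time

def retry_with_backoff(call, max_attempts=3, base_delay=0.5,
                       retriable=(TimeoutError, ConnectionError)):
    """Retry transient failures with exponential backoff plus jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retriable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error
            # 0.5s, 1s, 2s, ... scaled by random jitter in [1, 2)
            delay = base_delay * (2 ** (attempt - 1)) * (1 + random.random())
            time.sleep(delay)
```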
Context Compression
Automatically truncate conversation history to fit within the model's context window:
pack:
  name: google-ai-studio-providers-10
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: gemini-pro
      provider: google-ai-studio
      model: gemini-2.5-pro
      max_context_tokens: 900000
      secret_key_ref:
        env: GOOGLE_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
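In essence, compression drops the oldest non-system turns until the estimated token count fits the budget. A conceptual sketch; the chars/4 token estimate is a crude stand-in, not Keeptrusts' tokenizer:

```python
def truncate_history(messages, max_context_tokens,
                     count_tokens=lambda m: len(m["content"]) // 4):
    """Drop oldest non-system messages until the estimated total fits.

    Illustrative sketch of context compression, not Keeptrusts' algorithm.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    def total(msgs):
        return sum(count_tokens(m) for m in msgs)

    while rest and total(system + rest) > max_context_tokens:
        rest.pop(0)  # oldest conversational turn goes first
    return system + rest
```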
Zero Data Retention
Enforce that no prompt or completion data is stored by the provider:
pack:
  name: google-ai-studio-providers-11
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: gemini-zdr
      provider: google-ai-studio
      model: gemini-2.5-pro
      data_policy:
        zero_data_retention: true
        training_opt_out: true
      secret_key_ref:
        env: GOOGLE_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
A/B Testing Between Models
Split traffic across models with weighted routing:
pack:
  name: google-ai-studio-providers-12
  version: 1.0.0
  enabled: true

providers:
  strategy: weighted_round_robin
  targets:
    - id: variant-pro
      provider: google-ai-studio
      model: gemini-2.5-pro
      weight: 0.8
      secret_key_ref:
        env: GOOGLE_API_KEY
    - id: variant-flash
      provider: google-ai-studio
      model: gemini-2.5-flash
      weight: 0.2
      secret_key_ref:
        env: GOOGLE_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
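Weighted splitting amounts to sampling a target in proportion to its weight. A toy sketch of the idea (Keeptrusts' weighted_round_robin may be deterministic rather than random; the function below is hypothetical):

```python
import random

def pick_weighted(targets: dict, rng=random) -> str:
    """Pick a target id with probability proportional to its weight."""
    ids = list(targets)
    return rng.choices(ids, weights=[targets[t] for t in ids], k=1)[0]

split = {"variant-pro": 0.8, "variant-flash": 0.2}
picks = [pick_weighted(split) for _ in range(1000)]
# Roughly 80% of picks land on variant-pro over many requests.
```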
Rate Limiting
Enforce per-provider request rate limits:
pack:
  name: google-ai-studio-providers-13
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: gemini-pro
      provider: google-ai-studio
      model: gemini-2.5-pro
      secret_key_ref:
        env: GOOGLE_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
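Rate limiting is commonly implemented as a token bucket: requests spend tokens, tokens refill at a fixed rate, and bursts up to the bucket capacity are allowed. A conceptual sketch, not Keeptrusts' implementation:

```python
import time

class TokenBucket:
    """Toy token-bucket rate limiter (requests per second with burst capacity)."""

    def __init__(self, rate_per_second: float, burst: int):
        self.rate = rate_per_second
        self.capacity = float(burst)
        self.tokens = float(burst)   # start full: allow an initial burst
        self.updated = time.monotonic()

    def try_acquire(self) -> bool:
        now = time.monotonic()
        # refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # caller should reject or queue the request
```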
Token Cost Tracking
Declare pricing for cost dashboards and budget alerts:
pack:
  name: google-ai-studio-providers-14
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: gemini-pro
      provider: google-ai-studio
      model: gemini-2.5-pro
      pricing:
        prompt: 1.25       # USD per 1M prompt tokens (example rate)
        completion: 10.00  # USD per 1M completion tokens (example rate)
      secret_key_ref:
        env: GOOGLE_API_KEY

policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
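The arithmetic behind the pricing object is per-1M-token pricing applied to each request's usage. A sketch with illustrative (not official) rates:

```python
def request_cost_usd(prompt_tokens: int, completion_tokens: int,
                     prompt_price: float, completion_price: float) -> float:
    """Cost of one request given per-1M-token USD prices.

    Mirrors the shape of the pricing field (prompt, completion).
    """
    return (prompt_tokens * prompt_price
            + completion_tokens * completion_price) / 1_000_000

# 12,000 prompt tokens at $1.25/1M plus 800 completion tokens at $10.00/1M
cost = request_cost_usd(12_000, 800, prompt_price=1.25, completion_price=10.00)
print(f"${cost:.4f}")  # $0.0230
```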
Google AI Studio vs Vertex AI
| Feature | AI Studio | Vertex AI |
|---|---|---|
| Auth | API key | OAuth2 / Service Account |
| Data residency | Google-managed | GCP project region |
| Enterprise features | Limited | Full (VPC-SC, CMEK, etc.) |
| Pricing | Free tier available | Pay-per-use |
| Rate limits | Lower | Higher (adjustable) |
Use Google AI Studio for development and prototyping. Use Vertex AI for production deployments that require data residency, VPC controls, or higher rate limits.
Best Practices
- Format translation is automatic — use OpenAI SDKs against Gemini endpoints without code changes; only the config target changes.
- Use `stream_timeout_seconds` for streaming — Gemini thinking models (2.5 Pro, 2.5 Flash) can take significantly longer than non-thinking models.
- Set `max_context_tokens` below the actual model limit to leave headroom for the response. Gemini 2.5 Pro supports 1M tokens, so set `max_context_tokens` to ~900000.
- Enable health probes on production targets so routing strategies can react to Google API outages.
- Use `data_policy` to document and enforce your organization's data handling requirements — especially important for enterprise Gemini usage.
- Prefer the `fallback` strategy for critical workloads; use `latency` or `weighted_round_robin` for cost/performance optimization.
- Separate API keys per environment — use distinct `secret_key_ref` values for dev, staging, and production.
- Declare `pricing` even if approximate — it enables cost dashboards and per-request budget enforcement.
- Consider Vertex AI for enterprise deployments that require OAuth2/service account auth, VPC-SC, or CMEK encryption — see the Google Vertex AI guide.
For AI systems
- Canonical terms: Keeptrusts gateway, Google AI Studio, Gemini, Gemini API, provider target, policy-config.yaml, `provider: "google-ai-studio"`, `GOOGLE_API_KEY`.
- Config field names: `provider`, `model`, `base_url`, `secret_key_ref.env: "GOOGLE_API_KEY"`, `format`, `provider_type: "google-ai-studio"`, `pricing`.
- Provider shorthand: `google-ai-studio:chat:<model>` (e.g., `google-ai-studio:chat:gemini-2.0-flash`).
- Key behavior: Keeptrusts translates between OpenAI format and Google's Gemini API, handling API-key auth.
- Best next pages: Google Vertex AI integration (enterprise tier), OpenAI integration, Policy configuration.
For engineers
- Prerequisites: Google AI Studio API key (`GOOGLE_API_KEY` from aistudio.google.com), `kt` CLI installed.
- Start command: `kt gateway run --listen 0.0.0.0:8080 --policy-config policy-config.yaml`.
- Validate: `curl http://localhost:8080/v1/chat/completions -H 'Content-Type: application/json' -d '{"model":"gemini-2.0-flash","messages":[{"role":"user","content":"hello"}]}'`.
- Google AI Studio uses API-key auth (simpler than Vertex AI's OAuth2/service account flow).
- For enterprise features (VPC-SC, CMEK, service accounts), use Google Vertex AI instead.
- Declare `pricing` fields for cost dashboard accuracy even if approximate.
For leaders
- Google AI Studio is the consumer/developer tier — faster to set up but lacks enterprise controls (VPC-SC, CMEK, IAM) available in Vertex AI.
- Suitable for prototyping, development, and non-regulated workloads where API key auth is acceptable.
- Gemini models offer competitive pricing and multimodal capabilities (text, image, video, audio).
- For regulated or production workloads, evaluate Google Vertex AI for its enterprise security controls.
Next steps
- Google Vertex AI integration — enterprise GCP deployment with OAuth2 and VPC-SC
- OpenAI integration — compare with GPT-4o
- Anthropic integration — compare with Claude models
- Policy configuration — prompt-injection, PII, and safety policy reference
- Quickstart — install `kt` and run your first gateway