Voyage AI
Voyage AI produces state-of-the-art embedding models purpose-built for retrieval-augmented generation, semantic search, and domain-specific ranking. The Keeptrusts gateway fronts the Voyage embeddings endpoint through its OpenAI-compatible surface, applying PII redaction, content filtering, and audit logging to every embedding request before the text leaves your environment. Voyage AI is an embedding-only provider — it does not expose a chat completions endpoint.
Use this page when
- You need the exact command, config, API, or integration details for Voyage AI.
- You are wiring automation or AI retrieval and need canonical names, examples, and constraints.
- If you want a guided rollout instead of a reference page, use the linked workflow pages in Next steps.
Primary audience
- Primary: AI Agents, Technical Engineers
- Secondary: Technical Leaders
Prerequisites
- A Voyage AI API key (`pa-...` format)
- Keeptrusts CLI (`kt`) installed and on your `PATH`
- `VOYAGE_API_KEY` exported in your shell or injected via your secrets manager
Configuration
Minimal configuration
pack:
name: voyage-providers-1
version: 1.0.0
enabled: true
providers:
targets:
- id: voyage-embed
provider: voyage:embedding:voyage-3-large
secret_key_ref:
env: VOYAGE_API_KEY
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
Full named configuration with policy chain
pack:
name: voyage-rag-pipeline
version: 1.0.0
enabled: true
policies:
chain:
- pii-detector
- dlp-filter
- audit-logger
policy:
pii-detector:
action: redact
entities:
- EMAIL
- PHONE
- SSN
- CREDIT_CARD
- ADDRESS
dlp-filter:
patterns:
- name: internal-ticket-ids
regex: TICKET-[0-9]{6}
action: redact
- name: api-keys
regex: "(sk-|pa-)[A-Za-z0-9]{32,}"
action: block
audit-logger:
retention_days: 365
include_embeddings: false
providers:
targets:
- id: voyage-embed
provider: voyage:embedding:voyage-3-large
base_url: https://api.voyageai.com/v1
secret_key_ref:
env: VOYAGE_API_KEY
Code embeddings
pack:
name: voyage-providers-3
version: 1.0.0
enabled: true
providers:
targets:
- id: voyage-code
provider: voyage:embedding:voyage-code-3
secret_key_ref:
env: VOYAGE_API_KEY
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
Finance-domain embeddings
pack:
name: voyage-providers-4
version: 1.0.0
enabled: true
providers:
targets:
- id: voyage-finance
provider: voyage:embedding:voyage-finance-2
secret_key_ref:
env: VOYAGE_API_KEY
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
Legal/RAG embeddings
pack:
name: voyage-providers-5
version: 1.0.0
enabled: true
providers:
targets:
- id: voyage-legal
provider: voyage:embedding:voyage-law-2
secret_key_ref:
env: VOYAGE_API_KEY
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
Start the gateway
export VOYAGE_API_KEY="pa-your-api-key"
kt gateway run --listen 0.0.0.0:8080 --policy-config policy-config.yaml
Provider Fields
| Field | Required | Default | Description |
|---|---|---|---|
| `provider` | Yes | — | Provider identifier. Use fully-qualified `voyage:embedding:<model>` or short `voyage`. |
| `secret_key_ref` | Yes | `VOYAGE_API_KEY` | Name of the env var holding the Voyage API key. Auto-detected if set to the default name. |
| `base_url` | No | `https://api.voyageai.com/v1` | Override the Voyage API base URL. |
| `format` | No | `openai` | Wire format. Voyage endpoints are OpenAI-compatible for embeddings; this value should remain `openai`. |
| `options.input_type` | No | null | Voyage-specific input type hint. Use `search_document` when indexing corpus documents and `search_query` when embedding user queries. Omit for classification or clustering use cases. |
| `options.truncation` | No | `true` | Automatically truncate inputs that exceed the model's context limit. Set `false` to receive an error instead of silent truncation. |
| `options.output_dimension` | No | Model default | For `voyage-3-large`, you can reduce output dimensions to 1024 or 256 for storage savings. |
| `options.output_dtype` | No | `float` | Set to `int8` or `ubinary` for cost-effective quantised vector storage. |
Supported Models
| Model | Dimensions | Context Window | Input (per 1M tokens) | Notes |
|---|---|---|---|---|
| `voyage-3-large` | 1024 or 2048 | 32k | $0.18 | Highest retrieval quality; supports Matryoshka (variable dimensions) |
| `voyage-3` | 1024 | 32k | $0.06 | Balanced quality and cost; good default for general RAG |
| `voyage-3-lite` | 512 | 32k | $0.02 | Fastest and cheapest; suited for high-throughput classification |
| `voyage-code-3` | 1024 | 32k | $0.18 | Optimised for code search and retrieval across polyglot repositories |
| `voyage-finance-2` | 1024 | 32k | $0.12 | Domain-tuned for financial documents, earnings reports, and SEC filings |
| `voyage-law-2` | 1024 | 16k | $0.12 | Optimised for legal contracts, case law, and long-form regulatory text |
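As a rough illustration of the per-token pricing above, the helper below estimates corpus embedding cost. The model names and prices are taken from the table; treat the figures as list-price assumptions at the time of writing, not live pricing.

```python
# Rough cost estimator built from the pricing table above
# (USD per 1M input tokens; assumed prices, check current pricing).
PRICE_PER_M_TOKENS = {
    "voyage-3-large": 0.18,
    "voyage-3": 0.06,
    "voyage-3-lite": 0.02,
    "voyage-code-3": 0.18,
    "voyage-finance-2": 0.12,
    "voyage-law-2": 0.12,
}

def embedding_cost(model: str, tokens: int) -> float:
    """Estimated USD cost of embedding `tokens` input tokens with `model`."""
    return PRICE_PER_M_TOKENS[model] * tokens / 1_000_000

# Embedding a 10M-token corpus with the balanced default model:
print(f"${embedding_cost('voyage-3', 10_000_000):.2f}")  # → $0.60
```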
Voyage AI does not expose a chat completions endpoint. Sending a /v1/chat/completions request through a Voyage-targeted gateway will return a 400 error. Use a separate provider target for chat and a Voyage target for embeddings.
Client Examples
- Python
- Node.js
- cURL
from openai import OpenAI
client = OpenAI(
base_url="http://localhost:8080/v1",
api_key="unused", # auth is handled by the gateway
)
# Index documents (use search_document input_type in provider config)
documents = [
"The EU AI Act establishes risk categories for AI systems.",
"Prohibited AI practices include social scoring by public authorities.",
"High-risk AI systems must undergo conformity assessment.",
]
doc_response = client.embeddings.create(
model="voyage-3-large",
input=documents,
)
doc_vectors = [item.embedding for item in doc_response.data]
print(f"Indexed {len(doc_vectors)} documents, {len(doc_vectors[0])} dims each")
# Query (swap provider config to search_query input_type for production)
query_response = client.embeddings.create(
model="voyage-3-large",
input=["What AI practices are prohibited?"],
)
query_vector = query_response.data[0].embedding
print(f"Query vector: {len(query_vector)} dims")
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "http://localhost:8080/v1",
apiKey: "unused",
});
// Index documents
const documents = [
"The EU AI Act establishes risk categories for AI systems.",
"Prohibited AI practices include social scoring by public authorities.",
"High-risk AI systems must undergo conformity assessment.",
];
const docResponse = await client.embeddings.create({
model: "voyage-3-large",
input: documents,
});
const docVectors = docResponse.data.map((item) => item.embedding);
console.log(`Indexed ${docVectors.length} documents, ${docVectors[0].length} dims each`);
// Query
const queryResponse = await client.embeddings.create({
model: "voyage-3-large",
input: ["What AI practices are prohibited?"],
});
const queryVector = queryResponse.data[0].embedding;
console.log(`Query vector: ${queryVector.length} dims`);
# Embed documents
curl http://localhost:8080/v1/embeddings \
-H "Content-Type: application/json" \
-d '{
"model": "voyage-3-large",
"input": [
"The EU AI Act establishes risk categories for AI systems.",
"Prohibited AI practices include social scoring by public authorities."
]
}'
# Embed a query
curl http://localhost:8080/v1/embeddings \
-H "Content-Type: application/json" \
-d '{
"model": "voyage-3-large",
"input": ["What AI practices are prohibited?"]
}'
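Once the gateway returns document and query vectors, retrieval is a nearest-neighbour search. The sketch below ranks documents by cosine similarity using the standard library only; the three-dimensional vectors are toy stand-ins for real gateway responses.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors standing in for real embedding responses.
doc_vectors = [
    [0.9, 0.1, 0.0],   # doc 0
    [0.1, 0.9, 0.0],   # doc 1
    [0.0, 0.2, 0.98],  # doc 2
]
query_vector = [0.85, 0.15, 0.05]

# Rank document indices by similarity to the query, best match first.
ranked = sorted(
    range(len(doc_vectors)),
    key=lambda i: cosine_similarity(query_vector, doc_vectors[i]),
    reverse=True,
)
print(ranked)  # → [0, 1, 2]
```

In production you would store the document vectors in a vector database and let it perform this search at scale; the ranking principle is the same.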
Streaming
Voyage AI embedding endpoints do not support streaming — embeddings are returned in a single response. Keeptrusts returns the full vector array synchronously after all policy checks pass. No special configuration is required.
Advanced Configuration
Variable-dimension embeddings (Matryoshka)
voyage-3-large supports Matryoshka representation learning, allowing you to reduce output dimensions without re-embedding your corpus. Use output_dimension to trade recall for storage and query cost:
pack:
name: voyage-providers-6
version: 1.0.0
enabled: true
providers:
targets:
    - id: voyage-compact
      provider: voyage:embedding:voyage-3-large
      secret_key_ref:
        env: VOYAGE_API_KEY
      options:
        output_dimension: 1024
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
For maximum storage efficiency, combine reduced dimensions with int8 quantisation:
options:
output_dimension: 256
output_dtype: int8 # 4× storage reduction vs float32 per dimension
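To see why this combination pays off, here is a back-of-envelope storage calculation, assuming 4 bytes per float32 dimension, 1 byte per int8 dimension, and 1 bit per ubinary dimension:

```python
# Back-of-envelope vector storage sizes. Assumptions: float32 = 4 bytes,
# int8 = 1 byte, ubinary = 1 bit (0.125 bytes) per dimension.
DTYPE_BYTES = {"float": 4, "int8": 1, "ubinary": 0.125}

def vector_bytes(dimensions: int, dtype: str = "float") -> float:
    """Approximate storage per vector in bytes."""
    return dimensions * DTYPE_BYTES[dtype]

full = vector_bytes(1024, "float")   # 4096 bytes per vector
compact = vector_bytes(256, "int8")  # 256 bytes per vector
print(f"{full / compact:.0f}x smaller")  # → 16x smaller
```

A 256-dimension int8 vector is 16× smaller than a 1024-dimension float32 vector, at some cost in retrieval recall.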
Separate index and query targets
Voyage models perform best when you differentiate document-indexing requests (use input_type: search_document) from query-time requests (input_type: search_query). Use two named targets and route by context:
pack:
name: voyage-providers-8
version: 1.0.0
enabled: true
providers:
targets:
    - id: voyage-indexer
      provider: voyage:embedding:voyage-3
      secret_key_ref:
        env: VOYAGE_API_KEY
      options:
        input_type: search_document
    - id: voyage-querier
      provider: voyage:embedding:voyage-3
      secret_key_ref:
        env: VOYAGE_API_KEY
      options:
        input_type: search_query
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
PII-safe embedding pipeline
Apply pii-detector with redact action before any text reaches Voyage. This ensures personal data is stripped from vectors before they are stored in your vector database:
policies:
chain:
- pii-detector
- audit-logger
policy:
pii-detector:
action: redact
entities:
- EMAIL
- PHONE
- SSN
- ADDRESS
- DATE_OF_BIRTH
audit-logger:
retention_days: 365
include_embeddings: false
providers:
targets:
- id: voyage-safe
provider: voyage:embedding:voyage-3-large
secret_key_ref:
env: VOYAGE_API_KEY
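For intuition, the snippet below sketches what a redact-action policy does conceptually before text reaches Voyage. The patterns are illustrative toys, not the gateway's actual `pii-detector` implementation.

```python
import re

# Toy redaction sketch: illustrative patterns only, not the gateway's
# actual PII detector.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace each detected entity with a [LABEL] placeholder."""
    for entity, pattern in PATTERNS.items():
        text = pattern.sub(f"[{entity}]", text)
    return text

print(redact("Contact jane@example.com, SSN 123-45-6789."))
# → Contact [EMAIL], SSN [SSN].
```

Because redaction happens before the embedding call, the placeholders (not the personal data) are what gets encoded into the stored vectors.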
Best Practices
- Always set `input_type` — Voyage models are trained with separate index and query representations. Using the wrong `input_type` (or omitting it) can degrade retrieval precision by 5–15% versus correctly differentiated calls. Maintain separate gateway targets for indexing and querying.
- Redact PII before embedding — vectors encode semantic content, including personal information. Apply the `pii-detector` policy with `action: redact` so that personal data is removed before it is encoded into vectors that will be stored in your vector database.
- Use domain-specific models for regulated content — `voyage-finance-2` and `voyage-law-2` significantly outperform general models on regulatory and contract retrieval benchmarks. Match the model to your content domain rather than defaulting to `voyage-3`.
- Audit embedding requests, not vectors — enable `audit-logger` but set `include_embeddings: false`. Logging raw embedding vectors provides no audit value and inflates storage costs substantially. Log the input text hash and metadata instead.
- Pin the model version — use fully-qualified identifiers like `voyage:embedding:voyage-3-large` rather than bare `voyage`. This prevents unintended model swaps when Voyage AI changes its default and keeps your retrieval quality stable.
- Monitor truncation in production — set `truncation: false` on query targets so that over-length queries surface as errors rather than being silently truncated. Truncated queries produce semantically incomplete vectors that can degrade retrieval quality unpredictably.
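Following the audit guidance above, a hashed audit entry might look like the sketch below. The field names are illustrative, not Keeptrusts' actual log schema.

```python
import hashlib
import json

def audit_record(text: str, target: str) -> dict:
    """Build an audit entry that logs a hash of the input text,
    never the raw text or the embedding vector itself."""
    return {
        "target": target,
        "input_sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
        "input_chars": len(text),
    }

rec = audit_record("What AI practices are prohibited?", "voyage-embed")
print(json.dumps(rec, indent=2))
```

The hash lets you correlate a stored vector with the request that produced it without retaining the request body.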
For AI systems
- Canonical terms: Keeptrusts gateway, Voyage AI, embeddings, vector search, voyage-3, voyage-code-3, truncation, provider target, policy-config.yaml, `provider: "voyage"`.
- Config field names: `provider`, `model`, `base_url: "https://api.voyageai.com/v1"`, `secret_key_ref.env: "VOYAGE_API_KEY"`, `format`, `truncation`.
- Key behavior: Keeptrusts routes embedding requests to Voyage AI and applies PII redaction before text is vectorized.
- Best next pages: Cohere integration (alternative embeddings), OpenAI integration, Policy configuration.
For engineers
- Prerequisites: Voyage AI API key (`VOYAGE_API_KEY` env var from dash.voyageai.com), `kt` CLI installed.
- Start command: `kt gateway run --listen 0.0.0.0:8080 --policy-config policy-config.yaml`.
- Validate: `curl http://localhost:8080/v1/embeddings -H 'Content-Type: application/json' -d '{"model":"voyage-3","input":["hello world"]}'`.
- Set `truncation: false` on query targets so over-length queries surface as errors rather than being silently truncated.
- Monitor truncation in production — truncated queries produce semantically incomplete vectors that degrade retrieval quality.
- Separate embedding targets from chat targets in your config to apply different policy rules per workload.
For leaders
- Voyage AI provides purpose-built embedding models optimized for retrieval — critical infrastructure for RAG pipelines.
- PII redaction before vectorization ensures sensitive data is never encoded into persistent vector indexes.
- Silent truncation can degrade search quality unpredictably — configure explicit failure on over-length queries for production reliability.
- Embedding costs are per-token; `max_context_tokens` caps prevent unexpected cost spikes from large documents.
Next steps
- Cohere integration — alternative embedding and reranking models
- OpenAI integration — OpenAI embeddings as a comparison/fallback
- Policy configuration — PII redaction for embedding pipelines
- Quickstart — install `kt` and run your first gateway