Voyage AI
Voyage AI produces state-of-the-art embedding models purpose-built for retrieval-augmented generation, semantic search, and domain-specific ranking. The Keeptrusts gateway fronts the Voyage embeddings endpoint through its OpenAI-compatible surface, applying PII redaction, content filtering, and audit logging to every embedding request before the text leaves your environment. Voyage AI is an embedding-only provider — it does not expose a chat completions endpoint.
Use this page when
- You need the exact command, config, API, or integration details for Voyage AI.
- You are wiring automation or AI retrieval and need canonical names, examples, and constraints.
- If you want a guided rollout instead of a reference page, use the linked workflow pages in Next steps.
Primary audience
- Primary: AI Agents, Technical Engineers
- Secondary: Technical Leaders
Prerequisites
- A Voyage AI API key (`pa-...` format)
- Keeptrusts CLI (`kt`) installed and on your `PATH`
- `VOYAGE_API_KEY` exported in your shell or injected via your secrets manager
Configuration
Minimal configuration
pack:
name: voyage-providers-1
version: 1.0.0
enabled: true
providers:
targets:
- id: voyage-embed
provider: voyage:embedding:voyage-3-large
secret_key_ref:
env: VOYAGE_API_KEY
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
Full named configuration with policy chain
pack:
name: voyage-rag-pipeline
version: 1.0.0
enabled: true
policies:
chain:
- pii-detector
- dlp-filter
- audit-logger
policy:
pii-detector:
action: redact
entities:
- EMAIL
- PHONE
- SSN
- CREDIT_CARD
- ADDRESS
dlp-filter:
patterns:
- name: internal-ticket-ids
regex: TICKET-[0-9]{6}
action: redact
- name: api-keys
regex: "(sk-|pa-)[A-Za-z0-9]{32,}"
action: block
audit-logger:
retention_days: 365
include_embeddings: false
providers:
targets:
- id: voyage-embed
provider: voyage:embedding:voyage-3-large
base_url: https://api.voyageai.com/v1
secret_key_ref:
env: VOYAGE_API_KEY
Code embeddings
pack:
name: voyage-providers-3
version: 1.0.0
enabled: true
providers:
targets:
- id: voyage-code
provider: voyage:embedding:voyage-code-3
secret_key_ref:
env: VOYAGE_API_KEY
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
Finance-domain embeddings
pack:
name: voyage-providers-4
version: 1.0.0
enabled: true
providers:
targets:
- id: voyage-finance
provider: voyage:embedding:voyage-finance-2
secret_key_ref:
env: VOYAGE_API_KEY
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
Legal/RAG embeddings
pack:
name: voyage-providers-5
version: 1.0.0
enabled: true
providers:
targets:
- id: voyage-legal
provider: voyage:embedding:voyage-law-2
secret_key_ref:
env: VOYAGE_API_KEY
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
Start the gateway
export VOYAGE_API_KEY="pa-your-api-key"
kt gateway run --listen 0.0.0.0:8080 --policy-config policy-config.yaml
Provider Fields
| Field | Required | Default | Description |
|---|---|---|---|
| `provider` | Yes | — | Provider identifier. Use fully-qualified `voyage:embedding:<model>` or short `voyage`. |
| `secret_key_ref` | Yes | `VOYAGE_API_KEY` | Name of the env var holding the Voyage API key. Auto-detected if set to the default name. |
| `base_url` | No | `https://api.voyageai.com/v1` | Override the Voyage API base URL. |
| `format` | No | `openai` | Wire format. Voyage endpoints are OpenAI-compatible for embeddings; this value should remain `openai`. |
| `options.input_type` | No | null | Voyage-specific input type hint. Use `search_document` when indexing corpus documents and `search_query` when embedding user queries. Omit for classification or clustering use cases. |
| `options.truncation` | No | `true` | Automatically truncate inputs that exceed the model's context limit. Set `false` to receive an error instead of silent truncation. |
| `options.output_dimension` | No | Model default | For `voyage-3-large`, you can reduce output dimensions to 1024 or 256 for storage savings. |
| `options.output_dtype` | No | `float` | Set to `int8` or `ubinary` for cost-effective quantised vector storage. |
Supported Models
| Model | Dimensions | Context Window | Input (per 1M tokens) | Notes |
|---|---|---|---|---|
| `voyage-3-large` | 1024 or 2048 | 32k | $0.18 | Highest retrieval quality; supports Matryoshka (variable dimensions) |
| `voyage-3` | 1024 | 32k | $0.06 | Balanced quality and cost; good default for general RAG |
| `voyage-3-lite` | 512 | 32k | $0.02 | Fastest and cheapest; suited for high-throughput classification |
| `voyage-code-3` | 1024 | 32k | $0.18 | Optimised for code search and retrieval across polyglot repositories |
| `voyage-finance-2` | 1024 | 32k | $0.12 | Domain-tuned for financial documents, earnings reports, and SEC filings |
| `voyage-law-2` | 1024 | 16k | $0.12 | Optimised for legal contracts, case law, and long-form regulatory text |
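As a rough illustration of the per-token pricing above, the helper below estimates corpus embedding cost. The model names and prices are taken from the table; treat the figures as list-price assumptions at the time of writing, not live pricing.

```python
# Rough cost estimator built from the pricing table above
# (USD per 1M input tokens; assumed prices, check current pricing).
PRICE_PER_M_TOKENS = {
    "voyage-3-large": 0.18,
    "voyage-3": 0.06,
    "voyage-3-lite": 0.02,
    "voyage-code-3": 0.18,
    "voyage-finance-2": 0.12,
    "voyage-law-2": 0.12,
}

def embedding_cost(model: str, tokens: int) -> float:
    """Estimated USD cost of embedding `tokens` input tokens with `model`."""
    return PRICE_PER_M_TOKENS[model] * tokens / 1_000_000

# Embedding a 10M-token corpus with the balanced default model:
print(f"${embedding_cost('voyage-3', 10_000_000):.2f}")  # → $0.60
```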
Voyage AI does not expose a chat completions endpoint. Sending a /v1/chat/completions request through a Voyage-targeted gateway will return a 400 error. Use a separate provider target for chat and a Voyage target for embeddings.
Client Examples
- Python
- Node.js
- cURL
from openai import OpenAI
client = OpenAI(
base_url="http://localhost:8080/v1",
api_key="unused", # auth is handled by the gateway
)
# Index documents (use search_document input_type in provider config)
documents = [
"The EU AI Act establishes risk categories for AI systems.",
"Prohibited AI practices include social scoring by public authorities.",
"High-risk AI systems must undergo conformity assessment.",
]
doc_response = client.embeddings.create(
model="voyage-3-large",
input=documents,
)
doc_vectors = [item.embedding for item in doc_response.data]
print(f"Indexed {len(doc_vectors)} documents, {len(doc_vectors[0])} dims each")
# Query (swap provider config to search_query input_type for production)
query_response = client.embeddings.create(
model="voyage-3-large",
input=["What AI practices are prohibited?"],
)
query_vector = query_response.data[0].embedding
print(f"Query vector: {len(query_vector)} dims")
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "http://localhost:8080/v1",
apiKey: "unused",
});
// Index documents
const documents = [
"The EU AI Act establishes risk categories for AI systems.",
"Prohibited AI practices include social scoring by public authorities.",
"High-risk AI systems must undergo conformity assessment.",
];
const docResponse = await client.embeddings.create({
model: "voyage-3-large",
input: documents,
});
const docVectors = docResponse.data.map((item) => item.embedding);
console.log(`Indexed ${docVectors.length} documents, ${docVectors[0].length} dims each`);
// Query
const queryResponse = await client.embeddings.create({
model: "voyage-3-large",
input: ["What AI practices are prohibited?"],
});
const queryVector = queryResponse.data[0].embedding;
console.log(`Query vector: ${queryVector.length} dims`);
# Embed documents
curl http://localhost:8080/v1/embeddings \
-H "Content-Type: application/json" \
-d '{
"model": "voyage-3-large",
"input": [
"The EU AI Act establishes risk categories for AI systems.",
"Prohibited AI practices include social scoring by public authorities."
]
}'
# Embed a query
curl http://localhost:8080/v1/embeddings \
-H "Content-Type: application/json" \
-d '{
"model": "voyage-3-large",
"input": ["What AI practices are prohibited?"]
}'
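Once the gateway returns document and query vectors, retrieval is a nearest-neighbour search. The sketch below ranks documents by cosine similarity using the standard library only; the three-dimensional vectors are toy stand-ins for real gateway responses.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors standing in for real embedding responses.
doc_vectors = [
    [0.9, 0.1, 0.0],   # doc 0
    [0.1, 0.9, 0.0],   # doc 1
    [0.0, 0.2, 0.98],  # doc 2
]
query_vector = [0.85, 0.15, 0.05]

# Rank document indices by similarity to the query, best match first.
ranked = sorted(
    range(len(doc_vectors)),
    key=lambda i: cosine_similarity(query_vector, doc_vectors[i]),
    reverse=True,
)
print(ranked)  # → [0, 1, 2]
```

In production you would store the document vectors in a vector database and let it perform this search at scale; the ranking principle is the same.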
Streaming
Voyage AI embedding endpoints do not support streaming — embeddings are returned in a single response. Keeptrusts returns the full vector array synchronously after all policy checks pass. No special configuration is required.
Advanced Configuration
Variable-dimension embeddings (Matryoshka)
voyage-3-large supports Matryoshka representation learning, allowing you to reduce output dimensions without re-embedding your corpus. Use output_dimension to trade recall for storage and query cost:
pack:
name: voyage-providers-6
version: 1.0.0
enabled: true
providers:
targets:
    - id: voyage-compact
      provider: voyage:embedding:voyage-3-large
      secret_key_ref:
        env: VOYAGE_API_KEY
      options:
        output_dimension: 1024
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
For maximum storage efficiency, combine reduced dimensions with int8 quantisation:
options:
output_dimension: 256
output_dtype: int8 # 4× storage reduction vs float32 per dimension
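To see why this combination pays off, here is a back-of-envelope storage calculation, assuming 4 bytes per float32 dimension, 1 byte per int8 dimension, and 1 bit per ubinary dimension:

```python
# Back-of-envelope vector storage sizes. Assumptions: float32 = 4 bytes,
# int8 = 1 byte, ubinary = 1 bit (0.125 bytes) per dimension.
DTYPE_BYTES = {"float": 4, "int8": 1, "ubinary": 0.125}

def vector_bytes(dimensions: int, dtype: str = "float") -> float:
    """Approximate storage per vector in bytes."""
    return dimensions * DTYPE_BYTES[dtype]

full = vector_bytes(1024, "float")   # 4096 bytes per vector
compact = vector_bytes(256, "int8")  # 256 bytes per vector
print(f"{full / compact:.0f}x smaller")  # → 16x smaller
```

A 256-dimension int8 vector is 16× smaller than a 1024-dimension float32 vector, at some cost in retrieval recall.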
Separate index and query targets
Voyage models perform best when you differentiate document-indexing requests (use input_type: search_document) from query-time requests (input_type: search_query). Use two named targets and route by context:
pack:
name: voyage-providers-8
version: 1.0.0
enabled: true
providers:
targets:
    - id: voyage-indexer
      provider: voyage:embedding:voyage-3
      secret_key_ref:
        env: VOYAGE_API_KEY
      options:
        input_type: search_document
    - id: voyage-querier
      provider: voyage:embedding:voyage-3
      secret_key_ref:
        env: VOYAGE_API_KEY
      options:
        input_type: search_query
policies:
chain:
- audit-logger
policy:
audit-logger:
immutable: true
retention_days: 365
log_all_access: true
PII-safe embedding pipeline
Apply pii-detector with redact action before any text reaches Voyage. This ensures personal data is stripped from vectors before they are stored in your vector database:
policies:
chain:
- pii-detector
- audit-logger
policy:
pii-detector:
action: redact
entities:
- EMAIL
- PHONE
- SSN
- ADDRESS
- DATE_OF_BIRTH
audit-logger:
retention_days: 365
include_embeddings: false
providers:
targets:
- id: voyage-safe
provider: voyage:embedding:voyage-3-large
secret_key_ref:
env: VOYAGE_API_KEY
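For intuition, the snippet below sketches what a redact-action policy does conceptually before text reaches Voyage. The patterns are illustrative toys, not the gateway's actual `pii-detector` implementation.

```python
import re

# Toy redaction sketch: illustrative patterns only, not the gateway's
# actual PII detector.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace each detected entity with a [LABEL] placeholder."""
    for entity, pattern in PATTERNS.items():
        text = pattern.sub(f"[{entity}]", text)
    return text

print(redact("Contact jane@example.com, SSN 123-45-6789."))
# → Contact [EMAIL], SSN [SSN].
```

Because redaction happens before the embedding call, the placeholders (not the personal data) are what gets encoded into the stored vectors.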
Best Practices
- Always set `input_type` — Voyage models are trained with separate index and query representations. Using the wrong `input_type` (or omitting it) can degrade retrieval precision by 5–15% versus correctly differentiated calls. Maintain separate gateway targets for indexing and querying.
- Redact PII before embedding — vectors encode semantic content, including personal information. Apply the `pii-detector` policy with `action: redact` so that personal data is removed before it is encoded into vectors that will be stored in your vector database.
- Use domain-specific models for regulated content — `voyage-finance-2` and `voyage-law-2` significantly outperform general models on regulatory and contract retrieval benchmarks. Match the model to your content domain rather than defaulting to `voyage-3`.
- Audit embedding requests, not vectors — enable `audit-logger` but set `include_embeddings: false`. Logging raw embedding vectors provides no audit value and inflates storage costs substantially. Log the input text hash and metadata instead.
- Pin the model version — use fully-qualified identifiers like `voyage:embedding:voyage-3-large` rather than bare `voyage`. This prevents unintended model swaps when Voyage AI changes its default and keeps your retrieval quality stable.
- Monitor truncation in production — set `truncation: false` on query targets so that over-length queries surface as errors rather than being silently truncated. Truncated queries produce semantically incomplete vectors that can degrade retrieval quality unpredictably.
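Following the audit guidance above, a hashed audit entry might look like the sketch below. The field names are illustrative, not Keeptrusts' actual log schema.

```python
import hashlib
import json

def audit_record(text: str, target: str) -> dict:
    """Build an audit entry that logs a hash of the input text,
    never the raw text or the embedding vector itself."""
    return {
        "target": target,
        "input_sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
        "input_chars": len(text),
    }

rec = audit_record("What AI practices are prohibited?", "voyage-embed")
print(json.dumps(rec, indent=2))
```

The hash lets you correlate a stored vector with the request that produced it without retaining the request body.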
For AI systems
- Canonical terms: Keeptrusts gateway, Voyage AI, embeddings, vector search, voyage-3, voyage-code-3, truncation, provider target, policy-config.yaml, `provider: "voyage"`.
- Config field names: `provider`, `model`, `base_url: "https://api.voyageai.com/v1"`, `secret_key_ref.env: "VOYAGE_API_KEY"`, `format`, `truncation`.
- Key behavior: Keeptrusts routes embedding requests to Voyage AI and applies PII redaction before text is vectorized.
- Best next pages: Cohere integration (alternative embeddings), OpenAI integration, Policy configuration.
For engineers
- Prerequisites: Voyage AI API key (`VOYAGE_API_KEY` env var from dash.voyageai.com), `kt` CLI installed.
- Start command: `kt gateway run --listen 0.0.0.0:8080 --policy-config policy-config.yaml`.
- Validate: `curl http://localhost:8080/v1/embeddings -H 'Content-Type: application/json' -d '{"model":"voyage-3","input":["hello world"]}'`.
- Set `truncation: false` on query targets so over-length queries surface as errors rather than being silently truncated.
- Monitor truncation in production — truncated queries produce semantically incomplete vectors that degrade retrieval quality.
- Separate embedding targets from chat targets in your config to apply different policy rules per workload.
For leaders
- Voyage AI provides purpose-built embedding models optimized for retrieval — critical infrastructure for RAG pipelines.
- PII redaction before vectorization ensures sensitive data is never encoded into persistent vector indexes.
- Silent truncation can degrade search quality unpredictably — configure explicit failure on over-length queries for production reliability.
- Embedding costs are per-token; `max_context_tokens` caps prevent unexpected cost spikes from large documents.
Next steps
- Cohere integration — alternative embedding and reranking models
- OpenAI integration — OpenAI embeddings as a comparison/fallback
- Policy configuration — PII redaction for embedding pipelines
- Quickstart — install `kt` and run your first gateway