
Milvus

Milvus is an open-source vector database built for scalable similarity search and AI applications. Through its pymilvus[model] package, Milvus supports built-in embedding functions and model integrations that call external providers and models (OpenAI, Voyage AI, BGE, and others) to generate vectors during data insertion and search operations.

This page explains how to route the LLM and embedding calls associated with Milvus workflows through the Keeptrusts gateway so policy enforcement, PII redaction, and audit logging apply to every AI operation.

Use this page when

  • You are using Milvus with external embedding providers or built-in model functions and need governance on those LLM calls.
  • You want audit trails for embedding operations that send application data to AI providers.
  • For general provider integration outside Milvus, see the OpenAI integration page.

Primary audience

  • Primary: Technical Engineers (ML, Data, Platform)
  • Secondary: AI Agents, Technical Leaders

Prerequisites

  1. Milvus 2.4+ running (Docker, Kubernetes, or Zilliz Cloud).
  2. pymilvus with the model extra installed (pip install "pymilvus[model]"; quote the extras so zsh does not expand the brackets) for Python workflows.
  3. Keeptrusts gateway running locally or centrally:
    • Local: kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml
    • Hosted: https://gateway.keeptrusts.com/v1
  4. Upstream provider API key configured in the gateway environment.

Configuration

Gateway policy config

Create a policy-config.yaml for vector database AI governance:

```yaml
pack:
  name: milvus-ai-governance
  version: 1.0.0
  enabled: true
  policies:
    chain:
      - pii-detector
      - prompt-injection
      - audit-logger
    policy:
      pii-detector:
        action: redact
      prompt-injection:
        threshold: 0.8
        action: block
      audit-logger:
        retention_days: 90
providers:
  strategy: single
  targets:
    - id: openai-for-milvus
      provider: openai:chat:gpt-4o
      secret_key_ref:
        env: OPENAI_API_KEY
```
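The chain can be extended with the optional policies listed later on this page. A sketch, assuming the same policy-config schema:

```yaml
pack:
  policies:
    chain:
      - pii-detector
      - prompt-injection
      - token-limiter
      - safety-filter
      - audit-logger
    policy:
      token-limiter:
        max_tokens: 8192
      safety-filter:
        mode: standard
        action: block
```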

Python client configuration with OpenAI SDK

Route OpenAI embedding calls through the Keeptrusts gateway when building Milvus pipelines:

```python
from openai import OpenAI
from pymilvus import MilvusClient, DataType

# Point the OpenAI SDK at the Keeptrusts gateway instead of api.openai.com
openai_client = OpenAI(
    base_url="http://localhost:41002/v1",
    api_key="your-keeptrusts-access-key",
)

# Milvus itself is reached directly; no gateway in between
milvus_client = MilvusClient(uri="http://localhost:19530")

schema = milvus_client.create_schema(auto_id=True, enable_dynamic_field=True)
schema.add_field("id", DataType.INT64, is_primary=True)
schema.add_field("text", DataType.VARCHAR, max_length=65535)
schema.add_field("vector", DataType.FLOAT_VECTOR, dim=1536)

milvus_client.create_collection(
    collection_name="documents",
    schema=schema,
)

def embed_and_insert(texts):
    # The embedding request routes through the gateway, where policies apply
    response = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=texts,
    )
    data = [
        {"text": text, "vector": item.embedding}
        for text, item in zip(texts, response.data)
    ]
    milvus_client.insert(collection_name="documents", data=data)

embed_and_insert(["AI governance ensures responsible AI use."])
```

Using pymilvus built-in embedding functions

If you use pymilvus[model] built-in embedding functions, configure the OpenAI base URL via environment variables:

```shell
export OPENAI_API_BASE="http://localhost:41002/v1"
export OPENAI_API_KEY="your-keeptrusts-access-key"
```

```python
from pymilvus.model.dense import OpenAIEmbeddingFunction

embedding_fn = OpenAIEmbeddingFunction(
    model_name="text-embedding-3-small",
    api_key="your-keeptrusts-access-key",
)

docs = ["AI governance ensures responsible AI use."]
vectors = embedding_fn.encode_documents(docs)
```
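Before inserting, it can help to guard against the dimension-mismatch failure noted in Troubleshooting below. A minimal sketch, assuming a 1536-dimensional collection as in the schema above (the helper name is illustrative, not part of pymilvus):

```python
EXPECTED_DIM = 1536  # must match the collection's FLOAT_VECTOR dim

def validate_vectors(vectors, expected_dim=EXPECTED_DIM):
    """Raise before insert if any embedding has the wrong dimension."""
    for i, vec in enumerate(vectors):
        if len(vec) != expected_dim:
            raise ValueError(
                f"vector {i} has dim {len(vec)}, expected {expected_dim}"
            )
    return vectors

validate_vectors([[0.0] * 1536])  # passes silently
```

Failing fast here is cheaper than debugging a rejected Milvus insert after a large embedding batch has already been paid for.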

RAG query with gateway-routed generation

```python
def rag_query(question):
    # Embed the question through the gateway
    query_embedding = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=question,
    ).data[0].embedding

    # Vector search runs directly against Milvus (no gateway involved)
    results = milvus_client.search(
        collection_name="documents",
        data=[query_embedding],
        limit=5,
        output_fields=["text"],
    )

    context = "\n".join(hit["entity"]["text"] for hit in results[0])

    # The generation call also routes through the gateway
    response = openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"Answer using this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

answer = rag_query("What is AI governance?")
print(answer)
```

Setup steps

  1. Start the Keeptrusts gateway:

```shell
export OPENAI_API_KEY="sk-your-openai-key"
kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml
```

  2. Configure your OpenAI client or pymilvus embedding function with the gateway base URL.

  3. Insert data into a Milvus collection to trigger embedding calls.

Note that Milvus vector operations (search, delete) go directly to Milvus; only embedding and generation calls route through the gateway.

Verification

Test that embedding calls flow through the gateway:

```shell
curl http://localhost:41002/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-keeptrusts-access-key" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "Test Milvus embedding through Keeptrusts gateway."
  }'
```

Confirm the request appears in the Keeptrusts events dashboard with policy decisions applied.
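On success, the gateway should return the standard OpenAI-compatible embeddings response shape. The values below are illustrative, not actual output:

```json
{
  "object": "list",
  "data": [
    {"object": "embedding", "index": 0, "embedding": [0.0123, -0.0456]}
  ],
  "model": "text-embedding-3-small",
  "usage": {"prompt_tokens": 9, "total_tokens": 9}
}
```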

Recommended policies

| Policy | Purpose | Recommended setting |
| --- | --- | --- |
| pii-detector | Redact personal data from documents before embedding | action: redact |
| prompt-injection | Block injection in RAG query prompts | threshold: 0.8, action: block |
| audit-logger | Log all embedding and generation calls | retention_days: 90 |
| token-limiter | Cap token usage for bulk embedding operations | max_tokens: 8192 |
| safety-filter | Block harmful content in RAG-generated responses | mode: standard, action: block |

Troubleshooting

| Symptom | Cause | Fix |
| --- | --- | --- |
| Embedding calls return connection error | Gateway not running | Start kt gateway run on port 41002 |
| Dimension mismatch on insert | Wrong embedding model dimension | text-embedding-3-small produces 1536-dimensional vectors; match your collection schema |
| pymilvus[model] ignores OPENAI_API_BASE | Older pymilvus version | Upgrade to pymilvus 2.4+ or use the OpenAI SDK directly |
| Slow bulk inserts | Large batches overwhelming the gateway | Batch texts into groups of 100 |
| Milvus connection refused | Milvus not running | Start Milvus with docker compose up -d |

For AI systems

  • Canonical terms: Keeptrusts gateway, Milvus, Zilliz, vector database, embeddings, RAG, pymilvus, policy-config.yaml.
  • Config field names: base_url, api_key, OPENAI_API_BASE, provider, secret_key_ref, pii-detector.
  • Key behavior: Milvus stores and queries vectors; embedding generation and RAG completion use external LLM providers. Keeptrusts intercepts those calls, applies policies, and forwards compliant requests.
  • Constraint: Milvus vector operations go directly to Milvus — only embedding and generation calls route through the gateway.
  • Best next pages: Weaviate integration, Qdrant integration, Pinecone integration.

For engineers

  • Use the OpenAI SDK directly for full control; use pymilvus[model] embedding functions with OPENAI_API_BASE env var for convenience.
  • Route both embedding and generation calls through the gateway for full RAG audit coverage.
  • For Docker Compose deployments, use Docker network hostnames for inter-service communication.
  • Validate: run an embedding call and confirm the event in the Keeptrusts console.
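The Docker Compose note above can be sketched as follows. The service names, image, and paths are hypothetical and depend on your deployment; the point is that in-network clients reach the gateway by its Compose hostname, not localhost:

```yaml
services:
  gateway:
    image: keeptrusts/gateway:latest   # hypothetical image name
    command: kt gateway run --listen 0.0.0.0:41002 --policy-config /etc/keeptrusts/policy-config.yaml
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    ports:
      - "41002:41002"
  app:
    build: .
    environment:
      # Compose hostname of the gateway service, not localhost
      - OPENAI_API_BASE=http://gateway:41002/v1
```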

For leaders

  • Milvus RAG pipelines send your organization's data to external LLM providers during both indexing and query time. Routing through Keeptrusts ensures PII is redacted and every call is logged.
  • Complete audit trails cover the full data flow from document ingestion through query-time generation.
  • Centralized policy enforcement applies consistently across all Milvus deployments, whether self-hosted or Zilliz Cloud.

Next steps