Milvus
Milvus is an open-source vector database built for scalable similarity search and AI applications. Milvus supports built-in embedding functions and model integrations through its pymilvus[model] package, which calls external embedding and LLM providers and models such as OpenAI, Voyage, and BGE to generate vectors during data insertion and search operations.
This page explains how to route the LLM and embedding calls associated with Milvus workflows through the Keeptrusts gateway so policy enforcement, PII redaction, and audit logging apply to every AI operation.
Use this page when
- You are using Milvus with external embedding providers or built-in model functions and need governance on those LLM calls.
- You want audit trails for embedding operations that send application data to AI providers.
- If you need general provider integration, see OpenAI integration.
Primary audience
- Primary: Technical Engineers (ML, Data, Platform)
- Secondary: AI Agents, Technical Leaders
Prerequisites
- Milvus 2.4+ running (Docker, Kubernetes, or Zilliz Cloud).
- pymilvus installed (`pip install "pymilvus[model]"`) for Python workflows.
- Keeptrusts gateway running locally or centrally:
  - Local: `kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml`
  - Hosted: `https://gateway.keeptrusts.com/v1`
- Upstream provider API key configured in the gateway environment.
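If Milvus is not running yet, one way to bring up a local standalone instance is the Docker helper script from the Milvus repository. This is a sketch; the script path and workflow can change between releases, so check the current Milvus installation docs.

```bash
# Fetch the standalone helper script from the Milvus repository and start
# a single-node instance; the default endpoint is http://localhost:19530.
curl -sfL https://raw.githubusercontent.com/milvus-io/milvus/master/scripts/standalone_embed.sh -o standalone_embed.sh
bash standalone_embed.sh start
```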
Configuration
Gateway policy config
Create a policy-config.yaml for vector database AI governance:
```yaml
pack:
  name: milvus-ai-governance
  version: 1.0.0
  enabled: true

policies:
  chain:
    - pii-detector
    - prompt-injection
    - audit-logger
  policy:
    pii-detector:
      action: redact
    prompt-injection:
      threshold: 0.8
      action: block
    audit-logger:
      retention_days: 90

providers:
  strategy: single
  targets:
    - id: openai-for-milvus
      provider: openai:chat:gpt-4o
      secret_key_ref:
        env: OPENAI_API_KEY
```
Python client configuration with OpenAI SDK
Route OpenAI embedding calls through the Keeptrusts gateway when building Milvus pipelines:
```python
from openai import OpenAI
from pymilvus import MilvusClient, DataType

openai_client = OpenAI(
    base_url="http://localhost:41002/v1",
    api_key="your-keeptrusts-access-key",
)

milvus_client = MilvusClient(uri="http://localhost:19530")

schema = milvus_client.create_schema(auto_id=True, enable_dynamic_field=True)
schema.add_field("id", DataType.INT64, is_primary=True)
schema.add_field("text", DataType.VARCHAR, max_length=65535)
schema.add_field("vector", DataType.FLOAT_VECTOR, dim=1536)

milvus_client.create_collection(
    collection_name="documents",
    schema=schema,
)

def embed_and_insert(texts):
    response = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=texts,
    )
    data = [
        {"text": text, "vector": item.embedding}
        for text, item in zip(texts, response.data)
    ]
    milvus_client.insert(collection_name="documents", data=data)

embed_and_insert(["AI governance ensures responsible AI use."])
```
Using pymilvus built-in embedding functions
If you use pymilvus[model] built-in embedding functions, configure the OpenAI base URL via environment variables:
```bash
export OPENAI_API_BASE="http://localhost:41002/v1"
export OPENAI_API_KEY="your-keeptrusts-access-key"
```
```python
from pymilvus.model.dense import OpenAIEmbeddingFunction

embedding_fn = OpenAIEmbeddingFunction(
    model_name="text-embedding-3-small",
    api_key="your-keeptrusts-access-key",
)

docs = ["AI governance ensures responsible AI use."]
vectors = embedding_fn.encode_documents(docs)
```
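Query-side encoding goes through the same governed path. A short sketch of the query side, assuming the documents collection and index from the previous sections already exist:

```python
from pymilvus import MilvusClient
from pymilvus.model.dense import OpenAIEmbeddingFunction

milvus_client = MilvusClient(uri="http://localhost:19530")
embedding_fn = OpenAIEmbeddingFunction(
    model_name="text-embedding-3-small",
    api_key="your-keeptrusts-access-key",
)

# encode_queries produces query vectors via the gateway-routed provider
query_vectors = embedding_fn.encode_queries(["What is AI governance?"])
results = milvus_client.search(
    collection_name="documents",
    data=query_vectors,
    limit=5,
    output_fields=["text"],
)
```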
RAG query with gateway-routed generation
```python
def rag_query(question):
    query_embedding = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=question,
    ).data[0].embedding

    results = milvus_client.search(
        collection_name="documents",
        data=[query_embedding],
        limit=5,
        output_fields=["text"],
    )
    context = "\n".join(hit["entity"]["text"] for hit in results[0])

    response = openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"Answer using this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

answer = rag_query("What is AI governance?")
print(answer)
```
Setup steps
- Start the Keeptrusts gateway:

  ```bash
  export OPENAI_API_KEY="sk-your-openai-key"
  kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml
  ```

- Configure your OpenAI client or pymilvus embedding function with the gateway base URL.
- Insert data into a Milvus collection to trigger embedding calls.
- Milvus vector operations (search, delete) go directly to Milvus; only embedding and generation calls route through the gateway.
Verification
Test that embedding calls flow through the gateway:
```bash
curl http://localhost:41002/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-keeptrusts-access-key" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "Test Milvus embedding through Keeptrusts gateway."
  }'
```
Confirm the request appears in the Keeptrusts events dashboard with policy decisions applied.
Recommended policies
| Policy | Purpose | Recommended setting |
|---|---|---|
| pii-detector | Redact personal data from documents before embedding | action: redact |
| prompt-injection | Block injection in RAG query prompts | threshold: 0.8, action: block |
| audit-logger | Log all embedding and generation calls | retention_days: 90 |
| token-limiter | Cap token usage for bulk embedding operations | max_tokens: 8192 |
| safety-filter | Block harmful content in RAG-generated responses | mode: standard, action: block |
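To adopt the optional token-limiter and safety-filter policies from the table, one way to extend the earlier policy-config.yaml is shown below; the settings mirror the table, but confirm the exact field names against the policy controls catalog.

```yaml
policies:
  chain:
    - pii-detector
    - prompt-injection
    - token-limiter
    - safety-filter
    - audit-logger
  policy:
    token-limiter:
      max_tokens: 8192
    safety-filter:
      mode: standard
      action: block
```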
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| Embedding calls return connection error | Gateway not running | Start kt gateway run on port 41002 |
| Dimension mismatch on insert | Wrong embedding model dimension | text-embedding-3-small = 1536; match your collection schema |
| pymilvus[model] ignores OPENAI_API_BASE | Older pymilvus version | Upgrade to pymilvus 2.4+ or use the OpenAI SDK directly |
| Slow bulk inserts | Large batches overwhelming the gateway | Batch texts into groups of 100 |
| Milvus connection refused | Milvus not running | Start Milvus with docker compose up -d |
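For the slow bulk-insert case, a simple fix is to wrap the embed_and_insert helper from the Python example above so large document sets are submitted in smaller batches; the batch size of 100 mirrors the guidance in the table.

```python
def embed_and_insert_batched(texts, batch_size=100):
    # Send embedding requests to the gateway in smaller batches to avoid
    # large, slow requests during bulk ingestion.
    for start in range(0, len(texts), batch_size):
        embed_and_insert(texts[start:start + batch_size])
```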
For AI systems
- Canonical terms: Keeptrusts gateway, Milvus, Zilliz, vector database, embeddings, RAG, pymilvus, policy-config.yaml.
- Config field names: base_url, api_key, OPENAI_API_BASE, provider, secret_key_ref, pii-detector.
- Key behavior: Milvus stores and queries vectors; embedding generation and RAG completion use external LLM providers. Keeptrusts intercepts those calls, applies policies, and forwards compliant requests.
- Constraint: Milvus vector operations go directly to Milvus — only embedding and generation calls route through the gateway.
- Best next pages: Weaviate integration, Qdrant integration, Pinecone integration.
For engineers
- Use the OpenAI SDK directly for full control; use pymilvus[model] embedding functions with the OPENAI_API_BASE env var for convenience.
- Route both embedding and generation calls through the gateway for full RAG audit coverage.
- For Docker Compose deployments, use Docker network hostnames for inter-service communication; see the sketch after this list.
- Validate: run an embedding call and confirm the event in the Keeptrusts console.
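For the Docker Compose point above, a minimal sketch of client configuration using in-network service hostnames. The service names keeptrusts-gateway and milvus-standalone are illustrative; use whatever names your compose file defines.

```python
from openai import OpenAI
from pymilvus import MilvusClient

# Inside the Compose network, address services by their compose service
# names rather than localhost.
openai_client = OpenAI(
    base_url="http://keeptrusts-gateway:41002/v1",
    api_key="your-keeptrusts-access-key",
)
milvus_client = MilvusClient(uri="http://milvus-standalone:19530")
```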
For leaders
- Milvus RAG pipelines send your organization's data to external LLM providers during both indexing and query time. Routing through Keeptrusts ensures PII is redacted and every call is logged.
- Complete audit trails cover the full data flow from document ingestion through query-time generation.
- Centralized policy enforcement applies consistently across all Milvus deployments, whether self-hosted or Zilliz Cloud.
Next steps
- Weaviate integration — govern Weaviate generative search
- Qdrant integration — govern Qdrant neural search
- ChromaDB integration — govern ChromaDB embedding calls
- Pinecone integration — govern Pinecone inference calls
- Policy controls catalog — full policy reference