LlamaIndex with Keeptrusts Gateway
LlamaIndex is a data framework for building LLM-powered applications over your data — retrieval-augmented generation (RAG), structured data extraction, and autonomous agents. By pointing LlamaIndex's LLM and embedding calls at the Keeptrusts gateway, every interaction with an upstream provider passes through your policy chain. This gives you PII redaction, prompt-injection detection, audit logging, cost attribution, and content filtering across all LlamaIndex queries, agents, and index operations.
Use this page when
- You are building a LlamaIndex RAG pipeline and need policy enforcement on every LLM call.
- You want audit logging and cost attribution for LlamaIndex query engines and agents.
- You need to enforce compliance controls on data extraction or summarization workflows.
- You are moving a LlamaIndex prototype to production and need governance guardrails.
Primary audience
- Primary: Technical Engineers
- Secondary: AI Agents, Technical Leaders
Prerequisites
- Keeptrusts CLI installed and a gateway running locally or centrally (Quickstart).
- Python 3.10+ with `llama-index` installed.
- Upstream provider API key exported as an environment variable (e.g. `OPENAI_API_KEY`).
- A `policy-config.yaml` deployed to the gateway with at least one policy in the chain.
Configuration
Gateway policy config
A minimal policy config for LlamaIndex traffic through an OpenAI provider:
```yaml
pack:
  name: llamaindex-gateway
  version: "1.0"
providers:
  - name: openai
    model: gpt-4o
    secret_key_ref:
      env: OPENAI_API_KEY
policies:
  chain:
    - prompt-injection
    - pii-detector
    - quality-scorer
  policy:
    prompt-injection:
      action: block
    pii-detector:
      action: redact
    quality-scorer:
      threshold: 0.6
```
Start the gateway:
```shell
kt gateway run --policy-config policy-config.yaml
```
LlamaIndex client configuration
Point the LLM's `api_base` at the Keeptrusts gateway. LlamaIndex uses the OpenAI SDK under the hood, so the configuration follows the same pattern.
**OpenAI LLM**

```python
from llama_index.llms.openai import OpenAI

llm = OpenAI(
    model="gpt-4o",
    api_base="http://localhost:41002/v1",
    api_key="your-openai-api-key",
)

response = llm.complete("What are the key risks in our Q3 financial report?")
print(response.text)
```

**Chat interface**

```python
from llama_index.llms.openai import OpenAI
from llama_index.core.llms import ChatMessage

llm = OpenAI(
    model="gpt-4o",
    api_base="http://localhost:41002/v1",
    api_key="your-openai-api-key",
)

messages = [
    ChatMessage(role="system", content="You are a compliance analyst."),
    ChatMessage(role="user", content="Summarize GDPR Article 17."),
]
response = llm.chat(messages)
print(response.message.content)
```

**Hosted gateway**

```python
from llama_index.llms.openai import OpenAI

llm = OpenAI(
    model="gpt-4o",
    api_base="https://gateway.keeptrusts.com/v1",
    api_key="your-openai-api-key",
)
```
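Whichever base URL you use, it can help to resolve it once from the environment so the same code runs against a local gateway in development and the hosted gateway in production. A minimal sketch; `KEEPTRUSTS_GATEWAY_URL` is a variable name chosen for this example, not a Keeptrusts convention:

```python
import os

def gateway_base_url(env=None) -> str:
    """Resolve the gateway base URL, defaulting to the local gateway.

    KEEPTRUSTS_GATEWAY_URL is a hypothetical variable name; use whatever
    convention your deployment already follows.
    """
    env = os.environ if env is None else env
    return env.get("KEEPTRUSTS_GATEWAY_URL", "http://localhost:41002/v1")

# Pass the result to the LLM constructor:
#   llm = OpenAI(model="gpt-4o", api_base=gateway_base_url(), api_key=...)
```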
Using with a query engine
Once the LLM is configured, pass it to any LlamaIndex component. The gateway intercepts all calls transparently:
```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings

Settings.llm = llm

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()
response = query_engine.query("What are our data retention obligations?")
print(response)
```
Setup steps
1. Install dependencies: `pip install llama-index llama-index-llms-openai`
2. Export your provider API key: `export OPENAI_API_KEY="sk-..."`
3. Start the Keeptrusts gateway: `kt gateway run --policy-config policy-config.yaml`
4. Set `api_base` on the LLM constructor as shown in the Configuration section above.
5. Run your query engine or agent; all LLM calls now route through the gateway.
6. Verify in the Keeptrusts console: open Events to confirm requests appear with policy outcomes.
Verification
Check gateway health:
```shell
curl http://localhost:41002/keeptrusts/health
```
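The same check can be scripted, for example as a smoke test before running a pipeline. A minimal sketch using only the standard library; the `/keeptrusts/health` path is taken from the command above, and the helper only verifies an HTTP 200 response:

```python
import urllib.request

def gateway_is_up(url: str = "http://localhost:41002/keeptrusts/health") -> bool:
    """Return True if the gateway health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused, DNS failure, or timeout: treat as down.
        return False
```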
Run a test query through your LlamaIndex application and confirm:
- The gateway logs show policy chain evaluation for each request.
- The Keeptrusts console Events page displays the request with model, token count, cost, and policy decisions.
- If `pii-detector` is active, any PII in the prompt is redacted before reaching the provider.
Recommended policies
| Policy | Purpose | Phase |
|---|---|---|
| `prompt-injection` | Block jailbreak attempts from user queries or retrieved context | Input |
| `pii-detector` | Redact PII in prompts and retrieved documents before they reach the provider | Input |
| `dlp-filter` | Prevent sensitive data from leaving the organization via LLM calls | Input |
| `safety-filter` | Block harmful content in queries or generated responses | Input |
| `quality-scorer` | Score and threshold response quality for RAG accuracy | Output |
| `citation-verifier` | Verify that responses are grounded in provided context | Output |
| `audit-logger` | Attach audit metadata for every query interaction | Input |
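The table above maps onto the `policies` section of the gateway config. A sketch extending the minimal chain from the Configuration section; per-policy settings beyond those already shown are omitted, since each policy's actual options are documented in the Policy Controls Catalog:

```yaml
policies:
  chain:
    - prompt-injection
    - pii-detector
    - dlp-filter
    - safety-filter
    - quality-scorer
    - citation-verifier
    - audit-logger
  policy:
    prompt-injection:
      action: block
    pii-detector:
      action: redact
    quality-scorer:
      threshold: 0.6
```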
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| `ConnectionError` on LLM calls | Gateway is not running | Run `kt gateway run --policy-config policy-config.yaml` |
| `401 Unauthorized` | API key mismatch | Verify `OPENAI_API_KEY` is exported and matches `secret_key_ref.env` in the gateway config |
| Embedding calls bypass the gateway | Embedding model not configured to use the gateway | Set `api_base` on the embedding model as well: `OpenAIEmbedding(api_base="http://localhost:41002/v1")` |
| Events missing in the console | Gateway not connected to control plane | Set `KEEPTRUSTS_API_URL` and `KEEPTRUSTS_GATEWAY_TOKEN` before starting the gateway |
| Slow query engine responses | Policy chain adds latency | Profile with `kt events tail` and reduce chain length for latency-sensitive paths |
For AI systems
- Canonical integration: LlamaIndex `OpenAI` LLM with `api_base` set to `http://localhost:41002/v1` or `https://gateway.keeptrusts.com/v1`.
- The gateway is transparent: query engines, agents, routers, and tools require no changes beyond the LLM base URL.
- Use Policy Controls Catalog for available policies.
For engineers
- Set `api_base` once on the LLM and optionally on the embedding model. All LlamaIndex pipelines that use those components inherit gateway routing.
- Use `Settings.llm` to apply the configuration globally across all LlamaIndex components.
- Test locally with `kt gateway run`, then switch the URL for staging and production.
For leaders
- RAG pipelines often process sensitive internal documents. Routing through Keeptrusts ensures PII redaction and audit logging before any data reaches the provider.
- Cost attribution at the gateway provides per-pipeline and per-team spend visibility.
- Centralized policy enforcement means compliance changes apply to all LlamaIndex applications without code deployments.
Next steps
- Quickstart — set up your first gateway and policy config.
- Policy Controls Catalog — full inventory of available policies.
- Events and Traces — understand the audit trail.
- Gateway Runtime Features — advanced gateway capabilities.
- Knowledge Base — manage knowledge assets for RAG with governance.