LlamaIndex with Keeptrusts Gateway

LlamaIndex is a data framework for building LLM-powered applications over your data — retrieval-augmented generation (RAG), structured data extraction, and autonomous agents. By pointing LlamaIndex's LLM and embedding calls at the Keeptrusts gateway, every interaction with an upstream provider passes through your policy chain. This gives you PII redaction, prompt-injection detection, audit logging, cost attribution, and content filtering across all LlamaIndex queries, agents, and index operations.

Use this page when

  • You are building a LlamaIndex RAG pipeline and need policy enforcement on every LLM call.
  • You want audit logging and cost attribution for LlamaIndex query engines and agents.
  • You need to enforce compliance controls on data extraction or summarization workflows.
  • You are moving a LlamaIndex prototype to production and need governance guardrails.

Primary audience

  • Primary: Technical Engineers
  • Secondary: AI Agents, Technical Leaders

Prerequisites

  • Keeptrusts CLI installed and a gateway running locally or centrally (Quickstart).
  • Python 3.10+ with llama-index installed.
  • Upstream provider API key exported as an environment variable (e.g. OPENAI_API_KEY).
  • A policy-config.yaml deployed to the gateway with at least one policy in the chain.
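The prerequisites above can be sanity-checked programmatically before wiring up LlamaIndex. This is an illustrative sketch, not part of the Keeptrusts tooling; the environment-variable name comes from the list above.

```python
import os
import sys

def preflight(env_var: str = "OPENAI_API_KEY") -> list:
    """Return a list of prerequisite problems; an empty list means ready to proceed."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append("Python 3.10+ required, found " + sys.version.split()[0])
    if not os.environ.get(env_var):
        problems.append(env_var + " is not exported")
    return problems

if __name__ == "__main__":
    for problem in preflight():
        print("MISSING:", problem)
```

Run it once before starting the gateway; any printed line points at a prerequisite you still need.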

Configuration

Gateway policy config

A minimal policy config for LlamaIndex traffic through an OpenAI provider:

pack:
  name: llamaindex-gateway
  version: "1.0"

providers:
  - name: openai
    model: gpt-4o
    secret_key_ref:
      env: OPENAI_API_KEY

policies:
  chain:
    - prompt-injection
    - pii-detector
    - quality-scorer

  policy:
    prompt-injection:
      action: block
    pii-detector:
      action: redact
    quality-scorer:
      threshold: 0.6
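The invariant in the config above — every entry in `chain` should have a matching settings block under `policy` — can be checked with a few lines of code. This is an illustrative sketch over a plain dict mirroring the YAML, not a Keeptrusts tool, and it assumes each chain entry is expected to carry settings.

```python
def unmatched_chain_entries(policies: dict) -> list:
    """Return chain entries that have no corresponding settings under 'policy'."""
    settings = policies.get("policy", {})
    return [name for name in policies.get("chain", []) if name not in settings]

# Plain-dict mirror of the policy config above.
policies = {
    "chain": ["prompt-injection", "pii-detector", "quality-scorer"],
    "policy": {
        "prompt-injection": {"action": "block"},
        "pii-detector": {"action": "redact"},
        "quality-scorer": {"threshold": 0.6},
    },
}

assert unmatched_chain_entries(policies) == []
```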

Start the gateway:

kt gateway run --policy-config policy-config.yaml

LlamaIndex client configuration

Point the LLM's api_base at the Keeptrusts gateway. LlamaIndex uses the OpenAI SDK under the hood, so the configuration follows the same pattern.

from llama_index.llms.openai import OpenAI

llm = OpenAI(
    model="gpt-4o",
    api_base="http://localhost:41002/v1",
    api_key="your-openai-api-key",
)

response = llm.complete("What are the key risks in our Q3 financial report?")
print(response.text)
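Rather than hard-coding the key as in the snippet above, it is safer to read it from the same environment variable the gateway uses. A minimal sketch — the fail-fast helper here is our own, not part of LlamaIndex or Keeptrusts:

```python
import os

def require_env(name: str) -> str:
    """Fetch a required environment variable, failing fast with a clear error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(name + " must be exported before starting the application")
    return value

# Then construct the LLM as above, pulling the key from the environment:
# llm = OpenAI(
#     model="gpt-4o",
#     api_base="http://localhost:41002/v1",
#     api_key=require_env("OPENAI_API_KEY"),
# )
```

Failing at startup with a named variable is easier to diagnose than a 401 from the gateway later.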

Using with a query engine

Once the LLM is configured, pass it to any LlamaIndex component. The gateway intercepts all calls transparently:

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings

Settings.llm = llm

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

response = query_engine.query("What are our data retention obligations?")
print(response)

Setup steps

  1. Install dependencies

    pip install llama-index llama-index-llms-openai
  2. Export your provider API key

    export OPENAI_API_KEY="sk-..."
  3. Start the Keeptrusts gateway

    kt gateway run --policy-config policy-config.yaml
  4. Set api_base on the LLM constructor as shown in the Configuration section above.

  5. Run your query engine or agent — all LLM calls now route through the gateway.

  6. Verify in the Keeptrusts console — open Events to confirm requests appear with policy outcomes.

Verification

Check gateway health:

curl http://localhost:41002/keeptrusts/health
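If you want the health check in application code rather than curl, the endpoint can be polled with the standard library. The response shape shown here ({"status": "ok"}) is an assumption for illustration — check your gateway's actual payload.

```python
import json
from urllib.request import urlopen

def parse_health(payload: bytes) -> bool:
    """Interpret a health-endpoint body; the 'status' field is an assumed shape."""
    return json.loads(payload).get("status") == "ok"

def gateway_healthy(base: str = "http://localhost:41002") -> bool:
    """Return True if the gateway health endpoint reports an 'ok' status."""
    with urlopen(base + "/keeptrusts/health", timeout=5) as resp:
        return parse_health(resp.read())
```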

Run a test query through your LlamaIndex application and confirm:

  • The gateway logs show policy chain evaluation for each request.
  • The Keeptrusts console Events page displays the request with model, token count, cost, and policy decisions.
  • If pii-detector is active, any PII in the prompt is redacted before reaching the provider.
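To build intuition for what redaction looks like on the wire, here is a conceptual regex-based sketch. It is not the gateway's actual pii-detector implementation, and the [EMAIL] placeholder format is an assumption; it only illustrates the before/after you should expect to see in the Events page.

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def redact_pii(text: str) -> str:
    """Replace email addresses with a placeholder, roughly what a redaction policy does."""
    return EMAIL.sub("[EMAIL]", text)

print(redact_pii("Contact alice@example.com about the Q3 report"))
# → Contact [EMAIL] about the Q3 report
```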
Recommended policies

| Policy | Purpose | Phase |
| --- | --- | --- |
| prompt-injection | Block jailbreak attempts from user queries or retrieved context | Input |
| pii-detector | Redact PII in prompts and retrieved documents before they reach the provider | Input |
| dlp-filter | Prevent sensitive data from leaving the organization via LLM calls | Input |
| safety-filter | Block harmful content in queries or generated responses | Input |
| quality-scorer | Score and threshold response quality for RAG accuracy | Output |
| citation-verifier | Verify that responses are grounded in provided context | Output |
| audit-logger | Attach audit metadata for every query interaction | Input |

Troubleshooting

| Symptom | Cause | Fix |
| --- | --- | --- |
| ConnectionError on LLM calls | Gateway is not running | Run kt gateway run --policy-config policy-config.yaml |
| 401 Unauthorized | API key mismatch | Verify OPENAI_API_KEY is exported and matches secret_key_ref.env in the gateway config |
| Embedding calls bypass the gateway | Embedding model not configured to use the gateway | Set api_base on the embedding model as well: OpenAIEmbedding(api_base="http://localhost:41002/v1") |
| Events missing in the console | Gateway not connected to control plane | Set KEEPTRUSTS_API_URL and KEEPTRUSTS_GATEWAY_TOKEN before starting the gateway |
| Slow query engine responses | Policy chain adds latency | Profile with kt events tail and reduce chain length for latency-sensitive paths |

For AI systems

  • Canonical integration: LlamaIndex OpenAI LLM with api_base set to http://localhost:41002/v1 or https://gateway.keeptrusts.com/v1.
  • The gateway is transparent — query engines, agents, routers, and tools require no changes beyond the LLM base URL.
  • See the Policy Controls Catalog for the full list of available policies.

For engineers

  • Set api_base once on the LLM and optionally on the embedding model. All LlamaIndex pipelines that use those components inherit gateway routing.
  • Use Settings.llm to apply the configuration globally across all LlamaIndex components.
  • Test locally with kt gateway run, then switch the URL for staging and production.
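The dev-to-production URL switch in the last bullet can be centralized in a single helper. A sketch: the production URL comes from the canonical integration note above, but the `APP_ENV` variable name and the idea of a lookup table are our own conventions, not Keeptrusts requirements.

```python
import os

GATEWAY_URLS = {
    "dev": "http://localhost:41002/v1",
    "production": "https://gateway.keeptrusts.com/v1",  # from the canonical integration note
}

def gateway_url(env: str = "") -> str:
    """Pick the gateway base URL for the current deployment environment."""
    env = env or os.environ.get("APP_ENV", "dev")  # APP_ENV name is an assumption
    try:
        return GATEWAY_URLS[env]
    except KeyError:
        raise ValueError("unknown environment: " + repr(env)) from None
```

Pass `gateway_url()` as `api_base` when constructing the LLM and embedding model, so promoting a pipeline between environments is a config change rather than a code change.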

For leaders

  • RAG pipelines often process sensitive internal documents. Routing through Keeptrusts ensures PII redaction and audit logging before any data reaches the provider.
  • Cost attribution at the gateway provides per-pipeline and per-team spend visibility.
  • Centralized policy enforcement means compliance changes apply to all LlamaIndex applications without code deployments.

Next steps