LlamaIndex with Keeptrusts Gateway

LlamaIndex is a data framework for building LLM-powered applications over your data — retrieval-augmented generation (RAG), structured data extraction, and autonomous agents. By pointing LlamaIndex's LLM and embedding calls at the Keeptrusts gateway, every interaction with an upstream provider passes through your policy chain. This gives you PII redaction, prompt-injection detection, audit logging, cost attribution, and content filtering across all LlamaIndex queries, agents, and index operations.

Use this page when

  • You are building a LlamaIndex RAG pipeline and need policy enforcement on every LLM call.
  • You want audit logging and cost attribution for LlamaIndex query engines and agents.
  • You need to enforce compliance controls on data extraction or summarization workflows.
  • You are moving a LlamaIndex prototype to production and need governance guardrails.

Primary audience

  • Primary: Technical Engineers
  • Secondary: AI Agents, Technical Leaders

Prerequisites

  • Keeptrusts CLI installed and a gateway running locally or centrally (Quickstart).
  • Python 3.10+ with llama-index installed.
  • Upstream provider API key exported as an environment variable (e.g. OPENAI_API_KEY).
  • A policy-config.yaml deployed to the gateway with at least one policy in the chain.
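The prerequisites above can be sanity-checked programmatically before wiring up LlamaIndex. This is an illustrative sketch, not part of the Keeptrusts tooling; the environment-variable name comes from the list above.

```python
import os
import sys

def preflight(env_var: str = "OPENAI_API_KEY") -> list:
    """Return a list of prerequisite problems; an empty list means ready to proceed."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append("Python 3.10+ required, found " + sys.version.split()[0])
    if not os.environ.get(env_var):
        problems.append(env_var + " is not exported")
    return problems

if __name__ == "__main__":
    for problem in preflight():
        print("MISSING:", problem)
```

Run it once before starting the gateway; any printed line points at a prerequisite you still need.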

Configuration

Gateway policy config

A minimal policy config for LlamaIndex traffic through an OpenAI provider:

pack:
  name: llamaindex-gateway
  version: "1.0"

providers:
  - name: openai
    model: gpt-4o
    secret_key_ref:
      env: OPENAI_API_KEY

policies:
  chain:
    - prompt-injection
    - pii-detector
    - quality-scorer

  policy:
    prompt-injection:
      action: block
    pii-detector:
      action: redact
    quality-scorer:
      threshold: 0.6
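The invariant in the config above — every entry in `chain` should have a matching settings block under `policy` — can be checked with a few lines of code. This is an illustrative sketch over a plain dict mirroring the YAML, not a Keeptrusts tool, and it assumes each chain entry is expected to carry settings.

```python
def unmatched_chain_entries(policies: dict) -> list:
    """Return chain entries that have no corresponding settings under 'policy'."""
    settings = policies.get("policy", {})
    return [name for name in policies.get("chain", []) if name not in settings]

# Plain-dict mirror of the policy config above.
policies = {
    "chain": ["prompt-injection", "pii-detector", "quality-scorer"],
    "policy": {
        "prompt-injection": {"action": "block"},
        "pii-detector": {"action": "redact"},
        "quality-scorer": {"threshold": 0.6},
    },
}

assert unmatched_chain_entries(policies) == []
```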

Start the gateway:

kt gateway run --policy-config policy-config.yaml

LlamaIndex client configuration

Point the LLM's api_base at the Keeptrusts gateway. LlamaIndex uses the OpenAI SDK under the hood, so the configuration follows the same pattern.

from llama_index.llms.openai import OpenAI

llm = OpenAI(
    model="gpt-4o",
    api_base="http://localhost:41002/v1",
    api_key="your-openai-api-key",
)

response = llm.complete("What are the key risks in our Q3 financial report?")
print(response.text)
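Rather than hard-coding the key as in the snippet above, it is safer to read it from the same environment variable the gateway uses. A minimal sketch — the fail-fast helper here is our own, not part of LlamaIndex or Keeptrusts:

```python
import os

def require_env(name: str) -> str:
    """Fetch a required environment variable, failing fast with a clear error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(name + " must be exported before starting the application")
    return value

# Then construct the LLM as above, pulling the key from the environment:
# llm = OpenAI(
#     model="gpt-4o",
#     api_base="http://localhost:41002/v1",
#     api_key=require_env("OPENAI_API_KEY"),
# )
```

Failing at startup with a named variable is easier to diagnose than a 401 from the gateway later.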

Using with a query engine

Once the LLM is configured, pass it to any LlamaIndex component. The gateway intercepts all calls transparently:

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings

Settings.llm = llm

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

response = query_engine.query("What are our data retention obligations?")
print(response)

Setup steps

  1. Install dependencies

    pip install llama-index llama-index-llms-openai
  2. Export your provider API key

    export OPENAI_API_KEY="sk-..."
  3. Start the Keeptrusts gateway

    kt gateway run --policy-config policy-config.yaml
  4. Set api_base on the LLM constructor as shown in the Configuration section above.

  5. Run your query engine or agent — all LLM calls now route through the gateway.

  6. Verify in the Keeptrusts console — open Events to confirm requests appear with policy outcomes.

Verification

Check gateway health:

curl http://localhost:41002/keeptrusts/health
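If you want the health check in application code rather than curl, the endpoint can be polled with the standard library. The response shape shown here ({"status": "ok"}) is an assumption for illustration — check your gateway's actual payload.

```python
import json
from urllib.request import urlopen

def parse_health(payload: bytes) -> bool:
    """Interpret a health-endpoint body; the 'status' field is an assumed shape."""
    return json.loads(payload).get("status") == "ok"

def gateway_healthy(base: str = "http://localhost:41002") -> bool:
    """Return True if the gateway health endpoint reports an 'ok' status."""
    with urlopen(base + "/keeptrusts/health", timeout=5) as resp:
        return parse_health(resp.read())
```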

Run a test query through your LlamaIndex application and confirm:

  • The gateway logs show policy chain evaluation for each request.
  • The Keeptrusts console Events page displays the request with model, token count, cost, and policy decisions.
  • If pii-detector is active, any PII in the prompt is redacted before reaching the provider.
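To build intuition for what redaction looks like on the wire, here is a conceptual regex-based sketch. It is not the gateway's actual pii-detector implementation, and the [EMAIL] placeholder format is an assumption; it only illustrates the before/after you should expect to see in the Events page.

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def redact_pii(text: str) -> str:
    """Replace email addresses with a placeholder, roughly what a redaction policy does."""
    return EMAIL.sub("[EMAIL]", text)

print(redact_pii("Contact alice@example.com about the Q3 report"))
# → Contact [EMAIL] about the Q3 report
```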
Recommended policies

| Policy | Purpose | Phase |
| --- | --- | --- |
| prompt-injection | Block jailbreak attempts from user queries or retrieved context | Input |
| pii-detector | Redact PII in prompts and retrieved documents before they reach the provider | Input |
| dlp-filter | Prevent sensitive data from leaving the organization via LLM calls | Input |
| safety-filter | Block harmful content in queries or generated responses | Input |
| quality-scorer | Score and threshold response quality for RAG accuracy | Output |
| citation-verifier | Verify that responses are grounded in provided context | Output |
| audit-logger | Attach audit metadata for every query interaction | Input |

Troubleshooting

| Symptom | Cause | Fix |
| --- | --- | --- |
| ConnectionError on LLM calls | Gateway is not running | Run kt gateway run --policy-config policy-config.yaml |
| 401 Unauthorized | API key mismatch | Verify OPENAI_API_KEY is exported and matches secret_key_ref.env in the gateway config |
| Embedding calls bypass the gateway | Embedding model not configured to use the gateway | Set api_base on the embedding model as well: OpenAIEmbedding(api_base="http://localhost:41002/v1") |
| Events missing in the console | Gateway not connected to control plane | Set KEEPTRUSTS_API_URL and KEEPTRUSTS_GATEWAY_TOKEN before starting the gateway |
| Slow query engine responses | Policy chain adds latency | Profile with kt events tail and reduce chain length for latency-sensitive paths |

For AI systems

  • Canonical integration: LlamaIndex OpenAI LLM with api_base set to http://localhost:41002/v1 or https://gateway.keeptrusts.com/v1.
  • The gateway is transparent — query engines, agents, routers, and tools require no changes beyond the LLM base URL.
  • See the Policy Controls Catalog for the full list of available policies.

For engineers

  • Set api_base once on the LLM and optionally on the embedding model. All LlamaIndex pipelines that use those components inherit gateway routing.
  • Use Settings.llm to apply the configuration globally across all LlamaIndex components.
  • Test locally with kt gateway run, then switch the URL for staging and production.
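The dev-to-production URL switch in the last bullet can be centralized in a single helper. A sketch: the production URL comes from the canonical integration note above, but the `APP_ENV` variable name and the idea of a lookup table are our own conventions, not Keeptrusts requirements.

```python
import os

GATEWAY_URLS = {
    "dev": "http://localhost:41002/v1",
    "production": "https://gateway.keeptrusts.com/v1",  # from the canonical integration note
}

def gateway_url(env: str = "") -> str:
    """Pick the gateway base URL for the current deployment environment."""
    env = env or os.environ.get("APP_ENV", "dev")  # APP_ENV name is an assumption
    try:
        return GATEWAY_URLS[env]
    except KeyError:
        raise ValueError("unknown environment: " + repr(env)) from None
```

Pass `gateway_url()` as `api_base` when constructing the LLM and embedding model, so promoting a pipeline between environments is a config change rather than a code change.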

For leaders

  • RAG pipelines often process sensitive internal documents. Routing through Keeptrusts ensures PII redaction and audit logging before any data reaches the provider.
  • Cost attribution at the gateway provides per-pipeline and per-team spend visibility.
  • Centralized policy enforcement means compliance changes apply to all LlamaIndex applications without code deployments.

Next steps