LangChain + Keeptrusts: Governed RAG Pipelines
LangChain's ChatOpenAI class accepts a custom openai_api_base, which means every chain, agent, and RAG pipeline can route through the Keeptrusts gateway with a one-line change.
Use this page when
- You are routing LangChain chains, agents, or RAG pipelines through the Keeptrusts gateway.
- You need to configure ChatOpenAI with a custom openai_api_base for policy enforcement.
- You want to apply tool-level policies to LangChain agent tool calls.
- You are handling 409 policy blocks within LangChain chain invocations.
Primary audience
- Primary: Python developers building LangChain RAG or agent applications
- Secondary: AI Engineers validating governance on multi-step reasoning chains, MLOps Engineers testing pipelines
Basic Client Configuration
Python
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
    model="gpt-4o",
    openai_api_base="http://localhost:41002/v1",
    openai_api_key="sk-...",
    temperature=0,
)
response = llm.invoke("What is AI governance?")
print(response.content)
Environment-Driven Setup
import os
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
    model=os.environ.get("LLM_MODEL", "gpt-4o"),
    openai_api_base=os.environ.get("LLM_GATEWAY_URL", "http://localhost:41002/v1"),
    openai_api_key=os.environ["OPENAI_API_KEY"],
    temperature=0,
    request_timeout=30,
)
Governed RAG Pipeline
A standard retrieval-augmented generation pipeline flows through the gateway. Every call — the embedding lookup and the generation step — is policy-evaluated.
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.chains import RetrievalQA
from langchain.text_splitter import RecursiveCharacterTextSplitter
# Embeddings also route through the gateway
embeddings = OpenAIEmbeddings(
    openai_api_base="http://localhost:41002/v1",
    openai_api_key="sk-...",
)
# Build a vector store from documents
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
docs = splitter.create_documents(["Your compliance policy documents here..."])
vectorstore = FAISS.from_documents(docs, embeddings)
# LLM routed through the governed gateway
llm = ChatOpenAI(
    model="gpt-4o",
    openai_api_base="http://localhost:41002/v1",
    openai_api_key="sk-...",
    temperature=0,
)
# Build the RAG chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever(search_kwargs={"k": 3}),
)
result = qa_chain.invoke({"query": "What are the data retention requirements?"})
print(result["result"])
Every generation call in this chain passes through the gateway's policy engine — PII filters, prompt injection detection, and content policies all apply.
Streaming with LangChain
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
    model="gpt-4o",
    openai_api_base="http://localhost:41002/v1",
    openai_api_key="sk-...",
    streaming=True,
)
for chunk in llm.stream("Explain the principle of least privilege for AI agents."):
    print(chunk.content, end="", flush=True)
print()
The gateway evaluates output policies as the stream assembles. If a policy triggers, the stream terminates with a policy error.
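If you want to keep the partial output that arrived before a mid-stream block, a small wrapper helps. A minimal sketch, assuming only that the stream yields chunk objects with a .content attribute (consume_stream and on_block are illustrative names, not part of LangChain):

```python
def consume_stream(chunks, on_block=None):
    """Accumulate streamed content, tolerating a mid-stream policy stop.

    `chunks` is any iterable of objects with a .content attribute, such as
    the generator returned by llm.stream(...). If the gateway terminates
    the stream with an error, the partial text gathered so far is returned
    and the optional on_block callback receives the exception.
    """
    parts = []
    try:
        for chunk in chunks:
            parts.append(chunk.content)
    except Exception as exc:  # e.g. an APIStatusError raised by the client
        if on_block is not None:
            on_block(exc)
    return "".join(parts)
```

Pass `llm.stream(...)` as `chunks` to use it with the streaming setup above.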
Agent Tool Governance
LangChain agents that call tools send each tool invocation through the LLM. The gateway sees these as standard chat completions with function calls and can enforce tool-level policies.
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_core.tools import tool
from langchain_core.prompts import ChatPromptTemplate
@tool
def lookup_employee(employee_id: str) -> str:
    """Look up employee details by ID."""
    return f"Employee {employee_id}: Jane Doe, Engineering, Level 5"

@tool
def calculate_bonus(base_salary: float, performance_score: float) -> str:
    """Calculate employee bonus based on salary and performance."""
    bonus = base_salary * performance_score * 0.1
    return f"Calculated bonus: ${bonus:,.2f}"
llm = ChatOpenAI(
    model="gpt-4o",
    openai_api_base="http://localhost:41002/v1",
    openai_api_key="sk-...",
    temperature=0,
)
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an HR assistant. Use tools to answer questions."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])
agent = create_openai_tools_agent(llm, [lookup_employee, calculate_bonus], prompt)
executor = AgentExecutor(agent=agent, tools=[lookup_employee, calculate_bonus], verbose=True)
result = executor.invoke({"input": "What bonus does employee EMP-42 get with a $120k salary and 0.9 score?"})
print(result["output"])
Policy Config for Tool Governance
policies:
  - name: restrict-tools
    type: tool_filter
    action: block
    blocked_tools:
      - "delete_employee"
      - "modify_salary"
    message: "Tool call blocked: restricted HR operation"
  - name: log-tool-calls
    type: observe
    action: log
    match: tool_call
This blocks dangerous tool invocations at the gateway level — the agent never executes them.
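Conceptually, the tool_filter policy behaves like the check below. This Python sketch only illustrates the gateway-side logic; the gateway enforces it from the YAML config above, not from application code:

```python
# Mirrors the blocked_tools list in the restrict-tools policy above.
BLOCKED_TOOLS = {"delete_employee", "modify_salary"}

def tool_call_allowed(tool_name: str) -> bool:
    """Illustrative mirror of a tool_filter policy with action: block.

    A tool call is rejected when its name appears in the blocked set;
    everything else passes through to the model unchanged.
    """
    return tool_name not in BLOCKED_TOOLS
```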
Handling Policy Blocks in Chains
Wrap chain execution to catch 409 policy blocks:
from openai import APIStatusError
def safe_chain_invoke(chain, query: str) -> str:
    try:
        result = chain.invoke({"query": query})
        return result["result"]
    except APIStatusError as e:
        if e.status_code == 409:
            error_body = e.response.json()
            policy = error_body.get("error", {}).get("policy", "unknown")
            return f"[Blocked by policy: {policy}]"
        raise
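The error-body parsing above assumes the gateway's 409 envelope nests the policy name under error.policy. Factoring that lookup into a helper makes it reusable across chains and easy to test; a sketch (policy_from_error_body is an illustrative name):

```python
def policy_from_error_body(body: dict) -> str:
    """Extract the triggering policy name from a gateway 409 error
    envelope of the shape {"error": {"policy": "<name>", ...}}.

    Returns "unknown" when the envelope does not carry a policy field,
    so callers always get a printable value.
    """
    return body.get("error", {}).get("policy", "unknown")
```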
LCEL (LangChain Expression Language) Chains
Modern LangChain uses LCEL for composable chains:
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
llm = ChatOpenAI(
    model="gpt-4o",
    openai_api_base="http://localhost:41002/v1",
    openai_api_key="sk-...",
)
prompt = ChatPromptTemplate.from_template(
    "Summarize this document in three bullet points:\n\n{document}"
)
chain = prompt | llm | StrOutputParser()
summary = chain.invoke({"document": "Your long compliance document text here..."})
print(summary)
Every invoke, stream, or batch call flows through the gateway. No special LangChain configuration needed beyond the base URL.
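For batch workloads, you typically want one blocked input reported rather than the whole run aborted. A sketch of that pattern, where batch_with_policy_handling is an illustrative helper and invoke_one stands in for any single-input call (such as a wrapper around chain.invoke); the status-code check assumes the same 409 convention used elsewhere on this page:

```python
def batch_with_policy_handling(invoke_one, inputs):
    """Run each input through invoke_one, separating policy blocks.

    invoke_one is any callable that processes one input and raises an
    exception carrying status_code == 409 when the gateway blocks it
    (e.g. openai.APIStatusError). Returns (results, blocked_inputs);
    any other exception propagates unchanged.
    """
    results, blocked = [], []
    for item in inputs:
        try:
            results.append(invoke_one(item))
        except Exception as exc:
            if getattr(exc, "status_code", None) == 409:
                blocked.append(item)
            else:
                raise
    return results, blocked
```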
Best Practices
- Set openai_api_base via environment variable — same code across environments.
- Use request_timeout — gateway policy evaluation adds minimal latency, but set a timeout anyway.
- Catch APIStatusError around chain invocations — policy blocks propagate as 409 errors.
- Apply tool-level policies for agents — block dangerous tool calls at the gateway, not in application code.
- Log verbose agent runs during development — correlate LangChain traces with gateway decision events.
- Test with observe-only policies first — validate your RAG pipeline before enabling blocking policies.
Next steps
- LlamaIndex Integration — governed document AI pipelines
- Function Calling — deep dive into tool validation and budget limits
- Error Handling — complete error envelope reference
For AI systems
- Canonical terms: LangChain, ChatOpenAI, openai_api_base, RAG pipeline, RetrievalQA, agent, tool call, APIStatusError, policy block (409).
- Key config: ChatOpenAI(openai_api_base="http://localhost:41002/v1"). Embeddings: OpenAIEmbeddings(openai_api_base=...).
- Both LLM completions and embedding calls pass through the gateway policy chain.
- Best next pages: LlamaIndex Integration, Function Calling, Error Handling.
For engineers
- Set openai_api_base via environment variable so the same code routes through different gateways per environment.
- Every invoke, stream, or batch call flows through the gateway — no special LangChain configuration beyond the base URL.
- Route embeddings through the gateway too (OpenAIEmbeddings(openai_api_base=...)) for full observability.
- Catch APIStatusError around chain invocations to handle 409 policy blocks gracefully.
- Apply tool-level policies at the gateway for agent workflows — block dangerous tool calls without modifying application code.
- Use the request_timeout parameter to set timeouts; policy evaluation adds minimal latency.
For leaders
- LangChain is one of the most popular AI frameworks — governed RAG requires only a one-line URL change.
- Gateway-level tool policies protect against agent misuse without requiring application-level guardrails.
- All chain invocations produce decision events, providing auditability for multi-step reasoning workflows.
- Test with observe-only policies first to understand pipeline behavior before enabling blocking enforcement.