# MLflow
Keeptrusts integrates with MLflow by routing inference requests to MLflow-served models through the Keeptrusts gateway. When you deploy an LLM or chat model via MLflow Model Serving — whether on Databricks, a self-hosted MLflow server, or a custom deployment — you configure the client to point at the Keeptrusts gateway instead of the MLflow endpoint directly. The gateway enforces policies, redacts PII, and logs every inference call before forwarding to the MLflow serving endpoint.
## Use this page when
- You are routing MLflow model serving requests through Keeptrusts for governance.
- You need the gateway config for MLflow-deployed models with OpenAI-compatible endpoints.
- You want audit logging and policy enforcement on production ML model inference.
- If you want a general quickstart instead, see Quickstart.
## Primary audience
- Primary: Technical Engineers
- Secondary: AI Agents, Technical Leaders
## Prerequisites
- An MLflow deployment with MLflow Model Serving running (local, Databricks, or cloud)
- A served model endpoint with an OpenAI-compatible chat interface
- Keeptrusts CLI (`kt`) installed and authenticated (`kt auth login`)
- Network connectivity from the Keeptrusts gateway to the MLflow serving endpoint
## Configuration

### Gateway policy config: MLflow as upstream provider
```yaml
pack:
  name: mlflow-governance
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: mlflow-chat
      provider: openai:chat:my-mlflow-model
      base_url: http://mlflow-server:5000/v1
      secret_key_ref:
        env: MLFLOW_TRACKING_TOKEN

policies:
  chain:
    - prompt-injection
    - pii-detector
    - audit-logger

policy:
  prompt-injection:
    threshold: 0.8
    action: block
  pii-detector:
    action: redact
    entities:
      - PERSON
      - EMAIL_ADDRESS
      - PHONE_NUMBER
      - SSN
  audit-logger:
    immutable: true
    retention_days: 365
    log_all_access: true
```
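Note that `secret_key_ref.env` points at an environment variable rather than embedding the token, so credentials stay out of `policy-config.yaml`; the variable is exported before starting the gateway (see "Start the gateway" below).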
### Gateway config for Databricks-hosted MLflow
```yaml
pack:
  name: mlflow-databricks-governance
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: mlflow-databricks
      provider: openai:chat:my-registered-model
      base_url: https://{workspace}.databricks.net/serving-endpoints
      secret_key_ref:
        env: DATABRICKS_TOKEN

policies:
  chain:
    - prompt-injection
    - pii-detector
    - audit-logger

policy:
  prompt-injection:
    threshold: 0.8
    action: block
  pii-detector:
    action: redact
    entities:
      - PERSON
      - EMAIL_ADDRESS
  audit-logger:
    immutable: true
    retention_days: 365
    log_all_access: true
```
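Replace `{workspace}` in `base_url` with your Databricks workspace hostname, and export `DATABRICKS_TOKEN` in the gateway's environment before starting it, just as with `MLFLOW_TRACKING_TOKEN` below.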
### Start the gateway
```bash
export MLFLOW_TRACKING_TOKEN="your-token"
kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml
```
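The gateway now listens on port 41002; clients use this address as their `base_url` in step 3 below.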
## Setup steps

### 1. Deploy a model with MLflow Model Serving
```bash
mlflow models serve \
  --model-uri models:/my-chat-model/Production \
  --port 5000 \
  --host 0.0.0.0
```
Ensure the model exposes an OpenAI-compatible `/v1/chat/completions` endpoint. MLflow supports this natively for `ChatModel` flavors.
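If your model is not yet a chat flavor, the following is a minimal sketch of logging one with MLflow's `ChatModel` base class (assuming MLflow 2.12+; the echo logic is a placeholder, and exact `ChatResponse` field requirements vary by MLflow version):

```python
import mlflow
from mlflow.types.llm import ChatChoice, ChatMessage, ChatParams, ChatResponse


class EchoChatModel(mlflow.pyfunc.ChatModel):
    """Toy chat model that echoes the last user message; replace with real inference."""

    def predict(self, context, messages: list[ChatMessage], params: ChatParams) -> ChatResponse:
        reply = ChatMessage(role="assistant", content=messages[-1].content)
        return ChatResponse(choices=[ChatChoice(index=0, message=reply)])


with mlflow.start_run():
    mlflow.pyfunc.log_model(
        artifact_path="model",
        python_model=EchoChatModel(),
        registered_model_name="my-chat-model",  # must match the provider model name
    )
```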
### 2. Point the Keeptrusts gateway at the MLflow endpoint

In your policy config, set `base_url` to the MLflow serving endpoint:
```yaml
providers:
  targets:
    - id: mlflow-chat
      provider: openai:chat:my-chat-model
      base_url: http://mlflow-server:5000/v1
      secret_key_ref:
        env: MLFLOW_TRACKING_TOKEN
```
### 3. Point clients at the Keeptrusts gateway
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:41002/v1",  # the Keeptrusts gateway, not MLflow
    api_key="unused",
)

response = client.chat.completions.create(
    model="my-chat-model",
    messages=[{"role": "user", "content": "Summarize the quarterly report."}],
    max_tokens=512,
)
print(response.choices[0].message.content)
```
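The `model` value must match the model name in the provider string (`openai:chat:my-chat-model`); a mismatch surfaces as the 404 described under Troubleshooting.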
### 4. Log governance events back to MLflow (optional)
```python
import mlflow

mlflow.set_tracking_uri("http://mlflow-server:5000")

# request_count and blocked_count are placeholders: populate them from your
# own metrics collection (for example, by tallying gateway audit events).
request_count, blocked_count = 120, 3

with mlflow.start_run(run_name="governed-inference"):
    mlflow.log_param("gateway", "keeptrusts")
    mlflow.log_param("policies", "prompt-injection,pii-detector,audit-logger")
    mlflow.log_metric("requests_total", request_count)
    mlflow.log_metric("requests_blocked", blocked_count)
```
## Verification
```bash
# Verify MLflow serving is healthy
curl http://mlflow-server:5000/health

# Verify the Keeptrusts gateway is healthy
curl http://localhost:41002/health

# Send a test request through the gateway
curl -s http://localhost:41002/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-chat-model",
    "messages": [{"role": "user", "content": "Hello, world."}],
    "max_tokens": 64
  }' | jq '.choices[0].message.content'

# Check audit log
kt events list --limit 5
```
## Recommended policies
| Policy | Purpose | Recommended setting |
|---|---|---|
| `pii-detector` | Redact personal data before inference | `action: redact`, entities: PERSON, EMAIL_ADDRESS, SSN |
| `prompt-injection` | Block adversarial prompts targeting served models | `threshold: 0.8`, `action: block` |
| `audit-logger` | Full audit trail for model inference in production | `retention_days: 365`, `immutable: true` |
| `rbac` | Restrict which teams or services can call which models | Map service identities to Keeptrusts roles (see the sketch below) |
| `dlp-filter` | Prevent model artifacts or credentials in prompts | Block MLflow tracking URIs and tokens |
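The `rbac` and `dlp-filter` policies do not appear in the sample configs above. The sketch below shows how they might slot into the policy chain; the field names under each policy (`roles`, `block_patterns`) are illustrative assumptions, so check the Policy controls catalog for the actual schema:

```yaml
policies:
  chain:
    - rbac            # reject callers without an allowed role
    - dlp-filter      # block secrets before they reach the model
    - prompt-injection
    - pii-detector
    - audit-logger

policy:
  rbac:
    # Illustrative only: map service identities to the targets they may call.
    roles:
      ml-platform:
        - mlflow-chat
  dlp-filter:
    # Illustrative only: block MLflow tracking URIs and tokens in prompts.
    action: block
    block_patterns:
      - "mlflow-server:5000"
      - "MLFLOW_TRACKING_TOKEN"
```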
## Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| Gateway returns 502 | MLflow serving endpoint unreachable | Verify `base_url` and network connectivity to MLflow (see the check below) |
| Model not found (404) | Model name mismatch | Ensure the provider model name matches the MLflow registered model name |
| Auth failure (401) | Token missing or expired | Set `MLFLOW_TRACKING_TOKEN` and restart the gateway |
| Slow inference | Policy chain adds latency | Profile each policy; consider removing `pii-detector` for non-PII workloads |
| MLflow UI shows no governance data | Not logging back to MLflow | Add the optional MLflow tracking step (step 4) |
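For the 502 case, a quick connectivity check from the gateway host, using the health endpoint from the Verification section:

```bash
# Run on the host where the Keeptrusts gateway runs
curl -sf http://mlflow-server:5000/health && echo "MLflow reachable" || echo "MLflow unreachable"
```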
## For AI systems
- Canonical terms: Keeptrusts gateway, MLflow, MLflow Model Serving, registered model, model endpoint, OpenAI-compatible, `policy-config.yaml`.
- Config field names: `provider`, `base_url`, `secret_key_ref.env`, `pii-detector`, `audit-logger`.
- Key behavior: Keeptrusts acts as a reverse proxy in front of MLflow Model Serving, enforcing policies on inference requests before they reach the model.
- Best next pages: Databricks integration, W&B integration, Policy controls catalog.
## For engineers

**Prerequisites**
- MLflow with Model Serving running (local or Databricks), OpenAI-compatible endpoint, `kt` CLI installed.
**Validation**
- Send a test inference through the gateway and verify the response matches direct MLflow output (a comparison sketch follows this list).
- Run `kt events list --limit 5` and confirm the request was logged with policy decisions.
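A minimal comparison sketch, assuming the MLflow endpoint is directly reachable at the `base_url` from the config above; with nonzero sampling temperature the two outputs may legitimately differ, so treat a mismatch as a prompt for inspection rather than a hard failure:

```python
from openai import OpenAI

PROMPT = [{"role": "user", "content": "Reply with exactly: pong"}]


def ask(base_url: str) -> str:
    client = OpenAI(base_url=base_url, api_key="unused")
    resp = client.chat.completions.create(
        model="my-chat-model",
        messages=PROMPT,
        max_tokens=16,
        temperature=0,  # reduce sampling variance for the comparison
    )
    return resp.choices[0].message.content


direct = ask("http://mlflow-server:5000/v1")  # straight to MLflow serving
gated = ask("http://localhost:41002/v1")      # through the Keeptrusts gateway

print("direct:", direct)
print("gated: ", gated)
print("match: ", direct == gated)
```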
## For leaders
- MLflow is the de facto standard for model lifecycle management. Adding Keeptrusts governance at the serving layer ensures that every production inference call is audited and policy-checked without modifying the model or the MLflow deployment.
- Audit trails satisfy SOC 2 and internal AI governance requirements for documenting model inference in production.
- Role-based access at the gateway prevents unauthorized services from calling expensive or sensitive models.
## Next steps
- Databricks integration — Databricks-hosted MLflow with Foundation Models
- W&B integration — experiment tracking alongside governance
- Policy controls catalog — full reference for all policy types
- Quickstart — install `kt` and run your first gateway