Skip to main content
Browse docs
By Audience
Getting Started
Configuration
Use Cases
IDE Integration
Third-Party Integrations
Engineering Cache
Console
API Reference
Gateway
Workflow Guides
Templates
Providers and SDKs
Industry Guides
Advanced Guides
Browse by Role
Deployment Guides
In-Depth Guides
Tutorials
FAQ

MLflow

Keeptrusts integrates with MLflow by routing inference requests to MLflow-served models through the Keeptrusts gateway. When you deploy an LLM or chat model via MLflow Model Serving — whether on Databricks, a self-hosted MLflow server, or a custom deployment — you configure the client to point at the Keeptrusts gateway instead of the MLflow endpoint directly. The gateway enforces policies, redacts PII, and logs every inference call before forwarding to the MLflow serving endpoint.

Use this page when

  • You are routing MLflow model serving requests through Keeptrusts for governance.
  • You need the gateway config for MLflow-deployed models with OpenAI-compatible endpoints.
  • You want audit logging and policy enforcement on production ML model inference.
  • If you want a general quickstart instead, see Quickstart.

Primary audience

  • Primary: Technical Engineers
  • Secondary: AI Agents, Technical Leaders

Prerequisites

  • An MLflow deployment with MLflow Model Serving running (local, Databricks, or cloud)
  • A served model endpoint with an OpenAI-compatible chat interface
  • Keeptrusts CLI (kt) installed and authenticated (kt auth login)
  • Network connectivity from the Keeptrusts gateway to the MLflow serving endpoint

Configuration

Gateway policy config — MLflow as upstream provider

pack:
name: mlflow-governance
version: 1.0.0
enabled: true
providers:
targets:
- id: mlflow-chat
provider: openai:chat:my-mlflow-model
base_url: http://mlflow-server:5000/v1
secret_key_ref:
env: MLFLOW_TRACKING_TOKEN
policies:
chain:
- prompt-injection
- pii-detector
- audit-logger
policy:
prompt-injection:
threshold: 0.8
action: block
pii-detector:
action: redact
entities:
- PERSON
- EMAIL_ADDRESS
- PHONE_NUMBER
- SSN
audit-logger:
immutable: true
retention_days: 365
log_all_access: true

Gateway config for Databricks-hosted MLflow

pack:
name: mlflow-databricks-governance
version: 1.0.0
enabled: true
providers:
targets:
- id: mlflow-databricks
provider: openai:chat:my-registered-model
base_url: https://{workspace}.databricks.net/serving-endpoints
secret_key_ref:
env: DATABRICKS_TOKEN
policies:
chain:
- prompt-injection
- pii-detector
- audit-logger
policy:
prompt-injection:
threshold: 0.8
action: block
pii-detector:
action: redact
entities:
- PERSON
- EMAIL_ADDRESS
audit-logger:
immutable: true
retention_days: 365
log_all_access: true

Start the gateway

export MLFLOW_TRACKING_TOKEN="your-token"
kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml

Setup steps

1. Deploy a model with MLflow Model Serving

mlflow models serve \
--model-uri models:/my-chat-model/Production \
--port 5000 \
--host 0.0.0.0

Ensure the model exposes an OpenAI-compatible /v1/chat/completions endpoint. MLflow supports this natively for ChatModel flavors.

2. Point the Keeptrusts gateway at the MLflow endpoint

In your policy config, set base_url to the MLflow serving endpoint:

providers:
targets:
- id: mlflow-chat
provider: openai:chat:my-chat-model
base_url: http://mlflow-server:5000/v1
secret_key_ref:
env: MLFLOW_TRACKING_TOKEN

3. Point clients at the Keeptrusts gateway

from openai import OpenAI

client = OpenAI(
base_url="http://localhost:41002/v1",
api_key="unused",
)

response = client.chat.completions.create(
model="my-chat-model",
messages=[{"role": "user", "content": "Summarize the quarterly report."}],
max_tokens=512,
)
print(response.choices[0].message.content)

4. Log governance events back to MLflow (optional)

import mlflow

mlflow.set_tracking_uri("http://mlflow-server:5000")
with mlflow.start_run(run_name="governed-inference"):
mlflow.log_param("gateway", "keeptrusts")
mlflow.log_param("policies", "prompt-injection,pii-detector,audit-logger")
mlflow.log_metric("requests_total", request_count)
mlflow.log_metric("requests_blocked", blocked_count)

Verification

# Verify MLflow serving is healthy
curl http://mlflow-server:5000/health

# Verify the Keeptrusts gateway is healthy
curl http://localhost:41002/health

# Send a test request through the gateway
curl -s http://localhost:41002/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "my-chat-model",
"messages": [{"role": "user", "content": "Hello, world."}],
"max_tokens": 64
}' | jq .choices[0].message.content

# Check audit log
kt events list --limit 5
PolicyPurposeRecommended setting
pii-detectorRedact personal data before inferenceaction: redact, entities: PERSON, EMAIL, SSN
prompt-injectionBlock adversarial prompts targeting served modelsthreshold: 0.8, action: block
audit-loggerFull audit trail for model inference in productionretention_days: 365, immutable: true
rbacRestrict which teams or services can call which modelsMap service identities to Keeptrusts roles
dlp-filterPrevent model artifacts or credentials in promptsBlock MLflow tracking URIs and tokens

Troubleshooting

SymptomCauseFix
Gateway returns 502MLflow serving endpoint unreachableVerify base_url and network connectivity to MLflow
Model not found (404)Model name mismatchEnsure provider model name matches the MLflow registered model name
Auth failure (401)Token missing or expiredSet MLFLOW_TRACKING_TOKEN and restart the gateway
Slow inferencePolicy chain adds latencyProfile each policy; consider removing pii-detector for non-PII workloads
MLflow UI shows no governance dataNot logging back to MLflowAdd the optional MLflow tracking step (step 4)

For AI systems

  • Canonical terms: Keeptrusts gateway, MLflow, MLflow Model Serving, registered model, model endpoint, OpenAI-compatible, policy-config.yaml.
  • Config field names: provider, base_url, secret_key_ref.env, pii-detector, audit-logger.
  • Key behavior: Keeptrusts acts as a reverse proxy in front of MLflow Model Serving, enforcing policies on inference requests before they reach the model.
  • Best next pages: Databricks integration, W&B integration, Policy controls catalog.

For engineers

Prerequisites

  • MLflow with Model Serving running (local or Databricks), OpenAI-compatible endpoint, kt CLI installed.

Validation

  • Send a test inference through the gateway and verify the response matches direct MLflow output.
  • Run kt events list --limit 5 and confirm the request was logged with policy decisions.

For leaders

  • MLflow is the de facto standard for model lifecycle management. Adding Keeptrusts governance at the serving layer ensures that every production inference call is audited and policy-checked without modifying the model or the MLflow deployment.
  • Audit trails satisfy SOC 2 and internal AI governance requirements for documenting model inference in production.
  • Role-based access at the gateway prevents unauthorized services from calling expensive or sensitive models.

Next steps