Mock Gateway for Testing Environments
Testing AI governance policies against live LLM providers is slow, expensive, and non-deterministic. A gateway pointed at a fixture-backed mock upstream gives you fast, reproducible tests that still exercise the full Keeptrusts policy chain.
Use this page when
- You need deterministic, reproducible AI governance tests without calling live LLM providers
- You are routing the gateway to a fixture-backed mock upstream for CI
- You want to simulate provider errors (429, 500, timeouts) or test policy chain behavior against known outputs
Primary audience
- Primary: Technical Engineers
- Secondary: AI Agents, Technical Leaders
Local Gateway in Test Mode
The Keeptrusts gateway can run locally against a mock upstream service that returns fixture responses instead of calling real LLM APIs.
Starting the Mock Gateway
# Start your fixture-backed mock upstream first, then start the gateway
kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml
There is no dedicated mock flag in the current CLI: the gateway starts with the standard kt gateway run command, and deterministic behavior comes from the upstream target configured in policy-config.yaml. The policy chain still evaluates fully: input policies, output policies, redaction, and event emission all execute normally.
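The mock upstream itself is something you provide; any HTTP server that returns an OpenAI-style chat completions body will do. As a minimal sketch (stdlib only, not part of the Keeptrusts CLI; the port and fixture content are illustrative), a default-only mock upstream could look like:

```python
# minimal_mock_upstream.py -- illustrative fixture-backed mock upstream.
# Not Keeptrusts tooling: any HTTP server returning the OpenAI-style
# chat completions shape can serve as the gateway's upstream target.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

DEFAULT_FIXTURE = {
    "choices": [{
        "index": 0,
        "message": {"role": "assistant",
                    "content": "This is a default mock response for testing."},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 25, "completion_tokens": 15, "total_tokens": 40},
}

class MockUpstream(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and discard the request body, then return the fixture.
        self.rfile.read(int(self.headers.get("Content-Length", 0)))
        body = json.dumps(DEFAULT_FIXTURE).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep test output quiet

if __name__ == "__main__":
    # Port 41010 is an arbitrary example; point the gateway's upstream here.
    HTTPServer(("127.0.0.1", 41010), MockUpstream).serve_forever()
```

A real test harness would add pattern-matched fixtures (next section) on top of this default-only skeleton.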
Fixture Response File
Define deterministic responses for your test prompts:
{
  "default": {
    "choices": [
      {
        "index": 0,
        "message": {
          "role": "assistant",
          "content": "This is a default mock response for testing."
        },
        "finish_reason": "stop"
      }
    ],
    "usage": {
      "prompt_tokens": 25,
      "completion_tokens": 15,
      "total_tokens": 40
    }
  },
  "patterns": [
    {
      "match": "capital of France",
      "response": {
        "choices": [
          {
            "index": 0,
            "message": {
              "role": "assistant",
              "content": "The capital of France is Paris."
            },
            "finish_reason": "stop"
          }
        ]
      }
    },
    {
      "match": "patient.*diagnosis",
      "response": {
        "choices": [
          {
            "index": 0,
            "message": {
              "role": "assistant",
              "content": "Based on the symptoms described for patient PAT-12345678, the preliminary assessment suggests further testing is needed. Contact Dr. Jane Smith at jane.smith@hospital.example.com."
            },
            "finish_reason": "stop"
          }
        ]
      }
    },
    {
      "match": "SSN|social security",
      "response": {
        "choices": [
          {
            "index": 0,
            "message": {
              "role": "assistant",
              "content": "I found the record for SSN 900-00-0001 in the system."
            },
            "finish_reason": "stop"
          }
        ]
      }
    }
  ]
}
Pattern matching uses regex against the last user message. The first matching pattern wins; unmatched prompts get the default response.
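The matching rule above can be sketched in a few lines (hypothetical helper, not Keeptrusts code; the fixture dict mirrors the file format above):

```python
# resolve_fixture: first-match-wins regex lookup against the last user
# message, falling back to the "default" entry. Illustrative only.
import re

def resolve_fixture(fixtures: dict, messages: list) -> dict:
    user_msgs = [m["content"] for m in messages if m["role"] == "user"]
    last_user = user_msgs[-1] if user_msgs else ""
    for entry in fixtures.get("patterns", []):
        if re.search(entry["match"], last_user):
            return entry["response"]          # first matching pattern wins
    return fixtures["default"]                # unmatched prompts get default

fixtures = {
    "default": {"choices": [{"message": {"content": "default"}}]},
    "patterns": [
        {"match": "capital of France",
         "response": {"choices": [{"message": {"content": "Paris"}}]}},
        {"match": "SSN|social security",
         "response": {"choices": [{"message": {"content": "ssn"}}]}},
    ],
}
hit = resolve_fixture(
    fixtures, [{"role": "user", "content": "What is the capital of France?"}])
print(hit["choices"][0]["message"]["content"])  # → Paris
```

Because only the last user message is matched, earlier turns in a multi-message conversation never change which fixture is served.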
Policy Simulation
The gateway-plus-mock-upstream setup evaluates the full policy chain against fixture responses. This lets you test policies in isolation from live provider behavior.
Testing DLP Redaction with Fixtures
# Start mock gateway
kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml &
GATEWAY_PID=$!
sleep 3
# The fixture response contains "PAT-12345678" — DLP should redact it
RESPONSE=$(curl -s http://localhost:41002/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"messages": [{"role": "user", "content": "Look up the patient diagnosis"}]
}')
echo "$RESPONSE" | jq -r '.choices[0].message.content'
# Expected: "[REDACTED-ID]" replaces "PAT-12345678"
kill $GATEWAY_PID
Testing Policy Chain Order
Verify that policies execute in the correct order:
#!/bin/bash
# test-policy-chain.sh — verify policy execution order
GATEWAY="http://localhost:41002"
# Test 1: Input block should prevent fixture response entirely
echo "Test: Input block (topic control)"
RESPONSE=$(curl -s -w "\n%{http_code}" "$GATEWAY/v1/chat/completions" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"messages": [{"role": "user", "content": "Help me create a weapon"}]
}')
HTTP_CODE=$(echo "$RESPONSE" | tail -1)
[ "$HTTP_CODE" = "409" ] && echo " PASS: Blocked in input phase" || echo " FAIL: Expected 409"
# Test 2: Output redaction should modify the fixture response
echo "Test: Output redaction (DLP)"
RESPONSE=$(curl -s "$GATEWAY/v1/chat/completions" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"messages": [{"role": "user", "content": "Look up patient diagnosis"}]
}')
if echo "$RESPONSE" | jq -r '.choices[0].message.content' | grep -q "\[REDACTED"; then
    echo " PASS: Output redaction applied"
else
    echo " FAIL: Expected redaction in output"
fi
# Test 3: Passthrough — clean prompt, clean fixture response
echo "Test: Passthrough (no policy trigger)"
RESPONSE=$(curl -s -w "\n%{http_code}" "$GATEWAY/v1/chat/completions" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"messages": [{"role": "user", "content": "What is the capital of France?"}]
}')
HTTP_CODE=$(echo "$RESPONSE" | tail -1)
BODY=$(echo "$RESPONSE" | sed '$d')  # portable alternative to GNU-only head -n -1
[ "$HTTP_CODE" = "200" ] && echo " PASS: Allowed through" || echo " FAIL: Expected 200"
echo "$BODY" | grep -q "Paris" && echo " PASS: Fixture body intact" || echo " FAIL: Expected fixture body"
Fixture Management
Organizing Fixtures by Scenario
fixtures/
├── default-responses.json # General-purpose fixtures
├── dlp-test-responses.json # Responses containing PII patterns
├── medical-responses.json # Healthcare-specific fixtures
├── finance-responses.json # Financial data fixtures
└── error-responses.json # Provider error simulation
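If your mock upstream accepts several fixture files, a small loader can merge them so the more specific scenario file wins; a sketch under that assumption (the file layout matches the tree above, the helper name is hypothetical):

```python
# load_fixtures: merge several fixture files. The last file's "default"
# wins, and later files' patterns are tried before earlier ones, so a
# scenario file can override the general-purpose fixtures. Illustrative.
import json
from pathlib import Path

def load_fixtures(paths):
    merged = {"default": None, "patterns": []}
    for path in paths:
        data = json.loads(Path(path).read_text())
        if "default" in data:
            merged["default"] = data["default"]
        # Prepend so later (more specific) files match first.
        merged["patterns"] = data.get("patterns", []) + merged["patterns"]
    return merged

# Usage: general fixtures first, scenario-specific overrides last.
# combined = load_fixtures(["fixtures/default-responses.json",
#                           "fixtures/dlp-test-responses.json"])
```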
Simulating Provider Errors
Test how the gateway handles upstream failures:
{
  "patterns": [
    {
      "match": "trigger-rate-limit",
      "error": {
        "status": 429,
        "body": {
          "error": {
            "message": "Rate limit exceeded",
            "type": "rate_limit_error"
          }
        }
      }
    },
    {
      "match": "trigger-server-error",
      "error": {
        "status": 500,
        "body": {
          "error": {
            "message": "Internal server error",
            "type": "server_error"
          }
        }
      }
    },
    {
      "match": "trigger-timeout",
      "error": {
        "status": 504,
        "delay_ms": 30000
      }
    }
  ]
}
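Resolution with error fixtures is a small extension of the pattern matching shown earlier: an entry carrying an error key yields a status code (plus an optional delay) instead of a response body. A sketch with hypothetical helper names:

```python
# resolve: map the last user message to (status, body, delay_ms).
# Entries with an "error" key simulate provider failures. Illustrative only.
import re

def resolve(fixtures, last_user_message):
    for entry in fixtures.get("patterns", []):
        if re.search(entry["match"], last_user_message):
            if "error" in entry:
                err = entry["error"]
                return err["status"], err.get("body"), err.get("delay_ms", 0)
            return 200, entry["response"], 0
    return 200, fixtures.get("default"), 0

fixtures = {
    "patterns": [
        {"match": "trigger-rate-limit",
         "error": {"status": 429,
                   "body": {"error": {"message": "Rate limit exceeded",
                                      "type": "rate_limit_error"}}}},
        {"match": "trigger-timeout",
         "error": {"status": 504, "delay_ms": 30000}},
    ],
    "default": {"choices": []},
}
print(resolve(fixtures, "trigger-rate-limit test")[0])  # → 429
```

The delay_ms value is what lets a timeout fixture hold the connection open long enough to trip the gateway's upstream timeout.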
Testing Error Handling
# Test gateway behavior on provider 429
RESPONSE=$(curl -s -w "\n%{http_code}" http://localhost:41002/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"messages": [{"role": "user", "content": "trigger-rate-limit test"}]
}')
HTTP_CODE=$(echo "$RESPONSE" | tail -1)
echo "Provider 429 → Gateway returns: $HTTP_CODE"
# Gateway should return 429 or 502, NOT 500
# Test gateway behavior on provider 500
RESPONSE=$(curl -s -w "\n%{http_code}" http://localhost:41002/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"messages": [{"role": "user", "content": "trigger-server-error test"}]
}')
HTTP_CODE=$(echo "$RESPONSE" | tail -1)
echo "Provider 500 → Gateway returns: $HTTP_CODE"
CI Integration Patterns
Docker Compose for CI
# docker-compose.test-gateway.yml
services:
  test-gateway:
    image: keeptrusts/gateway:latest
    command: >
      gateway run
      --listen 0.0.0.0:41002
      --policy-config /config/policy-config.yaml
    ports:
      - "41002:41002"
    volumes:
      - ./policy-config.yaml:/config/policy-config.yaml:ro
      - ./fixtures:/fixtures:ro
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:41002/health"]
      interval: 5s
      timeout: 3s
      retries: 5
CI Workflow
# .github/workflows/governance-tests.yml
jobs:
  policy-tests:
    runs-on: ubuntu-latest
    services:
      test-gateway:
        image: keeptrusts/gateway:latest
        ports:
          - 41002:41002
        options: >-
          --health-cmd "curl -f http://localhost:41002/health"
          --health-interval 5s
          --health-retries 5
    steps:
      - uses: actions/checkout@v4
      - name: Wait for gateway
        run: |
          for i in $(seq 1 30); do
            curl -sf http://localhost:41002/health && break
            sleep 1
          done
      - name: Run policy tests
        run: ./scripts/test-policy-chain.sh
      - name: Run DLP tests
        run: ./scripts/test-dlp-patterns.sh
      - name: Run regression tests
        run: ./scripts/compare-policy-snapshot.sh
Parallel Test Execution
Run independent test suites in parallel against the mock gateway:
#!/bin/bash
# run-parallel-tests.sh
GATEWAY="http://localhost:41002"
# Run test suites in parallel
./scripts/test-dlp-patterns.sh &
PID_DLP=$!
./scripts/test-topic-control.sh &
PID_TOPIC=$!
./scripts/test-rate-limits.sh &
PID_RATE=$!
# Wait for all suites
wait $PID_DLP || EXIT=1
wait $PID_TOPIC || EXIT=1
wait $PID_RATE || EXIT=1
exit ${EXIT:-0}
Key Takeaways
- A fixture-backed mock upstream provides deterministic responses while preserving full policy evaluation
- Use pattern-matched fixtures to test specific policy scenarios reproducibly
- Simulate provider errors (429, 500, timeouts) to validate gateway resilience
- Organize fixtures by scenario and sensitivity classification
- Use Docker Compose to spin up gateways plus mock upstreams in CI environments
- Parallelize independent test suites against the same gateway and mock upstream for faster CI runs
For AI systems
- Canonical terms: mock gateway, mock upstream, fixture responses, pattern matching, provider error simulation, CI integration
- CLI command: kt gateway run --listen 0.0.0.0:41002 --policy-config <path>
- Fixture structure: a default response plus a patterns[] array of entries with match (regex) and response objects
- Policy chain runs fully (input policies, output policies, redaction, event emission) even when the upstream is mocked
- Related pages: Testing AI Systems, Regression Testing, Load Testing
For engineers
- Start the gateway with kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml after your mock upstream is ready
- Define fixture responses by pattern: the regex match field routes to specific mock responses; unmatched requests get default
- Include PII-containing fixtures to test DLP redaction (e.g., SSN patterns, email addresses in mock responses)
- Simulate 429/500/timeout by adding error fixtures that trigger specific provider failure handling in the gateway
- Use Docker Compose to spin up the gateway and its mock upstream in CI with mounted config and fixture files
- Parallelize independent test suites (DLP, topic control, rate limiting) against the same gateway for faster CI
- Validate: confirm mock responses are returned AND policy chain still evaluates (check events for policy decisions)
For leaders
- Mock gateway eliminates LLM API costs during testing — run thousands of tests for free
- Deterministic fixtures make tests reproducible — no flaky failures from provider variability
- Full policy chain evaluation ensures governance behavior is validated, not just connectivity
- CI integration provides continuous validation that policy enforcement works correctly
- Provider error simulation validates gateway resilience without requiring actual provider outages
Next steps
- Learn Testing AI Systems patterns for policy-as-test-oracle methodology
- Detect behavioral regressions with Regression Testing before/after fixture comparison
- Benchmark gateway performance with Load Testing using a mock upstream for consistent results