Quality Benchmarking Template

Policy configuration for enforcing response quality thresholds.

Use this page when

  • You want to enforce minimum quality standards on AI responses before they reach end users.
  • You need a starting config with model-graded closed-QA and factual assertions that block or escalate low-quality outputs.
  • You want to go from zero to a quality-gated gateway with kt init --template quality-benchmarking.

Primary audience

  • Primary: Technical Engineers
  • Secondary: AI Agents, Technical Leaders

Policy Config

pack:
  name: quality-benchmarking
  version: 0.1.0
  enabled: true
  description: AI response quality assurance
policies:
  chain:
    - prompt-injection
    - quality-scorer
    - audit-logger
  policy:
    prompt-injection:
      response:
        action: block
        message: "Request blocked: potential prompt injection detected"
    quality-scorer:
      providers:
        - id: quality-judge
          provider: openai
          model: gpt-4o
          secret_key_ref:
            env: OPENAI_API_KEY
          config:
            temperature: 0.0
      assertions:
        - type: llm-rubric
          name: closed-qa-correctness
          threshold: 0.8
          mode: enforce
          severity: critical
          config:
            rubric: Evaluate whether the answer directly resolves the user question and avoids unsupported claims.
        - type: factuality
          name: factual-grounding
          threshold: 0.7
          mode: enforce
          severity: critical
          config:
            reference_statement: The response must remain faithful to the approved source material or retrieved context.
      pass_policy:
        strategy: weighted_average
        threshold: 0.75
        failure_action:
          action: block
    audit-logger:
      retention_days: 365
providers:
  targets:
    - id: openai-primary
      provider: openai
      model: gpt-4o-mini
      secret_key_ref:
        env: OPENAI_API_KEY

Quick Start

# Save the Policy Config example on this page as policy-config.yaml
export OPENAI_API_KEY="sk-your-openai-key"
kt policy lint --file policy-config.yaml
kt gateway run \
  --listen 0.0.0.0:41002 \
  --policy-config policy-config.yaml

Set OPENAI_API_KEY before running the gateway. The upstream provider and the judge model both resolve that secret through secret_key_ref.

If you prefer the seeded starter, run kt init --template quality-benchmarking first and then add the provider block shown in the example config before linting and running.

For engineers

  • Prerequisites: kt CLI installed, an LLM provider API key (e.g., OPENAI_API_KEY).
  • Validate: kt policy lint --file policy-config.yaml must pass.
  • Test: send a query and check that low-quality responses (vague, factually incorrect) are blocked based on the assertion thresholds and pass policy.
  • Key tuning: adjust pass_policy.threshold (0.75 here) and each assertion threshold based on the acceptable quality floor for your use case.
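When tuning the thresholds, it helps to see how they interact. The sketch below is an illustrative Python model only, not the gateway's actual implementation: it assumes equal assertion weights and one plausible reading of the config, namely that each enforced assertion must clear its own threshold and the weighted average must also clear pass_policy.threshold. The function name is hypothetical.

```python
# Hypothetical model of the quality-scorer gate; the real policy engine
# may weight or combine scores differently.

ASSERTION_THRESHOLDS = {
    "closed-qa-correctness": 0.8,   # llm-rubric assertion, mode: enforce
    "factual-grounding": 0.7,       # factuality assertion, mode: enforce
}

def passes_quality_gate(scores, pass_threshold=0.75):
    """Block unless every enforced assertion clears its own threshold
    and the equal-weight average clears pass_policy.threshold."""
    if any(scores[name] < t for name, t in ASSERTION_THRESHOLDS.items()):
        return False
    avg = sum(scores.values()) / len(scores)
    return avg >= pass_threshold

# Both assertions clear their thresholds; average (0.81) clears 0.75:
print(passes_quality_gate({"closed-qa-correctness": 0.90,
                           "factual-grounding": 0.72}))   # True
# factual-grounding (0.68) misses its 0.7 threshold, so the response is
# blocked even though the average (0.765) would otherwise pass:
print(passes_quality_gate({"closed-qa-correctness": 0.85,
                           "factual-grounding": 0.68}))   # False
```

Under this reading, raising an individual assertion threshold tightens the gate even when the overall average stays high, so tune the per-assertion values and pass_policy.threshold together.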

For leaders

  • This template ensures AI responses meet a measurable quality bar before reaching users, reducing the risk of incorrect or low-value outputs.
  • Quality scoring runs as an output-phase policy — it adds latency proportional to the judge model's response time but provides objective quality metrics.
  • Audit logging captures quality scores per request, enabling trend analysis and SLA reporting.
  • Pair with a human-oversight policy to escalate borderline responses for review rather than hard-blocking them.

Next steps