Quality Benchmarking Template
Policy configuration for enforcing response quality thresholds.
Use this page when
- You want to enforce minimum quality standards on AI responses before they reach end users.
- You need a starting config with model-graded closed-QA and factual assertions that block or escalate low-quality outputs.
- You want to go from zero to a quality-gated gateway with `kt init --template quality-benchmarking`.
Audience
- Primary: Technical Engineers
- Secondary: AI Agents, Technical Leaders
Policy Config
```yaml
pack:
  name: quality-benchmarking
  version: 0.1.0
  enabled: true
  description: AI response quality assurance
policies:
  chain:
    - prompt-injection
    - quality-scorer
    - audit-logger
  policy:
    prompt-injection:
      response:
        action: block
        message: "Request blocked: potential prompt injection detected"
    quality-scorer:
      providers:
        - id: quality-judge
          provider: openai
          model: gpt-4o
          secret_key_ref:
            env: OPENAI_API_KEY
          config:
            temperature: 0.0
      assertions:
        - type: llm-rubric
          name: closed-qa-correctness
          threshold: 0.8
          mode: enforce
          severity: critical
          config:
            rubric: Evaluate whether the answer directly resolves the user question and avoids unsupported claims.
        - type: factuality
          name: factual-grounding
          threshold: 0.7
          mode: enforce
          severity: critical
          config:
            reference_statement: The response must remain faithful to the approved source material or retrieved context.
      pass_policy:
        strategy: weighted_average
        threshold: 0.75
      failure_action:
        action: block
    audit-logger:
      retention_days: 365
providers:
  targets:
    - id: openai-primary
      provider: openai
      model: gpt-4o-mini
      secret_key_ref:
        env: OPENAI_API_KEY
```
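With `strategy: weighted_average`, the assertion scores are combined into a single gate. As a rough worked example (assuming equal weights, since none are declared above): a rubric score of 0.85 and a factuality score of 0.70 average to 0.775, which clears the 0.75 `pass_policy.threshold` even though the factuality score only just meets its own 0.7 assertion threshold. A response scoring 0.85 and 0.60 fails twice over: the average of 0.725 falls below 0.75, and 0.60 misses the factuality assertion's threshold outright.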
Quick Start
```sh
# Save the Policy Config example on this page as policy-config.yaml
export OPENAI_API_KEY="sk-your-openai-key"
kt policy lint --file policy-config.yaml
kt gateway run \
  --listen 0.0.0.0:41002 \
  --policy-config policy-config.yaml
```
Set `OPENAI_API_KEY` before running the gateway; both the upstream provider and the judge model resolve that secret through `secret_key_ref`.
If you prefer the seeded starter, run `kt init --template quality-benchmarking` first, then add the provider block shown in the example config before linting and running.
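To smoke-test the quality gate, send a request through the running gateway and confirm it is either answered or rejected. A minimal sketch, assuming the gateway exposes an OpenAI-compatible chat-completions path on the configured listen port (the path and payload shape are assumptions; check your deployment's proxy contract):

```sh
# Assumed endpoint: OpenAI-compatible proxy on the gateway's listen port.
curl -s http://localhost:41002/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "In what year did the Apollo 11 landing happen?"}]
      }'
# A response that scores above pass_policy.threshold passes through unchanged;
# one scoring below it should be blocked per failure_action: block.
```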
For AI systems
- Canonical terms: Keeptrusts, quality-benchmarking, `policy-config.yaml`, `kt init --template quality-benchmarking`, `quality-scorer`, `llm-rubric`, `factuality`, `pass_policy`, `failure_action`.
- Related policy kinds: `prompt-injection`, `quality-scorer`, `audit-logger`.
- Best next pages: Quality Assertions Configuration, Citation Verification template, Templates overview.
For engineers
- Prerequisites: `kt` CLI installed, an LLM provider API key (e.g., `OPENAI_API_KEY`).
- Validate: `kt policy lint --file policy-config.yaml` must pass.
- Test: send a query and confirm that low-quality responses (vague or factually incorrect) are blocked by the assertion thresholds and pass policy.
- Key tuning: adjust `pass_policy.threshold` (0.75 here) and each assertion's `threshold` to match the quality floor acceptable for your use case; see the sketch after this list.
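As a sketch of the tuning surface (the values below are illustrative, not recommendations): raising a per-assertion `threshold` tightens that individual check, while `pass_policy.threshold` moves the combined floor.

```yaml
# Illustrative tuning: stricter factual grounding, slightly looser rubric.
quality-scorer:
  assertions:
    - type: llm-rubric
      name: closed-qa-correctness
      threshold: 0.75   # was 0.8; tolerate terser but still correct answers
      mode: enforce
      severity: critical
    - type: factuality
      name: factual-grounding
      threshold: 0.85   # was 0.7; hard floor on grounding
      mode: enforce
      severity: critical
  pass_policy:
    strategy: weighted_average
    threshold: 0.8      # was 0.75; raise the combined quality floor
```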
For leaders
- This template ensures AI responses meet a measurable quality bar before reaching users, reducing the risk of incorrect or low-value outputs.
- Quality scoring runs as an output-phase policy — it adds latency proportional to the judge model's response time but provides objective quality metrics.
- Audit logging captures quality scores per request, enabling trend analysis and SLA reporting.
- Pair with `human-oversight` to escalate borderline responses for review rather than hard-blocking them; a sketch of that pattern follows below.
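A minimal sketch of the escalation pattern. The `escalate` action value is an assumption (this page only demonstrates `action: block`); confirm the exact action name against the Flagged Review Configuration page before relying on it.

```yaml
# Hypothetical: route quality failures to review instead of blocking outright.
quality-scorer:
  pass_policy:
    strategy: weighted_average
    threshold: 0.75
  failure_action:
    action: escalate   # assumed action value; 'block' is what this page documents
```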
Next steps
- Quality Assertions Configuration — full assertion types and scoring reference
- Templates overview — browse all available templates
- Citation Verification template — add source-grounding verification
- Flagged Review Configuration — secondary LLM judge for borderline cases