Accessibility Testing AI Outputs

AI-generated content must be accessible to all users, including those with disabilities. When AI systems produce content that fails accessibility standards, the impact scales rapidly — a single misconfigured model can generate thousands of inaccessible outputs. Governance policies enforce accessibility standards at the gateway level, ensuring every AI output meets WCAG guidelines before reaching users.

Use this page when

  • You need to enforce readability and accessibility standards on AI-generated content at the gateway level
  • You are configuring readability scoring policies, alt-text governance, or inclusive language filters
  • You want to build CI-integrated accessibility test suites for AI outputs (WCAG compliance)

Primary audience

  • Primary: Technical Engineers
  • Secondary: AI Agents, Technical Leaders

AI-Generated Content Accessibility Challenges

AI outputs introduce unique accessibility concerns:

  • Unstructured text — LLMs may produce wall-of-text responses without headings or lists
  • Complex language — technical jargon and high reading levels exclude users with cognitive disabilities
  • Missing alt-text — AI-generated image descriptions may be absent or inadequate
  • Inconsistent formatting — response structure varies between requests, breaking screen readers
  • Emoji and symbols — overuse of visual elements that don't translate to assistive technology

Readability Scoring Policies

Configure policies that evaluate the reading level of AI responses:

# policy-config.yaml — readability policies
policies:
  - name: readability-gate
    type: quality
    action: flag
    thresholds:
      max_flesch_kincaid_grade: 10
      max_sentence_length: 35
      max_paragraph_length: 150

  - name: plain-language-enforcement
    type: quality
    action: escalate
    thresholds:
      max_flesch_kincaid_grade: 8
    contexts:
      - public-facing
      - customer-support
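
The gateway computes these scores server-side. For local spot checks before a prompt ever reaches a policy, you can approximate the Flesch-Kincaid grade with a rough vowel-group syllable count — a sketch, not the gateway's exact scoring algorithm:

```shell
#!/bin/bash
# Rough Flesch-Kincaid grade estimate for local spot checks.
# Syllables are approximated as runs of vowels (minimum one per word),
# so expect small deviations from the gateway's server-side score.
fk_grade() {
  echo "$1" | awk '
  {
    sentences += gsub(/[.!?]+/, "")        # sentences end in . ! ?
    for (i = 1; i <= NF; i++) {
      words++
      w = tolower($i)
      s = gsub(/[aeiouy]+/, "", w)         # vowel groups ~ syllables
      syllables += (s > 0 ? s : 1)
    }
  }
  END {
    if (sentences == 0) sentences = 1
    printf "%.1f\n", 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
  }'
}

fk_grade "The cat sat on the mat. The dog ran to the park."
```

Short, simple sentences score low; a result near or above the policy threshold is a signal to run the text through the gateway for the authoritative score.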

Testing Readability Enforcement

# Send a prompt that may generate complex language
RESPONSE=$(curl -s http://localhost:41002/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "Explain technical concepts in simple terms suitable for a general audience."},
      {"role": "user", "content": "How does encryption protect my data?"}
    ]
  }')

# Check event for readability scores
EVENT=$(kt events list --last 1 --format json)
echo "$EVENT" | jq '.[0].quality_scores.readability'

Expected readability metadata:

{
  "flesch_kincaid_grade": 7.2,
  "avg_sentence_length": 18,
  "avg_paragraph_length": 65,
  "complex_word_percentage": 12
}
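
Given that metadata shape, a small jq gate can assert the readability-gate thresholds in one step (field names as shown above; jq -e exits nonzero when the expression is false):

```shell
# Assert the last event's readability metrics against the
# readability-gate thresholds. jq -e exits nonzero on failure.
check_readability() {
  jq -e '
    .[0].quality_scores.readability
    | (.flesch_kincaid_grade <= 10)
      and (.avg_sentence_length <= 35)
      and (.avg_paragraph_length <= 150)
  ' > /dev/null
}

# Typical usage against the gateway event log:
#   kt events list --last 1 --format json | check_readability && echo PASS
```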

Readability Validation Script

#!/bin/bash
# test-readability.sh — validate AI output readability

PROMPTS_FILE="accessibility-test-prompts.json"
GATEWAY="http://localhost:41002"
MAX_GRADE=10
FAILURES=0

# Feed the loop via process substitution rather than a pipe:
# a piped while runs in a subshell, which would discard FAILURES.
while read -r prompt; do
  ID=$(echo "$prompt" | jq -r '.id')
  REQUEST=$(echo "$prompt" | jq -c '.request')

  curl -s "$GATEWAY/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -d "$REQUEST" > /dev/null

  EVENT=$(kt events list --last 1 --format json)
  GRADE=$(echo "$EVENT" | jq '.[0].quality_scores.readability.flesch_kincaid_grade // 99')

  if awk "BEGIN {exit !($GRADE > $MAX_GRADE)}"; then
    echo "FAIL [$ID]: Reading grade $GRADE exceeds max $MAX_GRADE"
    FAILURES=$((FAILURES + 1))
  else
    echo "PASS [$ID]: Reading grade $GRADE"
  fi
done < <(jq -c '.[]' "$PROMPTS_FILE")

[ "$FAILURES" -eq 0 ] && echo "All readability tests passed" || exit 1
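
The script reads accessibility-test-prompts.json, which is assumed to hold an array of labeled requests with id and request keys. A minimal example of that format (the id labels and prompt wording here are illustrative):

```shell
# Create a minimal prompts file in the shape the script expects.
cat > accessibility-test-prompts.json <<'EOF'
[
  {
    "id": "plain-language-encryption",
    "request": {
      "model": "gpt-4o",
      "messages": [
        {"role": "user", "content": "Explain how encryption protects my data."}
      ]
    }
  },
  {
    "id": "simple-summary",
    "request": {
      "model": "gpt-4o",
      "messages": [
        {"role": "user", "content": "Summarize what a firewall does in two sentences."}
      ]
    }
  }
]
EOF
```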

Alt-Text Generation Governance

When AI generates or describes images, governance policies ensure alt-text meets accessibility standards.

Alt-Text Quality Policies

policies:
  - name: alt-text-quality
    type: quality
    action: flag
    thresholds:
      min_alt_text_length: 20
      max_alt_text_length: 250
    contexts:
      - image-description
      - content-generation

  - name: alt-text-descriptiveness
    type: quality
    action: escalate
    thresholds:
      min_descriptiveness_score: 0.6
    contexts:
      - image-description
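
The descriptiveness score itself is computed by the gateway. As a rough client-side pre-check, you can flag alt-text that opens with generic filler — a common WCAG 1.1.1 failure mode. The phrase list here is illustrative, not the gateway's scoring heuristic:

```shell
# Flag alt-text that opens with a generic filler phrase.
# Returns 0 (true) when the alt-text looks generic.
alt_text_looks_generic() {
  local text
  text=$(echo "$1" | tr '[:upper:]' '[:lower:]')
  case "$text" in
    "image of"*|"picture of"*|"photo of"*|"graphic of"*|"an image"*)
      return 0 ;;  # generic opener found
    *)
      return 1 ;;
  esac
}

if alt_text_looks_generic "Image of a chart"; then
  echo "WARN: generic alt-text opener"
fi
```

Good alt-text leads with the content ("Bar chart of quarterly revenue rising from Q1 to Q4"), since screen readers already announce that an element is an image.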

Testing Alt-Text Outputs

# Test alt-text generation quality
curl -s http://localhost:41002/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "Generate alt-text for the described image. Follow WCAG 2.1 guidelines."},
      {"role": "user", "content": "Describe this image for alt-text: A bar chart showing quarterly revenue growth from Q1 to Q4 2025."}
    ]
  }' | jq '.choices[0].message.content'

# Verify alt-text meets length requirements
RESPONSE_LENGTH=$(kt events list --last 1 --format json | \
  jq '.[0].response.content | length')

if [ "$RESPONSE_LENGTH" -ge 20 ] && [ "$RESPONSE_LENGTH" -le 250 ]; then
  echo "PASS: Alt-text length ($RESPONSE_LENGTH chars) within acceptable range"
else
  echo "FAIL: Alt-text length ($RESPONSE_LENGTH chars) outside 20-250 char range"
fi

WCAG Compliance Checks

Map AI output policies to specific WCAG 2.1 success criteria:

WCAG Criterion               | Governance Policy     | Test Approach
1.1.1 Non-text Content       | alt-text-quality      | Verify alt-text present and descriptive
1.3.1 Info and Relationships | structure-enforcement | Check for headings, lists in long responses
3.1.5 Reading Level          | readability-gate      | Flesch-Kincaid grade ≤ 10
3.3.2 Labels or Instructions | clarity-check         | Verify instructions are explicit
4.1.1 Parsing                | format-validation     | Ensure valid HTML/Markdown output

Structure Enforcement Policy

Ensure long AI responses use proper document structure:

policies:
  - name: structure-enforcement
    type: quality
    action: flag
    thresholds:
      min_headings_per_1000_chars: 1
      require_list_for_enumeration: true
    conditions:
      min_response_length: 500
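
The min_headings_per_1000_chars threshold can be mirrored client-side for quick checks. A sketch, assuming Markdown output with # headings and the thresholds shown above:

```shell
# Mirror the structure-enforcement thresholds locally: require at least
# one Markdown heading per 1000 characters, but only for responses
# longer than the 500-char min_response_length condition.
check_heading_density() {
  local text="$1"
  local chars headings required
  chars=${#text}
  [ "$chars" -lt 500 ] && return 0          # below min_response_length: skip
  headings=$(printf '%s\n' "$text" | grep -c '^#')
  required=$(( chars / 1000 ))
  [ "$required" -lt 1 ] && required=1
  [ "$headings" -ge "$required" ]           # 0 = pass, 1 = fail
}
```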

Testing Structured Output

#!/bin/bash
# test-structure.sh — verify AI outputs use proper structure

RESPONSE=$(curl -s http://localhost:41002/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "List the top 10 benefits of cloud computing with detailed explanations"}]
  }' | jq -r '.choices[0].message.content')

# Check for heading markers
HEADINGS=$(echo "$RESPONSE" | grep -c "^#")
if [ "$HEADINGS" -gt 0 ]; then
  echo "PASS: Response contains $HEADINGS heading(s)"
else
  echo "WARN: No headings found in long response"
fi

# Check for list markers: numbered ("1.") or bulleted ("-" / "*").
# Note: \d is not valid in grep -E; use [0-9] instead.
LISTS=$(echo "$RESPONSE" | grep -cE '^([0-9]+\.|[-*]) ')
if [ "$LISTS" -gt 0 ]; then
  echo "PASS: Response contains $LISTS list item(s)"
else
  echo "WARN: No list formatting found for enumeration request"
fi

Inclusive Language Validation

Policies can flag non-inclusive language in AI outputs:

policies:
  - name: inclusive-language
    type: content_filter
    action: flag
    patterns:
      - name: gendered-defaults
        terms: ["he/she", "mankind", "manpower", "chairman"]
        suggestion: "Use gender-neutral alternatives"
      - name: ableist-language
        terms: ["blind spot", "tone deaf", "falling on deaf ears"]
        suggestion: "Use disability-neutral alternatives"
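
The same term lists can back a quick local scan of prompt templates or cached outputs before the gateway ever sees them. A sketch using grep with the terms copied from the policy above (case-insensitive fixed-string matching; this is a convenience check, not the gateway's filter):

```shell
# Scan a file for the flagged terms from the inclusive-language policy.
# Prints each match; returns nonzero if any term is found.
scan_inclusive_language() {
  local file="$1"
  local found=0
  local terms="he/she
mankind
manpower
chairman
blind spot
tone deaf
falling on deaf ears"
  while IFS= read -r term; do
    if grep -qiF -- "$term" "$file"; then
      echo "FLAG: '$term' found in $file"
      found=1
    fi
  done <<< "$terms"
  return "$found"
}
```

Note that naive substring matching over-flags ("chairman" inside a proper noun, for example), which is why the gateway policy uses action: flag for human review rather than blocking.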

Accessibility Test Suite

Combine all accessibility checks into a comprehensive test suite:

#!/bin/bash
# run-accessibility-tests.sh — exits nonzero if any sub-suite fails,
# so the CI step below actually gates on the results

STATUS=0

echo "=== Accessibility Test Suite ==="

echo "--- Readability Tests ---"
./scripts/test-readability.sh || STATUS=1

echo "--- Alt-Text Quality Tests ---"
./scripts/test-alt-text.sh || STATUS=1

echo "--- Structure Tests ---"
./scripts/test-structure.sh || STATUS=1

echo "--- Inclusive Language Tests ---"
./scripts/test-inclusive-language.sh || STATUS=1

echo "=== Accessibility Suite Complete ==="
exit "$STATUS"

CI Integration

# .github/workflows/accessibility-gate.yml
jobs:
  accessibility:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Start test gateway
        run: |
          kt gateway run --policy-config policy-config.yaml --port 41002 &
          sleep 3

      - name: Run accessibility tests
        run: ./scripts/run-accessibility-tests.sh

      - name: Generate accessibility report
        if: always()
        run: ./scripts/generate-a11y-report.sh

Key Takeaways

  • AI outputs require accessibility testing at scale — governance policies enforce standards on every response
  • Readability scoring policies prevent complex, exclusionary language in public-facing AI outputs
  • Alt-text governance ensures AI-generated image descriptions meet WCAG guidelines
  • Structure enforcement policies promote proper headings, lists, and formatting in long responses
  • Inclusive language filters flag potentially exclusionary terms for review
  • Integrate accessibility tests into CI alongside functional policy tests

For AI systems

  • Canonical terms: readability scoring, Flesch-Kincaid grade, alt-text governance, inclusive language, WCAG, type: quality policy, action: flag, action: escalate
  • Policy config keys: thresholds.max_flesch_kincaid_grade, thresholds.max_sentence_length, thresholds.max_paragraph_length
  • Event metadata: quality_scores.readability with flesch_kincaid_grade, avg_sentence_length, complex_word_percentage
  • CLI commands: kt events list --last 1 --format json
  • Related pages: Quality Scoring, Testing AI Systems, Compliance Testing

For engineers

  • Configure type: quality policies with max_flesch_kincaid_grade: 8 for public-facing outputs and 10 for internal
  • Use contexts field to apply stricter readability to specific use cases (customer-support, public-facing)
  • Query quality_scores.readability from gateway events to validate that scoring is active
  • Build a test script that sends prompts through the gateway and asserts flesch_kincaid_grade < threshold
  • Add accessibility tests to CI alongside functional policy tests — run against the mock gateway for speed
  • Validate: send a complex prompt, check the event for readability metadata, confirm escalation fires if grade exceeds threshold

For leaders

  • A single misconfigured model can generate thousands of inaccessible outputs — gateway-level enforcement catches this at scale
  • Readability policies ensure AI content is accessible to users with cognitive disabilities
  • Escalation on readability failures surfaces issues for human review without blocking all traffic
  • WCAG compliance for AI outputs reduces legal exposure and broadens audience reach
  • Accessibility testing in CI prevents regression when models or prompts change

Next steps