Quality-Aware Retry
How It Works
The gateway detects quality issues in provider responses:
- Empty responses
- Truncated output
- Content-filtered responses
- Malformed JSON (when structured output is expected)
When detected, the gateway retries with the next provider in the compliant candidate set — never widening the security envelope.
Configuration
quality_retry:
enabled: true
max_retries: 2
retry_on:
- empty_response
- truncated_response
- content_filtered
- malformed_json
Limitations
- Streaming responses are never retried — quality cannot be assessed mid-stream
- Retries stay within the compliant candidate set — security constraints are never relaxed
- Max retries is capped — prevents infinite retry loops