Controlling Direct Semantic Replay by Scope
Direct semantic replay can be controlled at four different scopes. When multiple scopes define a value, the most restrictive one wins. This gives you layered control — set a permissive org-wide default and tighten it for specific repos or agents.
Use this page when
- You need to understand how org, repo, agent, and declarative config scopes interact for semantic replay.
- You are troubleshooting why semantic replay is disabled despite enabling it at one scope.
- You want to implement a gradual rollout of semantic replay across your organization.
Primary audience
- Primary: AI Agents, Technical Engineers
- Secondary: Technical Leaders
The Four Scopes
1. Org Settings
The broadest scope. Set via the console under Settings → Cache → Semantic Replay or via the API.
{
  "org_cache_settings": {
    "direct_semantic_replay_enabled": true,
    "similarity_threshold": 0.95
  }
}
This applies to all repos and agents in your organization unless overridden at a narrower scope.
2. Repo Settings
Per-repository overrides. Set via the console under Repos → [repo] → Cache Settings or via the API.
{
  "repo_cache_settings": {
    "direct_semantic_replay_enabled": true,
    "similarity_threshold": 0.93
  }
}
This applies to all requests from this specific repository, regardless of which agent handles them.
3. Agent Settings
Per-agent overrides. Set via the console under Agents → [agent] → Cache Settings or via the API.
{
  "agent_cache_settings": {
    "direct_semantic_replay_enabled": false
  }
}
This applies to all requests handled by this specific agent, regardless of which repo they originate from.
4. Declarative Config
Set in your policy YAML deployed to the gateway:
workflow_cache:
  direct_semantic_replay_enabled: true
  similarity_threshold: 0.95
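This is the baseline: it supplies the value when no other scope defines one. When other scopes do define a value, the config still takes part in most-restrictive-wins resolution, so a false or a higher threshold set here cannot be relaxed at another scope.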
Precedence: Most Restrictive Wins
When multiple scopes define direct_semantic_replay_enabled:
| Org | Repo | Agent | Config | Effective Value |
|---|---|---|---|---|
| true | true | true | true | true |
| true | true | false | true | false |
| true | false | true | true | false |
| false | true | true | true | false |
| true | — | — | true | true |
| — | — | — | true | true |
| — | — | — | false | false |
A false at any scope disables semantic replay for that intersection. A true only takes effect if no scope says false.
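The boolean rule above can be sketched in a few lines of Python. This is an illustrative model of the precedence table, not the gateway's actual resolver; the function name `resolve_enabled` and the use of `None` for an unset scope are assumptions made for the example. The table does not say what happens when no scope defines the value at all, so the sketch returns `None` in that case rather than guessing.

```python
from typing import Optional


def resolve_enabled(
    org: Optional[bool] = None,
    repo: Optional[bool] = None,
    agent: Optional[bool] = None,
    config: Optional[bool] = None,
) -> Optional[bool]:
    """Model most-restrictive-wins for direct_semantic_replay_enabled.

    None stands for a scope that does not define the value (the empty
    cells in the table). Hypothetical helper, not part of any SDK.
    """
    defined = [v for v in (org, repo, agent, config) if v is not None]
    if not defined:
        return None  # behavior with no value at any scope is not documented here
    # One False at any scope disables replay; True requires every
    # scope that defines the value to say True.
    return all(defined)


# Rows taken from the precedence table
print(resolve_enabled(org=True, repo=True, agent=False, config=True))  # False
print(resolve_enabled(org=True, config=True))                          # True
print(resolve_enabled(config=False))                                   # False
```

Modeling unset scopes as `None` keeps "not defined" distinct from an explicit false, which is the distinction the table relies on.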
For similarity_threshold, the highest (most restrictive) value wins:
| Org | Repo | Agent | Config | Effective Threshold |
|---|---|---|---|---|
| 0.95 | 0.92 | 0.98 | 0.95 | 0.98 |
| 0.95 | — | — | 0.90 | 0.95 |
| — | 0.93 | — | 0.95 | 0.95 |
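The threshold rule reduces to taking the maximum over the scopes that define a value. The sketch below is a hedged model of the table, assuming `None` marks an unset scope; `resolve_threshold` is a hypothetical name, and the result when no scope defines a threshold is left as `None` because this page does not specify it.

```python
from typing import Optional


def resolve_threshold(
    org: Optional[float] = None,
    repo: Optional[float] = None,
    agent: Optional[float] = None,
    config: Optional[float] = None,
) -> Optional[float]:
    """Model highest-value-wins for similarity_threshold.

    None stands for a scope that does not define a threshold.
    Hypothetical helper, not part of any SDK.
    """
    defined = [v for v in (org, repo, agent, config) if v is not None]
    # The highest threshold is the most restrictive, so it wins.
    return max(defined) if defined else None


# Rows taken from the threshold table
print(resolve_threshold(org=0.95, repo=0.92, agent=0.98, config=0.95))  # 0.98
print(resolve_threshold(org=0.95, config=0.90))                         # 0.95
print(resolve_threshold(repo=0.93, config=0.95))                        # 0.95
```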
Use Cases
Org-wide default on, specific repo off
Your organization uses semantic replay everywhere, but a repo with sensitive IP should not serve cached responses:
Org settings: direct_semantic_replay_enabled: true
Repo settings for proprietary-algo: direct_semantic_replay_enabled: false
All other repos benefit from semantic replay. The proprietary-algo repo always goes to the provider.
Org-wide default on, specific agent stricter
Your organization uses a 0.95 threshold, but the security auditor agent needs higher precision:
Org settings: similarity_threshold: 0.95
Agent settings for security-auditor: similarity_threshold: 0.98
The security auditor only gets cache hits when similarity exceeds 0.98. All other agents use 0.95.
Conservative org, permissive team
Your organization defaults to semantic replay off, but the platform team wants to enable it for their repos:
Org settings: direct_semantic_replay_enabled: false
Repo settings for api, cli, console: direct_semantic_replay_enabled: true
This does not work. Because the org scope says false, the most restrictive value wins and the repo-level true has no effect. To get the intended result, set the org to true and set false on the repos that should not use semantic replay.
Gradual rollout
Enable semantic replay for one repo first, then expand:
- Do not start from org settings direct_semantic_replay_enabled: false. That blocks all repos, including the pilot. Set the org to true instead.
- Org settings: direct_semantic_replay_enabled: true
- Repo settings for all repos except pilot: direct_semantic_replay_enabled: false
- Verify with the pilot repo, then remove per-repo overrides one at a time.
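The rollout steps above can be sketched as a small script that builds the per-repo override bodies for every repo except the pilot. The payload shape follows the "Disable for a sensitive repo" example on this page; the function name `rollout_overrides` and the repo name `billing` are hypothetical. How the bodies are submitted (console or API call) is outside this sketch.

```python
import json
from typing import Dict, Iterable, List


def rollout_overrides(all_repos: Iterable[str], pilot_repos: Iterable[str]) -> List[Dict]:
    """Build one disabling override per non-pilot repo.

    Assumes the org scope is already set to true, so only the
    non-pilot repos need an explicit false.
    """
    pilots = set(pilot_repos)
    return [
        {
            "repo_cache_settings": {
                "repo_id": repo,
                "direct_semantic_replay_enabled": False,
            }
        }
        for repo in sorted(set(all_repos) - pilots)
    ]


# "api" is the pilot; every other repo gets an explicit false.
overrides = rollout_overrides(["api", "cli", "console", "billing"], pilot_repos=["api"])
print(json.dumps(overrides, indent=2))
```

To expand the rollout, add the next repo to `pilot_repos` and remove its override; the remaining list shrinks by one each round.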
Configuration Examples
Enable everywhere with org-wide settings
{
  "org_cache_settings": {
    "direct_semantic_replay_enabled": true,
    "similarity_threshold": 0.95
  }
}
No repo or agent overrides needed. All traffic uses semantic replay at 0.95.
Disable for a sensitive repo
{
  "repo_cache_settings": {
    "repo_id": "security-keys",
    "direct_semantic_replay_enabled": false
  }
}
Stricter threshold for a specific agent
{
  "agent_cache_settings": {
    "agent_id": "compliance-checker",
    "similarity_threshold": 0.99
  }
}
Disable everywhere via declarative config
workflow_cache:
  direct_semantic_replay_enabled: false
This is the simplest way to turn off semantic replay globally, because no scope can override a false to true. A false in org, repo, or agent settings has the same effect, limited to the traffic that scope covers.
Verifying Effective Settings
To check the effective semantic replay settings for a specific request:
- Look at the gateway response header x-keeptrusts-cache-policy, which reports the resolved settings.
- Check the event log for the request. It includes cache_policy_resolved with the effective threshold and enabled state.
- Use the console Cache Settings page, which shows the merged effective policy for any repo + agent combination.
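For the header check, a minimal sketch is shown below. It only extracts the raw header value from a response's headers; the value format is not specified on this page, so the sketch does not try to parse it. The helper name `cache_policy_header` and the sample value are placeholders, not real gateway output.

```python
from typing import Mapping, Optional

CACHE_POLICY_HEADER = "x-keeptrusts-cache-policy"


def cache_policy_header(headers: Mapping[str, str]) -> Optional[str]:
    """Return the raw cache-policy header value, or None if it is absent.

    HTTP header names are case-insensitive, so names are compared
    in lowercase.
    """
    for name, value in headers.items():
        if name.lower() == CACHE_POLICY_HEADER:
            return value
    return None


# Placeholder headers; the real value format is an unknown here.
sample = {
    "Content-Type": "application/json",
    "X-Keeptrusts-Cache-Policy": "<resolved settings>",
}
print(cache_policy_header(sample))
```

Pass in the headers mapping from whichever HTTP client you use; most clients expose one on the response object.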
For AI systems
- Canonical terms: Keeptrusts, semantic replay, scope precedence, most-restrictive-wins, direct_semantic_replay_enabled, similarity_threshold, org settings, repo settings, agent settings.
- Config keys: org_cache_settings.direct_semantic_replay_enabled, repo_cache_settings.direct_semantic_replay_enabled, agent_cache_settings.direct_semantic_replay_enabled, workflow_cache.direct_semantic_replay_enabled.
- Best next pages: Setting Semantic Replay Thresholds, Per-Agent Cache Policies, Declarative Config for Workflow Cache.
For engineers
- Precedence rule: most restrictive wins. A false at any scope disables replay for that intersection.
- For similarity_threshold, the highest (most restrictive) value across scopes wins.
- Verify effective settings: check the x-keeptrusts-cache-policy response header or the event log cache_policy_resolved field.
- Console: the Cache Settings page shows the merged effective policy for any repo + agent combination.
- Gradual rollout: set org to true, then disable per-repo for all except your pilot repo. Remove overrides one at a time.
For leaders
- Layered scopes give you org-wide defaults with surgical overrides for sensitive repos or agents.
- Most-restrictive-wins ensures no scope can accidentally relax security controls set at a higher level.
- Use agent-level strictness (e.g., 0.98 threshold) for compliance/security agents while keeping 0.95 for general use.
- A false at org scope blocks all semantic replay globally; use this as a kill switch if needed.
Next steps
- Setting Semantic Replay Thresholds — tuning the similarity value
- Per-Agent Cache Policies — agent-level overrides
- Declarative Config for Workflow Cache — baseline config reference