
Controlling Direct Semantic Replay by Scope

Direct semantic replay can be controlled at four different scopes. When multiple scopes define a value, the most restrictive one wins. This gives you layered control — set a permissive org-wide default and tighten it for specific repos or agents.

Use this page when

  • You need to understand how org, repo, agent, and declarative config scopes interact for semantic replay.
  • You are troubleshooting why semantic replay is disabled despite enabling it at one scope.
  • You want to implement a gradual rollout of semantic replay across your organization.

Primary audience

  • Primary: AI Agents, Technical Engineers
  • Secondary: Technical Leaders

The Four Scopes

1. Org Settings

The broadest scope. Set via the console under Settings → Cache → Semantic Replay or via the API.

{
  "org_cache_settings": {
    "direct_semantic_replay_enabled": true,
    "similarity_threshold": 0.95
  }
}

This applies to all repos and agents in your organization unless overridden at a narrower scope.

2. Repo Settings

Per-repository overrides. Set via the console under Repos → [repo] → Cache Settings or via the API.

{
  "repo_cache_settings": {
    "direct_semantic_replay_enabled": true,
    "similarity_threshold": 0.93
  }
}

This applies to all requests from this specific repository, regardless of which agent handles them.

3. Agent Settings

Per-agent overrides. Set via the console under Agents → [agent] → Cache Settings or via the API.

{
  "agent_cache_settings": {
    "direct_semantic_replay_enabled": false
  }
}

This applies to all requests handled by this specific agent, regardless of which repo they originate from.

4. Declarative Config

Set in your policy YAML deployed to the gateway:

workflow_cache:
  direct_semantic_replay_enabled: true
  similarity_threshold: 0.95

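This is the baseline: it supplies the value when no other scope provides one. When other scopes do define a value, the declarative config still takes part in most-restrictive-wins resolution, so a false here cannot be overridden to true.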

Precedence: Most Restrictive Wins

When multiple scopes define direct_semantic_replay_enabled:

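Org      Repo     Agent    Config   Effective Value
true     true     true     true     true
true     true     false    true     false
true     false    true     true     false
false    true     true     true     false

When only some scopes define a value, the rule applies to the values that are present: two scopes set to true resolve to true, a single true resolves to true, and a single false resolves to false.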

A false at any scope disables semantic replay for that intersection. A true only takes effect if no scope says false.
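
The rule above can be sketched in a few lines of Python. This is an illustrative model only, not the gateway's implementation: the function name resolve_enabled is hypothetical, None stands for "this scope defines no value", and returning None when no scope defines anything is an assumption, since the table does not cover that case.

```python
from typing import Optional


def resolve_enabled(*scope_values: Optional[bool]) -> Optional[bool]:
    """Resolve direct_semantic_replay_enabled across scopes.

    Each argument is one scope's value; None means the scope is unset.
    Returns None when no scope defines a value (assumed behavior).
    """
    defined = [v for v in scope_values if v is not None]
    if not defined:
        return None
    # Most restrictive wins: a single False disables replay.
    return all(defined)


# Arguments are ordered org, repo, agent, config.
print(resolve_enabled(True, True, True, True))     # True
print(resolve_enabled(True, True, False, True))    # False
print(resolve_enabled(True, None, None, None))     # True
print(resolve_enabled(False, None, None, None))    # False
```

Because the result depends only on whether any defined value is false, the order of the scopes does not matter in this model.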

For similarity_threshold, the highest (most restrictive) value wins:

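Org      Repo     Agent    Config   Effective Threshold
0.95     0.92     0.98     0.95     0.98

When only two scopes define a threshold, the higher of the two applies: 0.95 and 0.90 resolve to 0.95, and 0.93 and 0.95 resolve to 0.95.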
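
As a minimal sketch, threshold resolution amounts to taking the maximum of the values that are defined. The function name resolve_threshold is hypothetical, None marks an unset scope, and returning None when every scope is unset is an assumption rather than documented behavior.

```python
from typing import Optional


def resolve_threshold(*scope_values: Optional[float]) -> Optional[float]:
    """Return the highest (most restrictive) similarity_threshold.

    Each argument is one scope's threshold; None means the scope is unset.
    """
    defined = [v for v in scope_values if v is not None]
    return max(defined) if defined else None


# Arguments are ordered org, repo, agent, config.
print(resolve_threshold(0.95, 0.92, 0.98, 0.95))   # 0.98
print(resolve_threshold(0.95, 0.90))               # 0.95
print(resolve_threshold(0.93, 0.95))               # 0.95
```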

Use Cases

Org-wide default on, specific repo off

Your organization uses semantic replay everywhere, but a repo with sensitive IP should not serve cached responses:

Org settings: direct_semantic_replay_enabled: true

Repo settings for proprietary-algo: direct_semantic_replay_enabled: false

All other repos benefit from semantic replay. The proprietary-algo repo always goes to the provider.

Org-wide default on, specific agent stricter

Your organization uses a 0.95 threshold, but the security auditor agent needs higher precision:

Org settings: similarity_threshold: 0.95

Agent settings for security-auditor: similarity_threshold: 0.98

The security auditor only gets cache hits when similarity exceeds 0.98. All other agents use 0.95.
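
To make the effect concrete, the sketch below models the hit decision for this use case. It assumes a strict comparison (similarity must exceed the threshold, as worded above); whether the gateway treats an exactly equal score as a hit is not stated here. The function name qualifies_for_replay is hypothetical.

```python
from typing import Optional


def qualifies_for_replay(
    similarity: float,
    org_threshold: float,
    agent_threshold: Optional[float] = None,
) -> bool:
    """Return True if a cached response is similar enough to replay."""
    thresholds = [org_threshold]
    if agent_threshold is not None:
        thresholds.append(agent_threshold)
    # The highest threshold among the defined scopes applies.
    effective = max(thresholds)
    return similarity > effective  # strict comparison is an assumption


# A 0.97 match is a hit for a general agent but not for security-auditor.
print(qualifies_for_replay(0.97, org_threshold=0.95))                        # True
print(qualifies_for_replay(0.97, org_threshold=0.95, agent_threshold=0.98))  # False
```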

Conservative org, permissive team

Your organization defaults to semantic replay off, but the platform team wants to enable it for their repos:

Org settings: direct_semantic_replay_enabled: false

Repo settings for api, cli, console: direct_semantic_replay_enabled: true

This does not work — because the org says false, the most restrictive value wins. To achieve this, set the org to true and disable it for repos that should not have it.
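
Turning an allowlist ("only these repos") into the equivalent set of false overrides can be scripted. The following is a sketch under stated assumptions: the function name plan_allowlist and the repo_overrides wrapper key are hypothetical, the repo names billing and docs are invented for the example, and the payload shape simply mirrors the settings objects shown on this page.

```python
from typing import Any, Dict, Iterable


def plan_allowlist(all_repos: Iterable[str], allowed: Iterable[str]) -> Dict[str, Any]:
    """Build settings that enable replay only for the allowed repos.

    The org is set to true, and every repo outside the allowlist gets an
    explicit false override, because a false cannot be overridden to true.
    """
    allowed_set = set(allowed)
    blocked = sorted(r for r in all_repos if r not in allowed_set)
    return {
        "org_cache_settings": {"direct_semantic_replay_enabled": True},
        # "repo_overrides" is a hypothetical container for this sketch.
        "repo_overrides": [
            {"repo_cache_settings": {"repo_id": r, "direct_semantic_replay_enabled": False}}
            for r in blocked
        ],
    }


plan = plan_allowlist(
    all_repos=["api", "cli", "console", "billing", "docs"],
    allowed=["api", "cli", "console"],
)
for override in plan["repo_overrides"]:
    print(override["repo_cache_settings"]["repo_id"])  # billing, then docs
```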

Gradual rollout

Enable semantic replay for one repo first, then expand. Do not start from an org setting of false: that blocks every repo, including the pilot.

  1. Org settings: direct_semantic_replay_enabled: true
  2. Repo settings for all repos except pilot: direct_semantic_replay_enabled: false
  3. Verify with the pilot repo, then remove per-repo overrides one at a time.
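
The rollout above can be planned ahead of time by listing which repos still carry a false override at each stage. This is a hypothetical planning helper, not part of any Keeptrusts tooling; the function name rollout_stages and the example repo names are assumptions, and the org setting is assumed to be true throughout.

```python
from typing import Iterator, List, Sequence


def rollout_stages(all_repos: Sequence[str], pilot: str) -> Iterator[List[str]]:
    """Yield, stage by stage, the repos that still have a false override.

    Stage 0 disables every repo except the pilot. Each later stage removes
    one override, in the order the repos were listed.
    """
    remaining = [r for r in all_repos if r != pilot]
    yield list(remaining)
    while remaining:
        remaining.pop(0)  # remove one per-repo override
        yield list(remaining)


for stage, disabled in enumerate(rollout_stages(["pilot", "api", "cli"], pilot="pilot")):
    print(stage, disabled)
# 0 ['api', 'cli']
# 1 ['cli']
# 2 []
```

The final stage has no overrides left, which matches the "enable everywhere" state where only the org-wide setting applies.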

Configuration Examples

Enable everywhere with org-wide settings

{
  "org_cache_settings": {
    "direct_semantic_replay_enabled": true,
    "similarity_threshold": 0.95
  }
}

No repo or agent overrides needed. All traffic uses semantic replay at 0.95.

Disable for a sensitive repo

{
  "repo_cache_settings": {
    "repo_id": "security-keys",
    "direct_semantic_replay_enabled": false
  }
}

Stricter threshold for a specific agent

{
  "agent_cache_settings": {
    "agent_id": "compliance-checker",
    "similarity_threshold": 0.99
  }
}

Disable everywhere via declarative config

workflow_cache:
  direct_semantic_replay_enabled: false

This is the simplest way to turn off semantic replay globally, because no scope can override a false to true. Setting false in org, repo, or agent settings has the same effect, limited to the traffic that scope covers.

Verifying Effective Settings

To check the effective semantic replay settings for a specific request:

  1. Look at the gateway response header x-keeptrusts-cache-policy which reports the resolved settings.
  2. Check the event log for the request — it includes cache_policy_resolved with the effective threshold and enabled state.
  3. Use the console Cache Settings page which shows the merged effective policy for any repo + agent combination.
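
For the first check, a small helper can pull the header out of a response regardless of how the client capitalizes header names. This is a sketch only: the function name cache_policy_header is hypothetical, and because the format of the header value is not described on this page, the helper returns it unparsed. The value in the example is a placeholder, not real gateway output.

```python
from typing import Mapping, Optional

CACHE_POLICY_HEADER = "x-keeptrusts-cache-policy"


def cache_policy_header(headers: Mapping[str, str]) -> Optional[str]:
    """Return the raw cache policy header value, or None if it is absent.

    HTTP header names are case-insensitive, so names are compared in
    lowercase. The value is returned as-is, without parsing.
    """
    for name, value in headers.items():
        if name.lower() == CACHE_POLICY_HEADER:
            return value
    return None


# Placeholder headers for illustration only.
response_headers = {
    "Content-Type": "application/json",
    "X-Keeptrusts-Cache-Policy": "<value reported by the gateway>",
}
print(cache_policy_header(response_headers))
```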

For AI systems

  • Canonical terms: Keeptrusts, semantic replay, scope precedence, most-restrictive-wins, direct_semantic_replay_enabled, similarity_threshold, org settings, repo settings, agent settings.
  • Config keys: org_cache_settings.direct_semantic_replay_enabled, repo_cache_settings.direct_semantic_replay_enabled, agent_cache_settings.direct_semantic_replay_enabled, workflow_cache.direct_semantic_replay_enabled.
  • Best next pages: Setting Semantic Replay Thresholds, Per-Agent Cache Policies, Declarative Config for Workflow Cache.

For engineers

  • Precedence rule: most restrictive wins. A false at any scope disables replay for that intersection.
  • For similarity_threshold, the highest (most restrictive) value across scopes wins.
  • Verify effective settings: check x-keeptrusts-cache-policy response header or event log cache_policy_resolved field.
  • Console: Cache Settings page shows merged effective policy for any repo + agent combination.
  • Gradual rollout: set org to true, then disable per-repo for all except your pilot repo. Remove overrides one at a time.

For leaders

  • Layered scopes give you org-wide defaults with surgical overrides for sensitive repos or agents.
  • Most-restrictive-wins ensures no scope can accidentally relax security controls set at a higher level.
  • Use agent-level strictness (e.g., 0.98 threshold) for compliance/security agents while keeping 0.95 for general use.
  • A false at org scope blocks all semantic replay globally — use this as a kill switch if needed.

Next steps