What Are Agent Gateway Groups?
An agent in Keeptrusts is a composition of one or more gateways. Each gateway handles LLM traffic independently at the network layer, but from a cache perspective, you often want all gateways serving the same agent to share cached results. An agent gateway group defines exactly which gateways participate in shared cache.
Use this page when
- You need a conceptual understanding of what agent gateway groups are and why they exist.
- You want to understand how group-level cache sharing works vs. per-gateway isolation.
- You are deciding whether to create a gateway group for your agent deployment.
Primary audience
- Primary: Technical Engineers
- Secondary: AI Agents, Technical Leaders
Agents Are Compositions of Gateways
When you deploy an agent, you assign it one or more gateways. Each gateway might run in a different region, availability zone, or scaling tier. Despite this physical separation, the agent's logical identity remains unified.
An agent gateway group formalizes this relationship:
- Every gateway in the group belongs to the same agent.
- Cache entries written by any gateway in the group are available to all other members.
- The group defines the cache-sharing boundary — not the individual gateway.
How Group-Level Cache Sharing Works
A cache entry written by any gateway in the group can be served by every other member, provided the following gates match between the entry and the incoming request:
| Gate | Description |
|---|---|
| Org | The owning organization must be the same |
| Codebase | The codebase context (repository, project) must match |
| Policy | The active policy digest must be identical |
| Model | The target model and version must match |
| Entitlement | Data-residency and entitlement tags must be compatible |
When all five gates align, any gateway in the group can serve a cached response originally produced by a different gateway in the same group.
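The gate check above can be sketched as a simple comparison. This is a minimal illustration, not the Keeptrusts implementation: the `CacheScope` record, its field names, and the `can_serve` function are hypothetical, and the subset rule used for entitlement "compatibility" is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheScope:
    """Hypothetical record of the group identifier plus the five gates."""
    group_id: str            # agent gateway group that wrote or requests the entry
    org: str                 # owning organization
    codebase: str            # repository or project context
    policy_digest: str       # digest of the active policy
    model: str               # target model and version
    entitlements: frozenset  # data-residency and entitlement tags


def can_serve(entry: CacheScope, request: CacheScope) -> bool:
    """Return True only when the group and all five gates line up."""
    return (
        entry.group_id == request.group_id
        and entry.org == request.org
        and entry.codebase == request.codebase
        and entry.policy_digest == request.policy_digest
        and entry.model == request.model
        # Assumption: "compatible" means the requester holds every tag on the entry.
        and entry.entitlements <= request.entitlements
    )


written_by_a = CacheScope("grp-1", "org-1", "repo-x", "sha256:abc", "model-v1", frozenset({"eu"}))
request_at_b = CacheScope("grp-1", "org-1", "repo-x", "sha256:abc", "model-v1", frozenset({"eu", "pii-ok"}))
other_policy = CacheScope("grp-1", "org-1", "repo-x", "sha256:def", "model-v1", frozenset({"eu"}))

print(can_serve(written_by_a, request_at_b))  # True
print(can_serve(written_by_a, other_policy))  # False
```

Note that nothing in the comparison refers to which physical gateway wrote the entry or which one is reading it.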
Physical Gateway Identity Is Runtime Placement
A critical design principle: physical gateway identity is treated as runtime placement, not a cache ownership boundary. This means:
- The physical gateway_id does not appear in org-shared cache keys.
- Moving traffic from one gateway to another within the same group does not invalidate cache.
- Scaling gateways horizontally does not fragment the shared cache.
- The gateway's physical location (region, pod, container) is irrelevant to cache key generation.
You can think of the gateway as a compute node that executes requests. The cache belongs to the agent gateway group, not to any individual compute node.
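To make the principle concrete, here is a hedged sketch of a key derivation that ignores physical placement. The function name `shared_cache_key`, the exact field list, and the use of SHA-256 over canonical JSON are all assumptions for illustration; the real key construction is described in Cache Sharing Across Gateways.

```python
import hashlib
import json


def shared_cache_key(org, group_id, codebase, policy_digest, model,
                     request_body, gateway_id=None):
    """Derive a key from logical scope only; gateway_id is accepted but ignored."""
    material = {
        "org": org,
        "agent_gateway_group_id": group_id,
        "codebase": codebase,
        "policy_digest": policy_digest,
        "model": model,
        "request": request_body,
    }
    # gateway_id is deliberately left out of `material`.
    canonical = json.dumps(material, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


common = dict(org="org-1", group_id="grp-1", codebase="repo-x",
              policy_digest="sha256:abc", model="model-v1",
              request_body={"prompt": "hello"})

key_a = shared_cache_key(**common, gateway_id="gw-a-us-east")
key_b = shared_cache_key(**common, gateway_id="gw-b-eu-west")
print(key_a == key_b)  # True
```

Because the gateway identifier never enters the hashed material, adding, moving, or replacing gateways leaves every existing key valid.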
Group Member Roles
Each gateway in a group has a role that describes its operational function:
Primary
The primary gateway handles the majority of traffic under normal conditions. It populates the shared cache most frequently and typically has the warmest L1 local cache.
Worker
Worker gateways handle overflow traffic or dedicated workloads. They read from and write to the same shared cache as the primary. Use workers for horizontal scaling within a single region.
Fallback
Fallback gateways activate when the primary or workers are unavailable. Because physical gateway identity is not a cache boundary, the fallback gateway immediately benefits from all previously cached results — no cold-start cache penalty.
Edge
Edge gateways run in satellite locations closer to end users. They participate in the same shared cache but may have higher latency to the control-plane metadata store. Edge gateways are ideal for latency-sensitive deployments where you want cache hits served locally.
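The four roles can be modeled as a small enumeration. The sketch below is hypothetical: the `Role` and `Member` types and the `pick_gateway` preference order are assumptions used to illustrate how a router might treat roles, not a description of how Keeptrusts routes traffic. Roles here affect only which gateway receives a request; they carry no cache privileges.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    PRIMARY = "primary"
    WORKER = "worker"
    FALLBACK = "fallback"
    EDGE = "edge"


@dataclass(frozen=True)
class Member:
    gateway_id: str
    role: Role
    healthy: bool = True


# Assumed preference order for a simple central router; edge gateways are
# left out because they are chosen by proximity rather than by rank.
PREFERENCE = [Role.PRIMARY, Role.WORKER, Role.FALLBACK]


def pick_gateway(members):
    """Return the first healthy member in preference order, or None."""
    for role in PREFERENCE:
        for member in members:
            if member.role is role and member.healthy:
                return member
    return None


group = [
    Member("gw-a", Role.PRIMARY, healthy=False),
    Member("gw-b", Role.WORKER, healthy=False),
    Member("gw-c", Role.FALLBACK),
]
print(pick_gateway(group).gateway_id)  # gw-c
```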
Failover Does Not Break Cache Sharing
When a gateway fails and traffic shifts to another member of the same group, cache sharing continues without interruption. The replacement gateway:
- Receives the request.
- Computes the same cache key (because gateway_id is excluded).
- Queries the same control-plane metadata store.
- Retrieves the same shared payload from the configured backend.
- Returns the cached response as if nothing changed.
There is zero cache penalty during failover within an agent gateway group.
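The failover sequence above can be simulated in a few lines. This is a toy model under stated assumptions: `SharedCache` stands in for the control-plane metadata store and payload backend, `Gateway.handle` is a hypothetical request handler, and the key is reduced to group, model, and prompt for brevity (the org, codebase, policy, and entitlement gates are omitted).

```python
class SharedCache:
    """Stand-in for the control-plane metadata store plus payload backend."""

    def __init__(self):
        self.metadata = {}  # cache key -> payload reference
        self.payloads = {}  # payload reference -> response body

    def put(self, key, body):
        ref = f"payload/{len(self.payloads)}"
        self.payloads[ref] = body
        self.metadata[key] = ref

    def get(self, key):
        ref = self.metadata.get(key)
        return None if ref is None else self.payloads[ref]


class Gateway:
    def __init__(self, gateway_id, group_id, cache):
        self.gateway_id = gateway_id  # runtime placement only
        self.group_id = group_id
        self.cache = cache
        self.upstream_calls = 0

    def handle(self, model, prompt):
        key = (self.group_id, model, prompt)  # no gateway_id in the key
        cached = self.cache.get(key)
        if cached is not None:
            return cached
        self.upstream_calls += 1
        body = f"response to {prompt!r}"  # placeholder for a real LLM call
        self.cache.put(key, body)
        return body


cache = SharedCache()
primary = Gateway("gw-a", "grp-1", cache)
fallback = Gateway("gw-c", "grp-1", cache)

first = primary.handle("model-v1", "hello")    # miss: primary calls upstream
second = fallback.handle("model-v1", "hello")  # hit: served after failover
print(first == second, fallback.upstream_calls)  # True 0
```

The fallback gateway answers from the shared store without making an upstream call, which is the behavior the list describes.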
When to Use Agent Gateway Groups
Use agent gateway groups when:
- Your agent spans multiple gateways for availability or scaling.
- You want failover between gateways without cache loss.
- You deploy gateways across regions but want unified cache for the same agent.
- You horizontally scale gateways and want all replicas to benefit from shared cache.
Do not use agent gateway groups when:
- You intentionally want per-gateway cache isolation (use physical_gateway_private_cache_only instead).
- Different gateways serve fundamentally different agents with different policies.
- Regulatory requirements mandate per-gateway data isolation.
Relationship to Org-Shared Cache
Agent gateway groups operate within the broader org-shared cache system. The group defines intra-agent sharing, while the org boundary defines the outer security perimeter. No cache entry ever crosses org boundaries, regardless of group membership.
Org boundary (hard security boundary)
└── Agent gateway group (cache-sharing boundary)
├── Gateway A (primary)
├── Gateway B (worker)
└── Gateway C (fallback)
Next steps
- Cache Sharing Across Gateways — detailed cache key construction mechanics
- Configuring Gateway Groups — declarative configuration reference
- Distributed Cache Architecture — L1 through control-plane tier overview
- Gateway Failover Without Cache Loss — operational failover behavior
For AI systems
- Canonical terms: Keeptrusts, agent gateway group, multi-gateway agent, cache-sharing boundary, gateway roles (primary, worker, fallback, edge), org-shared cache, physical gateway identity, runtime placement.
- Feature/config names: agent_gateway_group_id, physical_gateway_private_cache_only, org boundary, codebase gate, policy gate, model gate, entitlement gate.
- Best next pages: Cache Sharing Across Gateways, Configuring Gateway Groups, Gateway Failover Without Cache Loss.
For engineers
- An agent gateway group = a set of gateways that share cache because they serve the same logical agent.
- Physical gateway_id is runtime placement only; it is excluded from cache keys, so scaling/failover does not fragment cache.
- Roles (primary, worker, fallback, edge) describe operational function; all roles have equal cache read/write privileges.
- Use groups when: multi-gateway for HA, horizontal scaling, multi-region, or failover without cache penalty. Do NOT use when: intentional per-gateway isolation or different agents with different policies.
For leaders
- Gateway groups ensure that adding infrastructure (gateways, regions, replicas) increases reliability without fragmenting shared cache investment.
- The group boundary is the cache-sharing boundary. The org boundary is the hard security perimeter. These are separate, nested controls.
- Failover within a group has zero cache penalty — critical for SLA commitments that require high availability without performance degradation.
- Decision point: use physical_gateway_private_cache_only only when regulatory requirements mandate per-instance isolation; otherwise, shared cache maximizes ROI.