Agent Intermediate Artifacts: Reusing Agent Work
AI agents perform multi-step analysis when answering complex questions. They read files, run tools, reason through problems, and produce intermediate results before delivering a final answer. The agent_intermediate artifact caches these intermediate work products so the next agent request — from any engineer in your organization — reuses them instead of repeating the same work.
Use this page when
- You want to understand how multi-step agent analysis results are cached and shared across your organization.
- You need to reduce agent execution cost by reusing intermediate reasoning, tool outputs, and context assembly from prior requests.
- You are evaluating the cost impact of agent_intermediate artifacts at scale (100+ engineers).
Primary audience
- Primary: AI Agents, Technical Engineers
- Secondary: Technical Leaders
What agent_intermediate artifacts capture
Each agent_intermediate artifact records:
- Analysis step output — the result of a specific reasoning or computation step
- Tool call results — outputs from tools the agent invoked during analysis
- Reasoning chains — structured reasoning that led to intermediate conclusions
- Partial computations — incomplete but reusable analysis that applies to future requests
- Context assembly — the gathered context an agent assembled before answering
How multi-step analysis generates intermediates
When you ask your AI a complex question, the agent often performs multiple steps:
1. Identify relevant files and modules
2. Read and analyze source code
3. Run diagnostic tools (type checking, linting, dependency analysis)
4. Reason about the relationships between components
5. Synthesize a final answer
Each step produces intermediate results. Without caching, these results vanish after the agent responds. The next engineer who asks a similar question triggers the entire process again.
With agent_intermediate caching, steps 1-4 produce cached artifacts. The next request that needs the same intermediate results retrieves them from the cache and jumps directly to step 5.
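The skip-and-reuse behavior can be sketched with a cache keyed by step name plus a digest of the source being analyzed. This is an illustrative sketch under assumed names, not the gateway's actual API:

```typescript
// Hypothetical sketch: reuse cached intermediates, run only the missing steps.
type StepResult = { step: string; output: string };

// Org-shared cache keyed by step name + a digest of the analyzed source.
const cache = new Map<string, StepResult>();

function cacheKey(step: string, sourceDigest: string): string {
  return `${step}@${sourceDigest}`;
}

// Stand-in for real work (file reads, tool calls, reasoning).
function runStep(step: string): StepResult {
  return { step, output: `result of ${step}` };
}

// Runs a multi-step analysis, skipping any step already in the cache.
function analyze(
  steps: string[],
  sourceDigest: string
): { results: StepResult[]; executed: number } {
  let executed = 0;
  const results = steps.map((step) => {
    const key = cacheKey(step, sourceDigest);
    let hit = cache.get(key);
    if (!hit) {
      hit = runStep(step);
      cache.set(key, hit);
      executed++;
    }
    return hit;
  });
  return { results, executed };
}
```

On the first request every step executes; a second request over the same source digest executes zero steps and jumps straight to synthesis.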
Dramatic reduction in agent execution cost
Agent execution cost scales with the number of steps an agent performs. Each tool call, file read, and reasoning step consumes tokens and time. For complex analysis across a large codebase, a single request might involve dozens of intermediate steps.
When intermediates are cached and shared:
- Subsequent requests skip completed steps — if the analysis was already done, the agent reuses it
- Cross-engineer reuse — when one engineer's request generates intermediates, every other engineer benefits
- Compounding savings — the more engineers use AI on the same codebase, the more intermediates accumulate
For a 100+ engineer team, this means the first few requests of the day are expensive, but every subsequent similar request is dramatically cheaper.
Sharing intermediates across the organization
The org-shared cache makes intermediates available to every engineer. This creates a network effect:
- Morning: Engineer A asks about the auth module. The agent reads 15 files, runs type analysis, and maps dependencies. These intermediates are cached.
- Midday: Engineer B asks about session handling in the same auth module. The agent retrieves cached file analyses and dependency maps, performing only the session-specific reasoning.
- Afternoon: Engineer C asks about auth module test coverage. The agent retrieves cached file summaries and immediately focuses on test analysis.
Each subsequent request costs a fraction of the first because the shared intermediates eliminate redundant work.
Types of cached intermediates
The cache stores several categories of intermediate artifacts:
Analysis intermediates
When an agent analyzes a module's architecture, it produces structured understanding of component relationships, data flow, and responsibility boundaries. This analysis applies to many future questions about the same module.
Tool output intermediates
When an agent runs tools like type checkers, linters, or dependency analyzers, the outputs are deterministic for a given source state. These outputs are cached and reused until the source changes.
Context assembly intermediates
Before answering a question, an agent assembles relevant context from across the codebase. This assembly — which files are relevant, what their roles are, how they connect — is reusable for related questions.
Reasoning chain intermediates
When an agent reasons through a complex problem, the structured reasoning steps are cached. Future requests that require the same reasoning path reuse the cached chain instead of re-deriving it.
Intermediate artifact structure
A typical agent_intermediate artifact contains:
Artifact: agent_intermediate
Repository: your-org/your-service
Scope: src/payments/
Generated: 2026-04-30T11:05:00Z
Step: dependency_analysis
Result:
  Module: src/payments/processor.ts
  Direct Dependencies: [stripe-sdk, src/auth/tokens, src/db/transactions]
  Transitive Dependencies: [src/config/secrets, src/logging/structured]
  Dependency Count: 3 direct, 2 transitive
  Circular Dependencies: none
Context:
  Triggered By: "How does the payment processor handle retries?"
  Reusable For: Any question about payment module dependencies,
    integration points, or refactoring impact
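The example artifact can be modeled as a typed record. The field names below mirror the example but are illustrative, not a published schema:

```typescript
// Illustrative shape for an agent_intermediate artifact (not a published schema).
interface AgentIntermediate {
  artifact: "agent_intermediate";
  repository: string;            // e.g. "your-org/your-service"
  scope: string;                 // path the analysis covers
  generated: string;             // ISO 8601 timestamp
  step: string;                  // e.g. "dependency_analysis"
  result: Record<string, unknown>;
  context: {
    triggeredBy: string;         // the question that produced this intermediate
    reusableFor: string;         // what future questions can reuse it
  };
}

const example: AgentIntermediate = {
  artifact: "agent_intermediate",
  repository: "your-org/your-service",
  scope: "src/payments/",
  generated: "2026-04-30T11:05:00Z",
  step: "dependency_analysis",
  result: {
    module: "src/payments/processor.ts",
    directDependencies: ["stripe-sdk", "src/auth/tokens", "src/db/transactions"],
    circularDependencies: [],
  },
  context: {
    triggeredBy: "How does the payment processor handle retries?",
    reusableFor: "Any question about payment module dependencies",
  },
};
```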
Freshness and invalidation
Intermediate artifacts are invalidated when the source code they analyze changes. The cache tracks which files and commits each intermediate depends on. When those files change:
- Full invalidation — if the core analyzed files change, the intermediate is discarded
- Partial reuse — if peripheral files change, the intermediate may still be valid for its specific analysis scope
This ensures you never get stale analysis while maximizing reuse of still-valid intermediates.
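A sketch of the full-versus-partial decision, assuming each intermediate records which files are core to its analysis and which are peripheral (names are hypothetical):

```typescript
// Hypothetical sketch: digest-based invalidation with full vs. partial reuse.
interface Intermediate {
  coreFiles: string[];        // files the analysis directly depends on
  peripheralFiles: string[];  // files only loosely related to the scope
}

type ChangeSet = Set<string>; // files changed since the artifact was generated

// "discard": a core input changed, so the analysis is stale.
// "reuse":   only peripheral files changed; still valid for its scope.
// "fresh":   nothing it depends on changed.
function checkFreshness(
  artifact: Intermediate,
  changed: ChangeSet
): "discard" | "reuse" | "fresh" {
  if (artifact.coreFiles.some((f) => changed.has(f))) return "discard";
  if (artifact.peripheralFiles.some((f) => changed.has(f))) return "reuse";
  return "fresh";
}
```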
Cost impact at scale
For a 100+ engineer organization, the cost savings from intermediate reuse compound significantly:
| Metric | Without intermediates | With intermediates |
|---|---|---|
| Steps per complex request | 8-15 | 2-4 |
| Token consumption | Full | 30-50% of full |
| Response latency | 10-30 seconds | 3-8 seconds |
| Daily org token spend | Baseline | 40-60% reduction |
These savings grow as more engineers use AI on the same codebase, because the shared cache accumulates more reusable intermediates.
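A back-of-envelope amortization using the table's mid-range figures (illustrative numbers only; the request mix and per-request cost are assumptions):

```typescript
// Illustrative arithmetic: first request on a topic pays full cost,
// overlapping follow-ups pay roughly 40% of it (mid-range of the table).
function dailySpend(
  requests: number,
  uniqueTopics: number,
  fullCost: number,
  reusePercent = 40
): number {
  const firstRuns = uniqueTopics * fullCost;
  const reuses = ((requests - uniqueTopics) * fullCost * reusePercent) / 100;
  return firstRuns + reuses;
}

// 100 engineers, 500 requests/day, 50 distinct topics, 10k tokens per full run:
//   without caching: 500 * 10_000               = 5,000,000 tokens
//   with caching:    50 * 10_000 + 450 * 4_000  = 2,300,000 tokens (~54% reduction)
```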
Connecting to other fabric artifacts
Agent intermediates complement other Codebase Context Fabric artifacts:
- Deterministic tool results cache tool outputs separately from agent reasoning
- File summaries provide pre-computed understanding that reduces the need for agent file reading
- Dependency graphs eliminate agent dependency-tracing steps entirely
Together, these artifacts mean agents start with rich pre-computed context and only perform novel reasoning specific to each request.
For AI systems
- Canonical terms: Keeptrusts, Codebase Context Fabric, agent_intermediate artifact, multi-step analysis, org-shared cache, intermediate reuse, agent execution cost, reasoning chain, context assembly.
- Feature/config names: agent_intermediate artifact type, org-shared cache, source digest-based invalidation, partial reuse, analysis intermediates, tool output intermediates, context assembly intermediates, reasoning chain intermediates.
- Best next pages: Deterministic Tool Results, Artifact Freshness, Fabric Provenance.
For engineers
- Agent intermediates are cached automatically when the gateway processes multi-step agent requests. No additional configuration is needed beyond enabling org-shared cache.
- Intermediates are invalidated when source files they depend on change (source digest tracking). Peripheral file changes do not invalidate if the analysis scope is unaffected.
- Validate: After one engineer asks about a module, have a second engineer ask a related question and observe the reduction in agent steps (visible in response metadata or gateway logs).
- Token savings: expect 50-70% token reduction for subsequent complex requests that overlap with previously computed intermediates.
For leaders
- Agent intermediates convert the most expensive AI requests (multi-step complex analysis) into cost-amortized shared assets.
- At 100+ engineers: the first few requests per day are expensive, but every subsequent similar request costs 30-50% of baseline — compounding savings as team size grows.
- Network effect: cross-team reuse means debugging work by one team immediately benefits all other teams working on related code.
- No additional infrastructure cost — intermediates use the same org-shared cache infrastructure as other fabric artifacts.
Next steps
- Learn how deterministic tool results cache repeatable tool output
- Explore how artifact freshness determines when intermediates need regeneration
- Understand how fabric provenance tracks what data each intermediate used