Agent Intermediate Artifacts: Reusing Agent Work
AI agents perform multi-step analysis when answering complex questions. They read files, run tools, reason through problems, and produce intermediate results before delivering a final answer. The agent_intermediate artifact caches these intermediate work products so the next agent request — from any engineer in your organization — reuses them instead of repeating the same work.
Use this page when
- You want to understand how multi-step agent analysis results are cached and shared across your organization.
- You need to reduce agent execution cost by reusing intermediate reasoning, tool outputs, and context assembly from prior requests.
- You are evaluating the cost impact of agent_intermediate artifacts at scale (100+ engineers).
Primary audience
- Primary: AI Agents, Technical Engineers
- Secondary: Technical Leaders
What agent_intermediate artifacts capture
Each agent_intermediate artifact records:
- Analysis step output — the result of a specific reasoning or computation step
- Tool call results — outputs from tools the agent invoked during analysis
- Reasoning chains — structured reasoning that led to intermediate conclusions
- Partial computations — incomplete but reusable analysis that applies to future requests
- Context assembly — the gathered context an agent assembled before answering
How multi-step analysis generates intermediates
When you ask your AI a complex question, the agent often performs multiple steps:
1. Identify relevant files and modules
2. Read and analyze source code
3. Run diagnostic tools (type checking, linting, dependency analysis)
4. Reason about the relationships between components
5. Synthesize a final answer
Each step produces intermediate results. Without caching, these results vanish after the agent responds. The next engineer who asks a similar question triggers the entire process again.
With agent_intermediate caching, steps 1-4 produce cached artifacts. The next request that needs the same intermediate results retrieves them from the cache and jumps directly to step 5.
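The skip-and-reuse behavior can be sketched with a cache keyed by step name plus a digest of the source being analyzed. This is an illustrative sketch under assumed names, not the gateway's actual API:

```typescript
// Hypothetical sketch: reuse cached intermediates, run only the missing steps.
type StepResult = { step: string; output: string };

// Org-shared cache keyed by step name + a digest of the analyzed source.
const cache = new Map<string, StepResult>();

function cacheKey(step: string, sourceDigest: string): string {
  return `${step}@${sourceDigest}`;
}

// Stand-in for real work (file reads, tool calls, reasoning).
function runStep(step: string): StepResult {
  return { step, output: `result of ${step}` };
}

// Runs a multi-step analysis, skipping any step already in the cache.
function analyze(
  steps: string[],
  sourceDigest: string
): { results: StepResult[]; executed: number } {
  let executed = 0;
  const results = steps.map((step) => {
    const key = cacheKey(step, sourceDigest);
    let hit = cache.get(key);
    if (!hit) {
      hit = runStep(step);
      cache.set(key, hit);
      executed++;
    }
    return hit;
  });
  return { results, executed };
}
```

On the first request every step executes; a second request over the same source digest executes zero steps and jumps straight to synthesis.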
Dramatic reduction in agent execution cost
Agent execution cost scales with the number of steps an agent performs. Each tool call, file read, and reasoning step consumes tokens and time. For complex analysis across a large codebase, a single request might involve dozens of intermediate steps.
When intermediates are cached and shared:
- Subsequent requests skip completed steps — if the analysis was already done, the agent reuses it
- Cross-engineer reuse — when one engineer's request generates intermediates, every other engineer benefits
- Compounding savings — the more engineers use AI on the same codebase, the more intermediates accumulate
For a 100+ engineer team, this means the first few requests of the day are expensive, but every subsequent similar request is dramatically cheaper.
Sharing intermediates across the organization
The org-shared cache makes intermediates available to every engineer. This creates a network effect:
- Morning: Engineer A asks about the auth module. The agent reads 15 files, runs type analysis, and maps dependencies. These intermediates are cached.
- Midday: Engineer B asks about session handling in the same auth module. The agent retrieves cached file analyses and dependency maps, performing only the session-specific reasoning.
- Afternoon: Engineer C asks about auth module test coverage. The agent retrieves cached file summaries and immediately focuses on test analysis.
Each subsequent request costs a fraction of the first because the shared intermediates eliminate redundant work.
Types of cached intermediates
The cache stores several categories of intermediate artifacts:
Analysis intermediates
When an agent analyzes a module's architecture, it produces structured understanding of component relationships, data flow, and responsibility boundaries. This analysis applies to many future questions about the same module.
Tool output intermediates
When an agent runs tools like type checkers, linters, or dependency analyzers, the outputs are deterministic for a given source state. These outputs are cached and reused until the source changes.
Context assembly intermediates
Before answering a question, an agent assembles relevant context from across the codebase. This assembly — which files are relevant, what their roles are, how they connect — is reusable for related questions.
Reasoning chain intermediates
When an agent reasons through a complex problem, the structured reasoning steps are cached. Future requests that require the same reasoning path reuse the cached chain instead of re-deriving it.
Intermediate artifact structure
A typical agent_intermediate artifact contains:
Artifact: agent_intermediate
Repository: your-org/your-service
Scope: src/payments/
Generated: 2026-04-30T11:05:00Z
Step: dependency_analysis
Result:
  Module: src/payments/processor.ts
  Direct Dependencies: [stripe-sdk, src/auth/tokens, src/db/transactions]
  Transitive Dependencies: [src/config/secrets, src/logging/structured]
  Dependency Count: 3 direct, 2 transitive
  Circular Dependencies: none
Context:
  Triggered By: "How does the payment processor handle retries?"
  Reusable For: Any question about payment module dependencies,
    integration points, or refactoring impact
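The example artifact can be modeled as a typed record. The field names below mirror the example but are illustrative, not a published schema:

```typescript
// Illustrative shape for an agent_intermediate artifact (not a published schema).
interface AgentIntermediate {
  artifact: "agent_intermediate";
  repository: string;            // e.g. "your-org/your-service"
  scope: string;                 // path the analysis covers
  generated: string;             // ISO 8601 timestamp
  step: string;                  // e.g. "dependency_analysis"
  result: Record<string, unknown>;
  context: {
    triggeredBy: string;         // the question that produced this intermediate
    reusableFor: string;         // what future questions can reuse it
  };
}

const example: AgentIntermediate = {
  artifact: "agent_intermediate",
  repository: "your-org/your-service",
  scope: "src/payments/",
  generated: "2026-04-30T11:05:00Z",
  step: "dependency_analysis",
  result: {
    module: "src/payments/processor.ts",
    directDependencies: ["stripe-sdk", "src/auth/tokens", "src/db/transactions"],
    circularDependencies: [],
  },
  context: {
    triggeredBy: "How does the payment processor handle retries?",
    reusableFor: "Any question about payment module dependencies",
  },
};
```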
Freshness and invalidation
Intermediate artifacts are invalidated when the source code they analyze changes. The cache tracks which files and commits each intermediate depends on. When those files change:
- Full invalidation — if the core analyzed files change, the intermediate is discarded
- Partial reuse — if peripheral files change, the intermediate may still be valid for its specific analysis scope
This ensures you never get stale analysis while maximizing reuse of still-valid intermediates.
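A sketch of the full-versus-partial decision, assuming each intermediate records which files are core to its analysis and which are peripheral (names are hypothetical):

```typescript
// Hypothetical sketch: digest-based invalidation with full vs. partial reuse.
interface Intermediate {
  coreFiles: string[];        // files the analysis directly depends on
  peripheralFiles: string[];  // files only loosely related to the scope
}

type ChangeSet = Set<string>; // files changed since the artifact was generated

// "discard": a core input changed, so the analysis is stale.
// "reuse":   only peripheral files changed; still valid for its scope.
// "fresh":   nothing it depends on changed.
function checkFreshness(
  artifact: Intermediate,
  changed: ChangeSet
): "discard" | "reuse" | "fresh" {
  if (artifact.coreFiles.some((f) => changed.has(f))) return "discard";
  if (artifact.peripheralFiles.some((f) => changed.has(f))) return "reuse";
  return "fresh";
}
```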
Cost impact at scale
For a 100+ engineer organization, the cost savings from intermediate reuse compound significantly:
| Metric | Without intermediates | With intermediates |
|---|---|---|
| Steps per complex request | 8-15 | 2-4 |
| Token consumption | Full | 30-50% of full |
| Response latency | 10-30 seconds | 3-8 seconds |
| Daily org token spend | Baseline | 40-60% reduction |
These savings grow as more engineers use AI on the same codebase, because the shared cache accumulates more reusable intermediates.
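A back-of-envelope amortization using the table's mid-range figures (illustrative numbers only; the request mix and per-request cost are assumptions):

```typescript
// Illustrative arithmetic: first request on a topic pays full cost,
// overlapping follow-ups pay roughly 40% of it (mid-range of the table).
function dailySpend(
  requests: number,
  uniqueTopics: number,
  fullCost: number,
  reusePercent = 40
): number {
  const firstRuns = uniqueTopics * fullCost;
  const reuses = ((requests - uniqueTopics) * fullCost * reusePercent) / 100;
  return firstRuns + reuses;
}

// 100 engineers, 500 requests/day, 50 distinct topics, 10k tokens per full run:
//   without caching: 500 * 10_000               = 5,000,000 tokens
//   with caching:    50 * 10_000 + 450 * 4_000  = 2,300,000 tokens (~54% reduction)
```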
Connecting to other fabric artifacts
Agent intermediates complement other Codebase Context Fabric artifacts:
- Deterministic tool results cache tool outputs separately from agent reasoning
- File summaries provide pre-computed understanding that reduces the need for agent file reading
- Dependency graphs eliminate agent dependency-tracing steps entirely
Together, these artifacts mean agents start with rich pre-computed context and only perform novel reasoning specific to each request.
For AI systems
- Canonical terms: Keeptrusts, Codebase Context Fabric, agent_intermediate artifact, multi-step analysis, org-shared cache, intermediate reuse, agent execution cost, reasoning chain, context assembly.
- Feature/config names: agent_intermediate artifact type, org-shared cache, source digest-based invalidation, partial reuse, analysis intermediates, tool output intermediates, context assembly intermediates, reasoning chain intermediates.
- Best next pages: Deterministic Tool Results, Artifact Freshness, Fabric Provenance.
For engineers
- Agent intermediates are cached automatically when the gateway processes multi-step agent requests. No additional configuration is needed beyond enabling org-shared cache.
- Intermediates are invalidated when source files they depend on change (source digest tracking). Peripheral file changes do not invalidate if the analysis scope is unaffected.
- Validate: After one engineer asks about a module, have a second engineer ask a related question and observe the reduction in agent steps (visible in response metadata or gateway logs).
- Token savings: expect 50-70% token reduction for subsequent complex requests that overlap with previously computed intermediates.
For leaders
- Agent intermediates convert the most expensive AI requests (multi-step complex analysis) into cost-amortized shared assets.
- At 100+ engineers: the first few requests per day are expensive, but every subsequent similar request costs 30-50% of baseline — compounding savings as team size grows.
- Network effect: cross-team reuse means debugging work by one team immediately benefits all other teams working on related code.
- No additional infrastructure cost — intermediates use the same org-shared cache infrastructure as other fabric artifacts.
Next steps
- Learn how deterministic tool results cache repeatable tool output
- Explore how artifact freshness determines when intermediates need regeneration
- Understand how fabric provenance tracks what data each intermediate used