Agent Intermediate Artifacts: Reusing Agent Work

AI agents perform multi-step analysis when answering complex questions. They read files, run tools, reason through problems, and produce intermediate results before delivering a final answer. The agent_intermediate artifact caches these intermediate work products so the next agent request — from any engineer in your organization — reuses them instead of repeating the same work.

Use this page when

  • You want to understand how multi-step agent analysis results are cached and shared across your organization.
  • You need to reduce agent execution cost by reusing intermediate reasoning, tool outputs, and context assembly from prior requests.
  • You are evaluating the cost impact of agent_intermediate artifacts at scale (100+ engineers).

Primary audience

  • Primary: AI Agents, Technical Engineers
  • Secondary: Technical Leaders

What agent_intermediate artifacts capture

Each agent_intermediate artifact records:

  • Analysis step output — the result of a specific reasoning or computation step
  • Tool call results — outputs from tools the agent invoked during analysis
  • Reasoning chains — structured reasoning that led to intermediate conclusions
  • Partial computations — incomplete but reusable analysis that applies to future requests
  • Context assembly — the gathered context an agent assembled before answering
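These fields can be sketched as a record type. This is a hypothetical shape for illustration only; the field names are not the actual Keeptrusts schema.

```typescript
// Hypothetical shape of an agent_intermediate artifact record.
// Field names are illustrative, not the actual wire format.
interface AgentIntermediate {
  artifact: "agent_intermediate";
  repository: string;                    // e.g. "your-org/your-service"
  scope: string;                         // path the analysis covers
  generatedAt: string;                   // ISO 8601 timestamp
  step: string;                          // e.g. "dependency_analysis"
  result: unknown;                       // step output: tool result, reasoning chain, etc.
  sourceDigests: Record<string, string>; // file path -> content digest, for invalidation
}

const example: AgentIntermediate = {
  artifact: "agent_intermediate",
  repository: "your-org/your-service",
  scope: "src/payments/",
  generatedAt: "2026-04-30T11:05:00Z",
  step: "dependency_analysis",
  result: { directDependencies: ["stripe-sdk"], circularDependencies: [] },
  sourceDigests: { "src/payments/processor.ts": "sha256:<digest>" },
};
```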

How multi-step analysis generates intermediates

When you ask an AI agent a complex question, it often performs multiple steps:

  1. Identify relevant files and modules
  2. Read and analyze source code
  3. Run diagnostic tools (type checking, linting, dependency analysis)
  4. Reason about the relationship between components
  5. Synthesize a final answer

Each step produces intermediate results. Without caching, these results vanish after the agent responds. The next engineer who asks a similar question triggers the entire process again.

With agent_intermediate caching, steps 1-4 produce cached artifacts. The next request that needs the same intermediate results retrieves them from the cache and jumps directly to step 5.
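The lookup-or-compute loop can be sketched as follows. The cache API, keys, and step names here are illustrative assumptions; the real gateway performs this transparently.

```typescript
// Minimal sketch of intermediate reuse: each analysis step checks the
// org-shared cache before recomputing. Keys and step names are illustrative.
const orgCache = new Map<string, string>();
let computedSteps = 0;

function runStep(key: string, compute: () => string): string {
  const hit = orgCache.get(key);
  if (hit !== undefined) return hit;  // reuse a prior engineer's work
  const result = compute();           // expensive: tool calls, reasoning
  computedSteps++;
  orgCache.set(key, result);
  return result;
}

function answer(question: string): string {
  // Steps 1-4 are cacheable; step 5 (synthesis) is always request-specific.
  const files = runStep("auth:identify_files", () => "15 relevant files");
  const analysis = runStep("auth:analyze_code", () => "module analysis");
  const diagnostics = runStep("auth:run_tools", () => "type-check output");
  const relations = runStep("auth:map_dependencies", () => "dependency map");
  return `Answer to "${question}" from ${files}, ${analysis}, ${diagnostics}, ${relations}`;
}

answer("How does the auth module work?");             // computes steps 1-4
const firstRunSteps = computedSteps;                  // 4 steps computed
answer("How is session handling done?");              // reuses all four intermediates
const secondRunSteps = computedSteps - firstRunSteps; // 0 steps computed
```

The second question jumps straight to synthesis: all four cacheable steps come back as hits.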

Dramatic reduction in agent execution cost

Agent execution cost scales with the number of steps an agent performs. Each tool call, file read, and reasoning step consumes tokens and time. For complex analysis across a large codebase, a single request might involve dozens of intermediate steps.

When intermediates are cached and shared:

  • Subsequent requests skip completed steps — if the analysis was already done, the agent reuses it
  • Cross-engineer reuse — when one engineer's request generates intermediates, every other engineer benefits
  • Compounding savings — the more engineers use AI on the same codebase, the more intermediates accumulate

For a 100+ engineer team, this means the first few requests of the day are expensive, but every subsequent similar request is dramatically cheaper.

Sharing intermediates across the organization

The org-shared cache makes intermediates available to every engineer. This creates a network effect:

  • Morning: Engineer A asks about the auth module. The agent reads 15 files, runs type analysis, and maps dependencies. These intermediates are cached.
  • Midday: Engineer B asks about session handling in the same auth module. The agent retrieves cached file analyses and dependency maps, performing only the session-specific reasoning.
  • Afternoon: Engineer C asks about auth module test coverage. The agent retrieves cached file summaries and immediately focuses on test analysis.

Each subsequent request costs a fraction of the first because the shared intermediates eliminate redundant work.

Types of cached intermediates

The cache stores several categories of intermediate artifacts:

Analysis intermediates

When an agent analyzes a module's architecture, it produces structured understanding of component relationships, data flow, and responsibility boundaries. This analysis applies to many future questions about the same module.

Tool output intermediates

When an agent runs tools like type checkers, linters, or dependency analyzers, the outputs are deterministic for a given source state. These outputs are cached and reused until the source changes.
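Because tool output is deterministic for a given source state, a content digest of the source can serve as the cache key. A minimal sketch, assuming a hypothetical cache API:

```typescript
import { createHash } from "node:crypto";

// Sketch of deterministic tool-output caching: the key includes a digest of
// the source, so cached output is reused only while the source is unchanged.
const toolCache = new Map<string, string>();

function digest(source: string): string {
  return createHash("sha256").update(source).digest("hex");
}

function runTool(tool: string, source: string, run: (src: string) => string): string {
  const key = `${tool}:${digest(source)}`;
  const hit = toolCache.get(key);
  if (hit !== undefined) return hit;  // same tool, same source: reuse
  const output = run(source);
  toolCache.set(key, output);
  return output;
}

let invocations = 0;
const lint = (src: string) => { invocations++; return `${src.length} chars, 0 warnings`; };

runTool("linter", "const x = 1;", lint); // computed
runTool("linter", "const x = 1;", lint); // cached: source digest unchanged
runTool("linter", "const x = 2;", lint); // recomputed: source changed
```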

Context assembly intermediates

Before answering a question, an agent assembles relevant context from across the codebase. This assembly — which files are relevant, what their roles are, how they connect — is reusable for related questions.

Reasoning chain intermediates

When an agent reasons through a complex problem, the structured reasoning steps are cached. Future requests that require the same reasoning path reuse the cached chain instead of re-deriving it.

Intermediate artifact structure

A typical agent_intermediate artifact contains:

Artifact: agent_intermediate
Repository: your-org/your-service
Scope: src/payments/
Generated: 2026-04-30T11:05:00Z
Step: dependency_analysis

Result:
Module: src/payments/processor.ts
Direct Dependencies: [stripe-sdk, src/auth/tokens, src/db/transactions]
Transitive Dependencies: [src/config/secrets, src/logging/structured]
Dependency Count: 5 direct, 12 transitive
Circular Dependencies: none

Context:
Triggered By: "How does the payment processor handle retries?"
Reusable For: Any question about payment module dependencies,
integration points, or refactoring impact

Freshness and invalidation

Intermediate artifacts are invalidated when the source code they analyze changes. The cache tracks which files and commits each intermediate depends on. When those files change:

  • Full invalidation — if the core analyzed files change, the intermediate is discarded
  • Partial reuse — if peripheral files change, the intermediate may still be valid for its specific analysis scope

This ensures you never get stale analysis while maximizing reuse of still-valid intermediates.
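The full-versus-partial distinction follows from checking only the files an intermediate actually depends on. A sketch, with illustrative names:

```typescript
// Sketch of source-digest invalidation: each intermediate records the digests
// of the files it depends on and is valid only while those digests still
// match the current source. Names are illustrative.
interface Intermediate {
  step: string;
  dependsOn: Record<string, string>; // file path -> digest at analysis time
}

function isFresh(art: Intermediate, currentDigests: Record<string, string>): boolean {
  // Only files the intermediate depends on are checked, so changes to
  // peripheral files leave it valid (the "partial reuse" case).
  return Object.entries(art.dependsOn).every(
    ([file, d]) => currentDigests[file] === d
  );
}

const depMap: Intermediate = {
  step: "dependency_analysis",
  dependsOn: { "src/payments/processor.ts": "abc123" },
};

// Peripheral change: a README edit does not invalidate the analysis.
const stillValid = isFresh(depMap, { "src/payments/processor.ts": "abc123", "README.md": "zzz" });
// Core change: the analyzed file changed, so the intermediate is discarded.
const invalidated = !isFresh(depMap, { "src/payments/processor.ts": "def456" });
```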

Cost impact at scale

For a 100+ engineer organization, the cost savings from intermediate reuse compound significantly:

| Metric | Without intermediates | With intermediates |
| --- | --- | --- |
| Steps per complex request | 8-15 | 2-4 |
| Token consumption | Full | 30-50% of full |
| Response latency | 10-30 seconds | 3-8 seconds |
| Daily org token spend | Baseline | 40-60% reduction |

These savings grow as more engineers use AI on the same codebase, because the shared cache accumulates more reusable intermediates.
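A back-of-envelope estimate shows how the per-request figures combine into the org-level reduction. The request volume, per-request token cost, and hit rate below are illustrative assumptions, not measured values:

```typescript
// Back-of-envelope savings estimate. All inputs are illustrative assumptions.
const requestsPerDay = 100 * 3;      // assume 100 engineers, 3 complex requests each
const tokensPerColdRequest = 50_000; // assumed full-cost (cache-miss) request
const warmFraction = 0.4;            // warm requests use ~30-50% of full tokens
const cacheHitRate = 0.7;            // assumed share of requests overlapping prior work

const baseline = requestsPerDay * tokensPerColdRequest;
const withIntermediates =
  requestsPerDay * (1 - cacheHitRate) * tokensPerColdRequest +
  requestsPerDay * cacheHitRate * tokensPerColdRequest * warmFraction;

const reduction = 1 - withIntermediates / baseline; // 0.42, inside the 40-60% range
```

Raising the hit rate (more engineers working on the same code) pushes the reduction toward the top of the range, which is the compounding effect described above.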

Connecting to other fabric artifacts

Agent intermediates complement other Codebase Context Fabric artifacts:

  • Deterministic tool results cache tool outputs separately from agent reasoning
  • File summaries provide pre-computed understanding that reduces the need for agent file reading
  • Dependency graphs eliminate agent dependency-tracing steps entirely

Together, these artifacts mean agents start with rich pre-computed context and only perform novel reasoning specific to each request.

For AI systems

  • Canonical terms: Keeptrusts, Codebase Context Fabric, agent_intermediate artifact, multi-step analysis, org-shared cache, intermediate reuse, agent execution cost, reasoning chain, context assembly.
  • Feature/config names: agent_intermediate artifact type, org-shared cache, source digest-based invalidation, partial reuse, analysis intermediates, tool output intermediates, context assembly intermediates, reasoning chain intermediates.
  • Best next pages: Deterministic Tool Results, Artifact Freshness, Fabric Provenance.

For engineers

  • Agent intermediates are cached automatically when the gateway processes multi-step agent requests. No additional configuration is needed beyond enabling org-shared cache.
  • Intermediates are invalidated when source files they depend on change (source digest tracking). Peripheral file changes do not invalidate if the analysis scope is unaffected.
  • Validate: After one engineer asks about a module, have a second engineer ask a related question and observe the reduction in agent steps (visible in response metadata or gateway logs).
  • Token savings: expect 50-70% token reduction for subsequent complex requests that overlap with previously computed intermediates.

For leaders

  • Agent intermediates convert the most expensive AI requests (multi-step complex analysis) into cost-amortized shared assets.
  • At 100+ engineers: the first few requests per day are expensive, but every subsequent similar request costs 30-50% of baseline — compounding savings as team size grows.
  • Network effect: cross-team reuse means debugging work by one team immediately benefits all other teams working on related code.
  • No additional infrastructure cost — intermediates use the same org-shared cache infrastructure as other fabric artifacts.

Next steps