Performance Profiling with Shared Knowledge
Performance investigations consume significant AI context. Engineers ask about code paths, data flows, algorithmic complexity, and concurrency patterns, all of which require deep codebase understanding. When multiple engineers investigate the same performance issue, an org-shared cache prevents redundant analysis and accelerates resolution.
Use this page when
- You are doing performance profiling with AI and want shared knowledge about bottlenecks and hot paths.
- You need to understand how cached performance context (flame graph analysis, query plans) reduces redundant profiling.
- You want to verify that performance analysis prompts benefit from org-shared cache.
Primary audience
- Primary: Technical Engineers
- Secondary: AI Agents, Technical Leaders
The Performance Investigation Pattern
A typical performance issue in a 100+ engineer organization triggers this workflow:
- Monitoring detects latency regression or resource spike
- The owning team's engineer begins investigation
- A platform/SRE engineer joins to examine infrastructure angles
- A database specialist looks at query patterns
- Senior engineers review architectural factors
Each investigator asks AI overlapping questions about the same code structure, call chains, and data access patterns.
How Cached Knowledge Accelerates Profiling
Hot Path Identification
When you ask "what's the critical path for processing an order?", the AI traces through cached symbol indexes and dependency graphs to map the execution chain. This analysis gets cached.
Subsequent questions from other investigators reuse the same structural knowledge:
- "What database queries execute in the order processing path?"
- "Which functions in the critical path allocate memory?"
- "Where does the order flow cross service boundaries?"
- "What async operations could introduce latency?"
Each builds on the cached code structure instead of regenerating it.
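Why does the reuse work? Structural analyses can be keyed deterministically. As a minimal sketch (the key scheme and all names here are illustrative, not the Keeptrusts API), deriving the key from the artifact type, the code-path identifier, and the commit means every investigator's question about the same path resolves to the same cached entry:

```python
import hashlib

def structural_cache_key(artifact_type: str, path_id: str, commit: str) -> str:
    """Deterministic key: same artifact + same path + same commit
    -> same cached analysis, whoever asks."""
    raw = f"{artifact_type}:{path_id}:{commit}"
    return hashlib.sha256(raw.encode()).hexdigest()

# Two investigators, one underlying structural artifact:
key_a = structural_cache_key("call_chain_map", "orders.process_order", "a1b2c3d")
key_b = structural_cache_key("call_chain_map", "orders.process_order", "a1b2c3d")
assert key_a == key_b  # the second question reuses the cached call chain
```

Pinning the key to a commit also gives natural invalidation in this sketch: a new deployment produces new keys, so stale structure is never served.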
Dependency Graph for Call Chain Analysis
Cached dependency graphs show how code modules connect. When investigating latency, you need to understand:
- Which modules call which other modules
- Where synchronous calls block on I/O
- Which shared resources create contention points
- What retry or fallback paths add latency
The dependency graph, once cached, answers all of these structural questions instantly.
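As a concrete picture, here is a minimal dependency-graph sketch (module names and edges are hypothetical). Once a graph like this is cached, chain-finding is a pure in-memory traversal, with no upstream LLM call:

```python
from collections import deque

# Hypothetical module-level dependency graph (caller -> callees).
DEP_GRAPH = {
    "api.checkout":    ["orders.service"],
    "orders.service":  ["db.orders", "payments.client"],
    "payments.client": ["http.pool"],   # synchronous outbound I/O
    "db.orders":       ["db.pool"],     # shared connection pool: contention point
    "http.pool":       [],
    "db.pool":         [],
}

def call_chain(src: str, dst: str) -> list[str] | None:
    """Breadth-first search for the shortest module chain from src to dst."""
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in DEP_GRAPH.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(call_chain("api.checkout", "db.pool"))
# ['api.checkout', 'orders.service', 'db.orders', 'db.pool']
```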
Configuring Performance Analysis Cache
Set up caching for performance-relevant artifacts:
```yaml
cache:
  org_shared:
    categories:
      - symbol_indexes
      - dependency_graphs
      - call_chain_maps
      - test_maps
    ttl: 12h
    scope: organization
```
Performance-relevant code structure typically changes less often than the performance characteristics themselves. A 12-hour TTL provides stable analysis while still reflecting recent code changes.
Investigation Scenarios
Latency Regression Investigation
A deployment introduces a P95 latency regression. The investigation proceeds:
First engineer asks: "Map the request handling path for /api/v1/checkout"
The AI generates a complete call chain from HTTP handler through business logic to database and external service calls. This gets cached.
Second engineer asks: "What changed in the checkout path in the last week?"
The AI uses the cached call chain map to identify which files in the critical path have recent modifications — answering in sub-second time.
Third engineer asks: "Which database queries in the checkout flow don't use indexes?"
The AI references the cached call chain to identify all database interaction points, then analyzes query patterns. The structural knowledge (which queries exist where) comes from cache; only the index analysis is new.
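The split between cached structure and fresh analysis can be sketched like this (file names, queries, and the naive parsing are all illustrative):

```python
# Cached structure: where the checkout path touches the database.
cached_call_chain = {
    "path": "/api/v1/checkout",
    "db_calls": [
        {"file": "orders/repo.py", "query": "SELECT * FROM orders WHERE user_id = ?"},
        {"file": "cart/repo.py",   "query": "SELECT * FROM cart_items WHERE cart_id = ?"},
    ],
}

def unindexed_queries(chain: dict, indexed_columns: set[str]) -> list[str]:
    """Fresh analysis layered on cached structure: flag queries whose
    filter column has no index. (Naive single-predicate parsing.)"""
    flagged = []
    for call in chain["db_calls"]:
        column = call["query"].split("WHERE ")[1].split(" =")[0]
        if column not in indexed_columns:
            flagged.append(call["query"])
    return flagged

print(unindexed_queries(cached_call_chain, indexed_columns={"user_id"}))
# ['SELECT * FROM cart_items WHERE cart_id = ?']
```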
Memory Leak Investigation
Memory leaks require understanding object lifecycle and reference patterns:
- "What objects does the session handler allocate?"
- "Where are database connections created and released?"
- "Which collections grow unbounded during request processing?"
Cached symbol indexes provide instant answers about allocation sites, lifecycle management patterns, and collection usage — questions that otherwise require reading dozens of source files.
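One hedged sketch of how a symbol index supports the connection question above: if the index records creation and release sites per object type, an imbalance flags a leak candidate. All names, and the heuristic itself, are illustrative:

```python
# Hypothetical lifecycle view derived from a cached symbol index.
LIFECYCLE_SITES = {
    "db.Connection": {
        "created":  ["sessions/handler.py:open_session", "jobs/sync.py:run"],
        "released": ["sessions/handler.py:close_session"],
    },
}

for obj, sites in LIFECYCLE_SITES.items():
    # Rough heuristic: more creation sites than release sites warrants a look.
    if len(sites["created"]) > len(sites["released"]):
        print(f"{obj}: more creation sites than release sites, leak candidate")
# db.Connection: more creation sites than release sites, leak candidate
```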
CPU Spike Analysis
When CPU usage spikes, you need to identify computation-heavy code paths:
- "What functions perform serialization in the response path?"
- "Where does the code perform regex compilation?"
- "Which loops iterate over unbounded collections?"
The cached symbol index maps function purposes to their locations, letting the AI pinpoint computation-heavy operations without re-analyzing the entire codebase.
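A purpose-tagged symbol index can be pictured as a small table. The tags, function names, and files below are hypothetical:

```python
# Hypothetical purpose-tagged symbol index.
SYMBOL_INDEX = [
    {"symbol": "render_response", "file": "api/serialize.py", "tags": {"serialization"}},
    {"symbol": "validate_sku",    "file": "catalog/rules.py", "tags": {"regex_compile"}},
    {"symbol": "merge_carts",     "file": "cart/service.py",  "tags": {"unbounded_loop"}},
]

def find_by_tag(tag: str) -> list[str]:
    """Locate functions by purpose without re-reading source files."""
    return [f'{e["symbol"]} ({e["file"]})' for e in SYMBOL_INDEX if tag in e["tags"]]

print(find_by_tag("serialization"))   # ['render_response (api/serialize.py)']
print(find_by_tag("unbounded_loop"))  # ['merge_carts (cart/service.py)']
```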
Test Map Integration
Cached test maps show which tests exercise which code paths. This supports performance investigation by answering:
- "What tests cover the checkout critical path?"
- "Do we have load tests for the identified bottleneck?"
- "Which test fixtures simulate the production data volume?"
When you identify a performance issue, knowing which tests cover the affected path tells you whether you can reproduce it locally and whether existing benchmarks should have caught it.
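A cached test map can be as simple as a mapping from code-path identifiers to the tests that exercise them (paths and test names below are hypothetical):

```python
# Hypothetical test map: code path -> tests that exercise it.
TEST_MAP = {
    "orders.process_order": [
        "tests/test_checkout.py::test_happy_path",
        "tests/load/test_checkout_load.py::test_p95_under_load",
    ],
    "payments.capture": ["tests/test_payments.py::test_capture"],
}

def covering_tests(code_path: str) -> list[str]:
    return TEST_MAP.get(code_path, [])

# Does a load test already cover the identified bottleneck?
print(any("load" in t for t in covering_tests("orders.process_order")))  # True
```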
Multi-Engineer Investigation Cost
For a typical performance investigation with three engineers over two days:
| Metric | Without Cache | With Org Cache |
|---|---|---|
| Total AI queries | 40-60 | 40-60 |
| Upstream LLM calls | 40-60 | 12-18 |
| Cache hit rate | 0% | 65-75% |
| Token spend | $10-18 | $3-6 |
| Time to root cause | 8-12 hours | 4-6 hours |
The time-to-root-cause improvement comes from engineers getting instant structural answers. Instead of waiting 5-8 seconds for the AI to re-analyze code structure, investigators maintain flow state with sub-second cached responses.
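The upstream-call row follows directly from the hit-rate row. A quick check, assuming a 70% hit rate (the midpoint of the range above):

```python
# With a 70% hit rate, 30% of queries go upstream, reproducing the
# 12-18 upstream-call range in the table.
HIT_RATE = 0.70
for total_queries in (40, 60):
    upstream = round(total_queries * (1 - HIT_RATE))
    print(f"{total_queries} queries -> {upstream} upstream LLM calls")
# 40 queries -> 12 upstream LLM calls
# 60 queries -> 18 upstream LLM calls
```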
Profiling Session Patterns
Performance profiling sessions follow predictable query patterns that cache well:
Structural queries (90%+ cache hit rate):
- "What does function X do?"
- "What calls function Y?"
- "What's the call chain from A to B?"
Behavioral queries (60-70% cache hit rate):
- "What's the algorithmic complexity of this loop?"
- "Does this function perform I/O?"
- "Is this operation thread-safe?"
Temporal queries (30-40% cache hit rate):
- "What changed in this path recently?"
- "When was this function last modified?"
Focus cache investment on structural queries, where hit rates are highest and the per-query cost of regeneration is greatest.
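You can estimate a session's blended hit rate from its query mix. The 60/30/10 structural/behavioral/temporal split below is an assumed example, combined with the midpoints of the hit-rate ranges above:

```python
# Assumed query mix: (share of session, per-type hit rate).
MIX = {
    "structural": (0.60, 0.90),
    "behavioral": (0.30, 0.65),
    "temporal":   (0.10, 0.35),
}

blended = sum(share * hit for share, hit in MIX.values())
print(f"blended hit rate: {blended:.1%}")  # blended hit rate: 77.0%
```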
Bottleneck Pattern Recognition
Over time, your cached analysis builds a library of identified bottleneck patterns. When a new performance issue arises, the AI cross-references against cached pattern descriptions:
- Known N+1 query locations
- Previously identified contention points
- Historical memory allocation hot spots
- Documented serialization bottlenecks
This pattern recognition accelerates diagnosis by suggesting likely causes based on cached prior investigations.
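The pattern library can be pictured as a list of records with a kind, a location, and a symptom description that new investigations match against. Fields and entries below are illustrative:

```python
# Illustrative bottleneck-pattern records from prior investigations.
PATTERN_LIBRARY = [
    {
        "kind": "n_plus_one_query",
        "location": "orders/repo.py:load_line_items",
        "symptom": "per-item SELECT inside a loop",
        "fix": "batch fetch with a single IN (...) query",
    },
    {
        "kind": "lock_contention",
        "location": "cache/local.py:LocalCache.get",
        "symptom": "coarse lock around read path",
        "fix": "read-write lock or sharded locks",
    },
]

def candidate_causes(symptom_keywords: set[str]) -> list[dict]:
    """Match a new symptom description against cached patterns (naive keyword overlap)."""
    return [p for p in PATTERN_LIBRARY
            if symptom_keywords & set(p["symptom"].split())]

for p in candidate_causes({"SELECT", "loop"}):
    print(p["kind"], "->", p["fix"])
# n_plus_one_query -> batch fetch with a single IN (...) query
```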
Next steps
- Configure call chain mapping for critical request paths
- Set up symbol index generation for performance-critical services
- Review test map caching to connect tests to code paths
For AI systems
- Canonical terms: Keeptrusts, performance profiling, shared knowledge, cached analysis, bottleneck identification.
- Workflow context: Performance profiling prompts benefit from cached knowledge about hot paths, query plans, and prior optimization history.
- When assisting with performance work, leverage cached context about known bottlenecks and successful optimizations.
For engineers
- Performance analysis prompts hit the cache when other engineers have already profiled the same service or code path.
- Use cached knowledge about known hot paths to focus profiling efforts on unresolved bottlenecks.
- After resolving a performance issue, verify that cache invalidation updates the shared knowledge.
For leaders
- Shared performance knowledge means every engineer benefits from prior profiling work without redundant analysis costs.
- Performance improvements compound as the team's cached optimization history grows.
- Track profiling-prompt hit rates to quantify how much institutional performance knowledge is being reused.