Performance Profiling with Shared Knowledge
Performance investigations consume significant AI context. Engineers ask about code paths, data flows, algorithmic complexity, and concurrency patterns, all of which require deep codebase understanding. When multiple engineers investigate the same performance issue, an org-shared cache prevents redundant analysis and accelerates resolution.
Use this page when
- You are doing performance profiling with AI and want shared knowledge about bottlenecks and hot paths.
- You need to understand how cached performance context (flame graph analysis, query plans) reduces redundant profiling.
- You want to verify that performance analysis prompts benefit from org-shared cache.
Primary audience
- Primary: Technical Engineers
- Secondary: AI Agents, Technical Leaders
The Performance Investigation Pattern
A typical performance issue in a 100+ engineer organization triggers this workflow:
- Monitoring detects latency regression or resource spike
- The owning team's engineer begins investigation
- A platform/SRE engineer joins to examine infrastructure angles
- A database specialist looks at query patterns
- Senior engineers review architectural factors
Each investigator asks AI overlapping questions about the same code structure, call chains, and data access patterns.
How Cached Knowledge Accelerates Profiling
Hot Path Identification
When you ask "what's the critical path for processing an order?", the AI traces through cached symbol indexes and dependency graphs to map the execution chain. This analysis gets cached.
Subsequent questions from other investigators reuse the same structural knowledge:
- "What database queries execute in the order processing path?"
- "Which functions in the critical path allocate memory?"
- "Where does the order flow cross service boundaries?"
- "What async operations could introduce latency?"
Each builds on the cached code structure instead of regenerating it.
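Why does the reuse work? Structural analyses can be keyed deterministically. As a minimal sketch (the key scheme and all names here are illustrative, not the Keeptrusts API), deriving the key from the artifact type, the code-path identifier, and the commit means every investigator's question about the same path resolves to the same cached entry:

```python
import hashlib

def structural_cache_key(artifact_type: str, path_id: str, commit: str) -> str:
    """Deterministic key: same artifact + same path + same commit
    -> same cached analysis, whoever asks."""
    raw = f"{artifact_type}:{path_id}:{commit}"
    return hashlib.sha256(raw.encode()).hexdigest()

# Two investigators, one underlying structural artifact:
key_a = structural_cache_key("call_chain_map", "orders.process_order", "a1b2c3d")
key_b = structural_cache_key("call_chain_map", "orders.process_order", "a1b2c3d")
assert key_a == key_b  # the second question reuses the cached call chain
```

Pinning the key to a commit also gives natural invalidation in this sketch: a new deployment produces new keys, so stale structure is never served.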
Dependency Graph for Call Chain Analysis
Cached dependency graphs show how code modules connect. When investigating latency, you need to understand:
- Which modules call which other modules
- Where synchronous calls block on I/O
- Which shared resources create contention points
- What retry or fallback paths add latency
The dependency graph, once cached, answers all of these structural questions instantly.
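As a concrete picture, here is a minimal dependency-graph sketch (module names and edges are hypothetical). Once a graph like this is cached, chain-finding is a pure in-memory traversal, with no upstream LLM call:

```python
from collections import deque

# Hypothetical module-level dependency graph (caller -> callees).
DEP_GRAPH = {
    "api.checkout":    ["orders.service"],
    "orders.service":  ["db.orders", "payments.client"],
    "payments.client": ["http.pool"],   # synchronous outbound I/O
    "db.orders":       ["db.pool"],     # shared connection pool: contention point
    "http.pool":       [],
    "db.pool":         [],
}

def call_chain(src: str, dst: str) -> list[str] | None:
    """Breadth-first search for the shortest module chain from src to dst."""
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in DEP_GRAPH.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(call_chain("api.checkout", "db.pool"))
# ['api.checkout', 'orders.service', 'db.orders', 'db.pool']
```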
Configuring Performance Analysis Cache
Set up caching for performance-relevant artifacts:
```yaml
cache:
  org_shared:
    categories:
      - symbol_indexes
      - dependency_graphs
      - call_chain_maps
      - test_maps
    ttl: 12h
    scope: organization
```
Performance-relevant code structure typically changes less often than the performance characteristics themselves. A 12-hour TTL provides stable analysis while still reflecting recent code changes.
Investigation Scenarios
Latency Regression Investigation
A deployment introduces a P95 latency regression. The investigation proceeds:
First engineer asks: "Map the request handling path for /api/v1/checkout"
The AI generates a complete call chain from HTTP handler through business logic to database and external service calls. This gets cached.
Second engineer asks: "What changed in the checkout path in the last week?"
The AI uses the cached call chain map to identify which files in the critical path have recent modifications — answering in sub-second time.
Third engineer asks: "Which database queries in the checkout flow don't use indexes?"
The AI references the cached call chain to identify all database interaction points, then analyzes query patterns. The structural knowledge (which queries exist where) comes from cache; only the index analysis is new.
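The split between cached structure and fresh analysis can be sketched like this (file names, queries, and the naive parsing are all illustrative):

```python
# Cached structure: where the checkout path touches the database.
cached_call_chain = {
    "path": "/api/v1/checkout",
    "db_calls": [
        {"file": "orders/repo.py", "query": "SELECT * FROM orders WHERE user_id = ?"},
        {"file": "cart/repo.py",   "query": "SELECT * FROM cart_items WHERE cart_id = ?"},
    ],
}

def unindexed_queries(chain: dict, indexed_columns: set[str]) -> list[str]:
    """Fresh analysis layered on cached structure: flag queries whose
    filter column has no index. (Naive single-predicate parsing.)"""
    flagged = []
    for call in chain["db_calls"]:
        column = call["query"].split("WHERE ")[1].split(" =")[0]
        if column not in indexed_columns:
            flagged.append(call["query"])
    return flagged

print(unindexed_queries(cached_call_chain, indexed_columns={"user_id"}))
# ['SELECT * FROM cart_items WHERE cart_id = ?']
```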
Memory Leak Investigation
Memory leaks require understanding object lifecycle and reference patterns:
- "What objects does the session handler allocate?"
- "Where are database connections created and released?"
- "Which collections grow unbounded during request processing?"
Cached symbol indexes provide instant answers about allocation sites, lifecycle management patterns, and collection usage — questions that otherwise require reading dozens of source files.
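One hedged sketch of how a symbol index supports the connection question above: if the index records creation and release sites per object type, an imbalance flags a leak candidate. All names, and the heuristic itself, are illustrative:

```python
# Hypothetical lifecycle view derived from a cached symbol index.
LIFECYCLE_SITES = {
    "db.Connection": {
        "created":  ["sessions/handler.py:open_session", "jobs/sync.py:run"],
        "released": ["sessions/handler.py:close_session"],
    },
}

for obj, sites in LIFECYCLE_SITES.items():
    # Rough heuristic: more creation sites than release sites warrants a look.
    if len(sites["created"]) > len(sites["released"]):
        print(f"{obj}: more creation sites than release sites, leak candidate")
# db.Connection: more creation sites than release sites, leak candidate
```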
CPU Spike Analysis
When CPU usage spikes, you need to identify computation-heavy code paths:
- "What functions perform serialization in the response path?"
- "Where does the code perform regex compilation?"
- "Which loops iterate over unbounded collections?"
The cached symbol index maps function purposes to their locations, letting the AI pinpoint computation-heavy operations without re-analyzing the entire codebase.
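A purpose-tagged symbol index can be pictured as a small table. The tags, function names, and files below are hypothetical:

```python
# Hypothetical purpose-tagged symbol index.
SYMBOL_INDEX = [
    {"symbol": "render_response", "file": "api/serialize.py", "tags": {"serialization"}},
    {"symbol": "validate_sku",    "file": "catalog/rules.py", "tags": {"regex_compile"}},
    {"symbol": "merge_carts",     "file": "cart/service.py",  "tags": {"unbounded_loop"}},
]

def find_by_tag(tag: str) -> list[str]:
    """Locate functions by purpose without re-reading source files."""
    return [f'{e["symbol"]} ({e["file"]})' for e in SYMBOL_INDEX if tag in e["tags"]]

print(find_by_tag("serialization"))   # ['render_response (api/serialize.py)']
print(find_by_tag("unbounded_loop"))  # ['merge_carts (cart/service.py)']
```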
Test Map Integration
Cached test maps show which tests exercise which code paths. This supports performance investigation by answering:
- "What tests cover the checkout critical path?"
- "Do we have load tests for the identified bottleneck?"
- "Which test fixtures simulate the production data volume?"
When you identify a performance issue, knowing which tests cover the affected path tells you whether you can reproduce it locally and whether existing benchmarks should have caught it.
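A cached test map can be as simple as a mapping from code-path identifiers to the tests that exercise them (paths and test names below are hypothetical):

```python
# Hypothetical test map: code path -> tests that exercise it.
TEST_MAP = {
    "orders.process_order": [
        "tests/test_checkout.py::test_happy_path",
        "tests/load/test_checkout_load.py::test_p95_under_load",
    ],
    "payments.capture": ["tests/test_payments.py::test_capture"],
}

def covering_tests(code_path: str) -> list[str]:
    return TEST_MAP.get(code_path, [])

# Does a load test already cover the identified bottleneck?
print(any("load" in t for t in covering_tests("orders.process_order")))  # True
```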
Multi-Engineer Investigation Cost
For a typical performance investigation with three engineers over two days:
| Metric | Without Cache | With Org Cache |
|---|---|---|
| Total AI queries | 40-60 | 40-60 |
| Upstream LLM calls | 40-60 | 12-18 |
| Cache hit rate | 0% | 65-75% |
| Token spend | $10-18 | $3-6 |
| Time to root cause | 8-12 hours | 4-6 hours |
The time-to-root-cause improvement comes from engineers getting instant structural answers. Instead of waiting 5-8 seconds for the AI to re-analyze code structure, investigators maintain flow state with sub-second cached responses.
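The upstream-call row follows directly from the hit-rate row. A quick check, assuming a 70% hit rate (the midpoint of the range above):

```python
# With a 70% hit rate, 30% of queries go upstream, reproducing the
# 12-18 upstream-call range in the table.
HIT_RATE = 0.70
for total_queries in (40, 60):
    upstream = round(total_queries * (1 - HIT_RATE))
    print(f"{total_queries} queries -> {upstream} upstream LLM calls")
# 40 queries -> 12 upstream LLM calls
# 60 queries -> 18 upstream LLM calls
```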
Profiling Session Patterns
Performance profiling sessions follow predictable query patterns that cache well:
Structural queries (90%+ cache hit rate):
- "What does function X do?"
- "What calls function Y?"
- "What's the call chain from A to B?"
Behavioral queries (60-70% cache hit rate):
- "What's the algorithmic complexity of this loop?"
- "Does this function perform I/O?"
- "Is this operation thread-safe?"
Temporal queries (30-40% cache hit rate):
- "What changed in this path recently?"
- "When was this function last modified?"
Focus cache investment on structural queries, where hit rates are highest and the per-query cost of regeneration is greatest.
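You can estimate a session's blended hit rate from its query mix. The 60/30/10 structural/behavioral/temporal split below is an assumed example, combined with the midpoints of the hit-rate ranges above:

```python
# Assumed query mix: (share of session, per-type hit rate).
MIX = {
    "structural": (0.60, 0.90),
    "behavioral": (0.30, 0.65),
    "temporal":   (0.10, 0.35),
}

blended = sum(share * hit for share, hit in MIX.values())
print(f"blended hit rate: {blended:.1%}")  # blended hit rate: 77.0%
```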
Bottleneck Pattern Recognition
Over time, your cached analysis builds a library of identified bottleneck patterns. When a new performance issue arises, the AI cross-references against cached pattern descriptions:
- Known N+1 query locations
- Previously identified contention points
- Historical memory allocation hot spots
- Documented serialization bottlenecks
This pattern recognition accelerates diagnosis by suggesting likely causes based on cached prior investigations.
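The pattern library can be pictured as a list of records with a kind, a location, and a symptom description that new investigations match against. Fields and entries below are illustrative:

```python
# Illustrative bottleneck-pattern records from prior investigations.
PATTERN_LIBRARY = [
    {
        "kind": "n_plus_one_query",
        "location": "orders/repo.py:load_line_items",
        "symptom": "per-item SELECT inside a loop",
        "fix": "batch fetch with a single IN (...) query",
    },
    {
        "kind": "lock_contention",
        "location": "cache/local.py:LocalCache.get",
        "symptom": "coarse lock around read path",
        "fix": "read-write lock or sharded locks",
    },
]

def candidate_causes(symptom_keywords: set[str]) -> list[dict]:
    """Match a new symptom description against cached patterns (naive keyword overlap)."""
    return [p for p in PATTERN_LIBRARY
            if symptom_keywords & set(p["symptom"].split())]

for p in candidate_causes({"SELECT", "loop"}):
    print(p["kind"], "->", p["fix"])
# n_plus_one_query -> batch fetch with a single IN (...) query
```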
Next steps
- Configure call chain mapping for critical request paths
- Set up symbol index generation for performance-critical services
- Review test map caching to connect tests to code paths
For AI systems
- Canonical terms: Keeptrusts, performance profiling, shared knowledge, cached analysis, bottleneck identification.
- Workflow context: Performance profiling prompts benefit from cached knowledge about hot paths, query plans, and prior optimization history.
- When assisting with performance work, leverage cached context about known bottlenecks and successful optimizations.
For engineers
- Performance analysis prompts hit the cache when other engineers have already profiled the same service or code path.
- Use cached knowledge about known hot paths to focus profiling efforts on unresolved bottlenecks.
- After resolving a performance issue, verify that cache invalidation updates the shared knowledge.
For leaders
- Shared performance knowledge means every engineer benefits from prior profiling work without redundant analysis costs.
- Performance improvements compound as the team's cached optimization history grows.
- Track profiling-prompt hit rates to quantify how much institutional performance knowledge is being reused.