
Fabric Provenance and Audit Trail

Every artifact in the Codebase Context Fabric carries provenance metadata that records exactly what data was used to produce it. You can trace any cached artifact back to its source inputs, verify its integrity, and audit how it contributed to AI decisions — all without storing raw prompt content, response text, or source code in audit records.

Use this page when

  • You need to understand how fabric artifacts carry provenance metadata for auditability (source digests, generator version, timestamps).
  • You are building audit trails that trace AI decisions back to source inputs without storing raw content.
  • You need to verify artifact integrity or demonstrate data lineage for compliance.

Primary audience

  • Primary: AI Agents, Technical Engineers
  • Secondary: Technical Leaders

What provenance records

Each fabric artifact includes provenance metadata with:

  • Source repository — which repository the artifact was generated from
  • Commit reference — the exact commit SHA at generation time
  • Input file digests — SHA-256 hashes of every file that contributed to the artifact
  • Generator identity — which agent version and configuration produced the artifact
  • Generation timestamp — when the artifact was created
  • Configuration digest — hash of the policy and cache configuration in effect
  • Parent artifacts — references to any other fabric artifacts used as inputs

Why provenance matters

In regulated environments and large engineering organizations, you need to answer questions like:

  • "What data was the AI working with when it suggested this change?"
  • "Which version of the code did the cached analysis reflect?"
  • "Has the source data changed since this artifact was generated?"
  • "Who or what produced this cached result?"

Provenance metadata answers all of these questions without requiring you to store sensitive source code or conversation content in audit logs.

Provenance in decision events

When your AI uses a fabric artifact to answer a question, the decision event records which artifacts were consulted. This creates a complete audit chain:

  1. Engineer asks a question → decision event created
  2. AI retrieves fabric artifacts → artifact references recorded in the event
  3. Each artifact has provenance → source digests and generation metadata available
  4. Audit query → trace from decision through artifacts to original source state

This chain gives you full visibility into AI behavior without storing the actual prompts or responses.
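The backward walk in step 4 can be sketched as follows. This is only an illustration under stated assumptions: records are plain dictionaries keyed like the provenance fields on this page, `artifact_ids` is a hypothetical name for the references a decision event holds, and `parent_artifacts` is assumed to contain bare artifact IDs.

```python
def trace_to_sources(event: dict, store: dict[str, dict]) -> dict[str, set[str]]:
    """Walk from a decision event through artifacts (and parents) to source inputs.

    `event["artifact_ids"]` is a hypothetical field holding artifact references.
    `store` maps artifact_id -> provenance record (metadata only).
    Returns {path: {digests}} for every input reachable from the event.
    """
    sources: dict[str, set[str]] = {}
    pending = list(event["artifact_ids"])
    seen: set[str] = set()
    while pending:
        artifact_id = pending.pop()
        # Skip artifacts already visited or unknown to the store.
        if artifact_id in seen or artifact_id not in store:
            continue
        seen.add(artifact_id)
        record = store[artifact_id]
        for item in record.get("inputs", []):
            sources.setdefault(item["path"], set()).add(item["digest"])
        # Parent artifacts contribute their own inputs to the chain.
        pending.extend(record.get("parent_artifacts", []))
    return sources
```

Only paths and digests come back from the walk, so the result can be stored or shared under the same metadata-only rule as the audit records themselves.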

No raw content in audit records

A critical design principle: provenance records contain only metadata, never raw content. Audit records include:

  • Included: file paths, content digests, commit SHAs, timestamps, artifact IDs, generator versions
  • Not included: file contents, prompt text, response text, source code, conversation history

This separation means your audit trail is safe to retain long-term, share with compliance teams, and store in systems that should not hold source code or intellectual property.
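One way to enforce this principle before a record is written is an allowlist check. The sketch below is an assumption for illustration, not a documented Keeptrusts API: the field names in `ALLOWED_FIELDS` are hypothetical stand-ins for the "included" categories listed above.

```python
# Hypothetical field names standing in for the metadata categories above.
ALLOWED_FIELDS = frozenset({
    "path", "digest", "commit_sha", "timestamp", "artifact_id", "generator_version",
})


def assert_metadata_only(record: dict) -> None:
    """Raise ValueError if an audit record carries any field outside the allowlist."""
    unexpected = sorted(set(record) - ALLOWED_FIELDS)
    if unexpected:
        raise ValueError(f"audit record contains non-metadata fields: {unexpected}")
```

An allowlist is used rather than a denylist so that a newly introduced field is rejected by default until someone confirms it is metadata.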

Provenance structure

A typical provenance record attached to a fabric artifact:

Provenance:
artifact_id: fab_8a3b2c1d
artifact_type: file_summary
repository: your-org/your-service
branch: main
commit_sha: abc1234def5678
generated_at: 2026-04-30T14:22:00Z

generator:
  agent_version: 2.4.1
  config_digest: sha256:f1e2d3...
  policy_digest: sha256:a4b5c6...

inputs:
  - path: src/auth/session.ts
    digest: sha256:7890ab...
    size_bytes: 8432
  - path: src/auth/types.ts
    digest: sha256:cdef01...
    size_bytes: 2104

parent_artifacts:
  - fab_2e4f6a8b (dependency_graph for src/auth/)

entitlement_digest: sha256:112233...
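If you consume these records in your own tooling, a typed model can make the nesting explicit. The classes below are a sketch that mirrors the field names in the record above; the class names and the `parse_generated_at` helper are assumptions, not part of any published SDK.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class InputFile:
    path: str
    digest: str       # e.g. "sha256:..." as shown in the record
    size_bytes: int


@dataclass(frozen=True)
class Generator:
    agent_version: str
    config_digest: str
    policy_digest: str


@dataclass(frozen=True)
class Provenance:
    artifact_id: str
    artifact_type: str
    repository: str
    branch: str
    commit_sha: str
    generated_at: datetime
    generator: Generator
    inputs: tuple[InputFile, ...]
    parent_artifacts: tuple[str, ...]
    entitlement_digest: str


def parse_generated_at(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as 2026-04-30T14:22:00Z."""
    # datetime.fromisoformat accepts a trailing "Z" only from Python 3.11 on,
    # so normalise it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))
```

Frozen dataclasses are chosen here because a provenance record describes a past event and should not be mutated after it is loaded.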

Verifying artifact integrity

You can verify any artifact's integrity by recomputing its input digests:

  1. Check out the recorded commit SHA
  2. Compute SHA-256 of each listed input file
  3. Compare against the recorded digests
  4. If all match, the recorded inputs are exactly the files at that commit, so the provenance describes the claimed source state

This verification is fully deterministic and does not require access to the cached artifact content itself.
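Steps 2 and 3 can be sketched with Python's standard `hashlib`. The sketch assumes that `checkout_root` is a working tree already checked out at the recorded commit SHA, and that recorded digests are hex-encoded SHA-256 values over the raw file bytes with a `sha256:` prefix. The prefix form matches the example record, but the exact encoding is an assumption.

```python
import hashlib
from pathlib import Path


def sha256_digest(path: Path) -> str:
    """Return the file's SHA-256 in the assumed 'sha256:<hex>' form."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            hasher.update(chunk)
    return "sha256:" + hasher.hexdigest()


def verify_inputs(checkout_root: Path, inputs: list[dict]) -> list[str]:
    """Return the paths whose current digest differs from the recorded one.

    An empty list means every listed input matches. A missing file
    counts as a mismatch.
    """
    mismatched = []
    for item in inputs:
        file_path = checkout_root / item["path"]
        if not file_path.is_file() or sha256_digest(file_path) != item["digest"]:
            mismatched.append(item["path"])
    return mismatched
```

Returning the list of mismatched paths, rather than a single boolean, tells an auditor which inputs diverged from the record.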

Provenance for compliance

Organizations with compliance requirements use fabric provenance to demonstrate:

  • Data lineage — exactly what source data influenced AI outputs
  • Temporal accuracy — that AI worked with the code state at a specific point in time
  • Configuration compliance — that organizational policies were in effect during generation
  • Access control — that artifacts were generated within authorized entitlement boundaries

Provenance across artifact types

Every artifact type in the Codebase Context Fabric includes provenance, but the inputs vary:

Artifact Type               Provenance Inputs
repo_map                    Directory tree state, file metadata
file_summary                Specific file content digest
dependency_graph            Import statement digests across tracked files
recent_change_summary       Commit metadata, diff statistics
known_failure_fingerprint   Error output digests, resolution metadata
tool_result                 Tool binary hash, input file digests, config hash
agent_intermediate          Source file digests, parent artifact references

Querying the audit trail

You query fabric provenance through decision events. Each decision event references the artifacts that contributed to it. From there, you can:

  • Trace forward — from a source file change, find all artifacts that depend on it
  • Trace backward — from a decision event, find all source inputs through artifact provenance
  • Scope by time — find all artifacts generated within a time window
  • Filter by generator — find artifacts produced by a specific agent version
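The forward trace, time scoping, and generator filter can be illustrated over an in-memory list of provenance records. This is a sketch only: the function names are hypothetical, records are assumed to be dictionaries keyed like the provenance fields, `generated_at` is assumed to be an already-parsed `datetime`, and `parent_artifacts` is assumed to hold bare artifact IDs. The real query interface is not described on this page.

```python
from datetime import datetime


def trace_forward(path: str, records: list[dict]) -> set[str]:
    """IDs of artifacts that list `path` as an input, plus artifacts built on them."""
    affected = {
        r["artifact_id"]
        for r in records
        if any(item["path"] == path for item in r["inputs"])
    }
    # Repeatedly pull in artifacts whose parents are already affected.
    changed = True
    while changed:
        changed = False
        for r in records:
            if r["artifact_id"] not in affected and affected & set(r["parent_artifacts"]):
                affected.add(r["artifact_id"])
                changed = True
    return affected


def generated_between(records: list[dict], start: datetime, end: datetime) -> list[str]:
    """IDs of artifacts generated in the half-open window [start, end)."""
    return [r["artifact_id"] for r in records if start <= r["generated_at"] < end]


def by_agent_version(records: list[dict], version: str) -> list[str]:
    """IDs of artifacts produced by a specific agent version."""
    return [r["artifact_id"] for r in records if r["generator"]["agent_version"] == version]
```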

Provenance and artifact freshness

Provenance metadata is the foundation of freshness evaluation. The cache uses recorded input digests to determine whether an artifact is still valid:

  • Compare recorded file digests against current file state
  • Compare recorded config digest against current configuration
  • Compare recorded agent version against current agent version

If any of these comparisons shows a change, the artifact is stale. Because each check is a digest or version comparison, freshness evaluation is precise and efficient.
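The three comparisons above can be expressed as a single check. The function below is a minimal sketch, assuming records shaped like the provenance example on this page; `stale_reasons` and its parameters are hypothetical names, and how the cache obtains the current digests is not specified here.

```python
def stale_reasons(
    record: dict,
    current_digests: dict[str, str],
    current_config_digest: str,
    current_agent_version: str,
) -> list[str]:
    """Return why an artifact is stale; an empty list means it is still fresh."""
    reasons = []
    for item in record["inputs"]:
        # A file that no longer exists has no current digest, so it also differs.
        if current_digests.get(item["path"]) != item["digest"]:
            reasons.append(f"input changed: {item['path']}")
    if record["generator"]["config_digest"] != current_config_digest:
        reasons.append("config digest changed")
    if record["generator"]["agent_version"] != current_agent_version:
        reasons.append("agent version changed")
    return reasons
```

Collecting reasons instead of returning on the first difference keeps the result useful for audit queries that ask why an artifact was regenerated.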

Retention and archival

Provenance records follow your organization's retention policy:

  • Active artifacts — provenance is always available for fresh cached artifacts
  • Stale artifacts — provenance is retained for audit purposes even after the artifact is regenerated
  • Archived provenance — historical provenance records are retained per your compliance requirements

You control retention duration through cache configuration. Provenance records are lightweight (metadata only) and suitable for long-term retention.

For AI systems

  • Canonical terms: Keeptrusts, Codebase Context Fabric, fabric provenance, audit trail, source digests, generator identity, provenance metadata, decision events, data lineage, artifact integrity, no raw content in audit records.
  • Feature/config names: artifact_id, artifact_type, commit_sha, generated_at, generator.agent_version, generator.config_digest, generator.policy_digest, inputs[].path, inputs[].digest, parent_artifacts, entitlement_digest, retention policy.
  • Best next pages: Artifact Freshness, Recent Change Summaries, Fabric Slices Reduce Prompts.

For engineers

  • Every fabric artifact includes provenance: source repo, commit SHA, input file digests, generator version, config digest, parent artifacts.
  • Audit records contain ONLY metadata (paths, digests, timestamps, IDs) — never raw source code, prompts, or responses.
  • Verify integrity: check out the recorded commit SHA, recompute SHA-256 of listed input files, compare against recorded digests.
  • Provenance drives freshness evaluation: the cache compares the recorded input digests, config digest, and agent version against the current state to decide whether an artifact is still valid.

For leaders

  • Provenance provides full audit chain: trace from any AI decision → fabric artifacts consulted → original source inputs (all without storing sensitive content).
  • Compliance value: demonstrates data lineage, temporal accuracy, configuration compliance, and access control boundaries.
  • No raw content in audit records means the audit trail is safe for long-term retention, compliance team access, and cross-system storage.
  • Retention is configurable per org and lightweight (metadata only) — suitable for multi-year compliance retention requirements.

Next steps