Provenance Separation: Knowledge Base vs Fabric

Keeptrusts preserves distinct provenance for every piece of context that enters a prompt, whether it originates from the Knowledge Base or the Codebase Context Fabric. This separation ensures that audit trails can trace which organizational knowledge and which codebase context influenced any given response.

Use this page when

  • You need to understand how Keeptrusts separates provenance between Knowledge Base and Fabric sources in cached responses.
  • You are configuring audit trails that must attribute each piece of context to its origin system.
  • You want to verify that provenance metadata is not blended or lost across cache fill and retrieval.

Primary audience

  • Primary: Technical Engineers
  • Secondary: AI Agents, Technical Leaders

The Provenance Envelope

Every context chunk selected during prompt assembly carries a provenance envelope. This envelope travels with the chunk through the entire request lifecycle — from selection through response generation and into the citation record.

The base provenance envelope contains:

{
  "source": "knowledge_base | codebase_context_fabric",
  "source_id": "unique-identifier",
  "version": "version-or-timestamp",
  "retrieved_at": "2026-04-30T14:22:00Z",
  "relevance_score": 0.89,
  "token_count": 412
}
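A consumer of provenance data may want to check that an envelope has the base fields before storing or querying it. The following is a minimal sketch, assuming the envelope arrives as a parsed JSON object; the function name `validate_base_envelope` and the problem messages are hypothetical, and only the field names and the two `source` values come from this page.

```python
from datetime import datetime

# Field names follow the base envelope shown above; the types are inferred
# from the sample values and are an assumption.
BASE_FIELDS = {
    "source": str,
    "source_id": str,
    "version": str,
    "retrieved_at": str,
    "relevance_score": (int, float),
    "token_count": int,
}
KNOWN_SOURCES = {"knowledge_base", "codebase_context_fabric"}


def validate_base_envelope(envelope: dict) -> list:
    """Return a list of problems; an empty list means the base fields look well-formed."""
    problems = []
    for name, expected in BASE_FIELDS.items():
        if name not in envelope:
            problems.append(f"missing field: {name}")
        elif not isinstance(envelope[name], expected):
            problems.append(f"wrong type for {name}")
    if envelope.get("source") not in KNOWN_SOURCES:
        problems.append("unknown source")
    retrieved_at = envelope.get("retrieved_at")
    if isinstance(retrieved_at, str):
        try:
            # Replace the trailing "Z" so older Python versions can parse it.
            datetime.fromisoformat(retrieved_at.replace("Z", "+00:00"))
        except ValueError:
            problems.append("retrieved_at is not ISO 8601")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller log every defect in a single pass.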

Knowledge Base Provenance

KB provenance extends the base envelope with fields specific to the asset lifecycle:

| Field | Type | Description |
| --- | --- | --- |
| asset_id | string (UUID) | Unique identifier of the KB asset |
| asset_version | integer | Version number of the asset at retrieval time |
| promotion_status | string | Current lifecycle status: draft, active, or archived |
| binding_id | string (UUID) | Identifier of the binding that made this asset available |
| asset_title | string | Human-readable title of the asset |
| promoted_at | string (ISO 8601) | When this version was promoted to active |
| promoted_by | string | User or system that promoted the version |

A full KB provenance envelope looks like:

{
  "source": "knowledge_base",
  "source_id": "asset-uuid-here",
  "version": "3",
  "retrieved_at": "2026-04-30T14:22:00Z",
  "relevance_score": 0.92,
  "token_count": 856,
  "asset_id": "asset-uuid-here",
  "asset_version": 3,
  "promotion_status": "active",
  "binding_id": "binding-uuid-here",
  "asset_title": "Session Management Policy v3",
  "promoted_at": "2026-04-15T09:00:00Z",
  "promoted_by": "admin@example.com"
}

Codebase Context Fabric Provenance

Fabric provenance extends the base envelope with fields specific to indexed code:

| Field | Type | Description |
| --- | --- | --- |
| workspace_id | string (UUID) | Identifier of the indexed workspace |
| cache_key | string | Cache key for the fabric index entry |
| file_path | string | Relative file path within the workspace |
| symbol | string | Code symbol (function, class, module) if applicable |
| last_indexed_at | string (ISO 8601) | When the fabric index last processed this file |
| commit_sha | string | Git commit SHA at indexing time (if available) |
| language | string | Programming language of the source file |

A full fabric provenance envelope looks like:

{
  "source": "codebase_context_fabric",
  "source_id": "workspace-uuid:src/auth/session.ts:validateSession",
  "version": "2026-04-30T13:45:00Z",
  "retrieved_at": "2026-04-30T14:22:00Z",
  "relevance_score": 0.88,
  "token_count": 234,
  "workspace_id": "workspace-uuid-here",
  "cache_key": "ws-uuid:src/auth/session.ts:1714480200",
  "file_path": "src/auth/session.ts",
  "symbol": "validateSession",
  "last_indexed_at": "2026-04-30T13:45:00Z",
  "commit_sha": "a1b2c3d4e5f6",
  "language": "typescript"
}
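Because the two envelope types extend the base with disjoint field sets, one way to verify that provenance has not been blended is to look for extension fields that belong to the other source. The sketch below assumes envelopes are plain dictionaries; `blended_fields` is a hypothetical helper, not a Keeptrusts API, and the field sets are copied from the two tables on this page.

```python
# Extension fields listed in the KB and fabric tables on this page.
KB_FIELDS = {
    "asset_id", "asset_version", "promotion_status", "binding_id",
    "asset_title", "promoted_at", "promoted_by",
}
FABRIC_FIELDS = {
    "workspace_id", "cache_key", "file_path", "symbol",
    "last_indexed_at", "commit_sha", "language",
}


def blended_fields(envelope: dict) -> set:
    """Return extension fields that belong to the *other* source type."""
    source = envelope.get("source")
    if source == "knowledge_base":
        foreign = FABRIC_FIELDS
    elif source == "codebase_context_fabric":
        foreign = KB_FIELDS
    else:
        raise ValueError(f"unknown source: {source!r}")
    # intersection() accepts any iterable, so the dict's keys work directly.
    return foreign.intersection(envelope)
```

An empty result means the envelope carries only fields appropriate to its declared `source`.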

Provenance in Prompt Assembly

During prompt assembly, Keeptrusts inserts provenance markers around each context section. These markers are invisible to the language model but are tracked internally for attribution:

[CONTEXT_START source=knowledge_base asset_id=uuid version=3]
...KB content here...
[CONTEXT_END]

[CONTEXT_START source=codebase_context_fabric workspace_id=uuid file_path=src/auth/session.ts]
...fabric content here...
[CONTEXT_END]

The prompt assembler maintains a mapping table that associates each token range in the assembled prompt with its provenance envelope. This table is used after response generation to compute citations.
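The internal layout of the mapping table is not documented here. As an illustration only, the sketch below models it as a sorted list of half-open token ranges with a binary-search lookup; the class name `ProvenanceMap` and its methods are hypothetical.

```python
from bisect import bisect_right


class ProvenanceMap:
    """Maps half-open token ranges [start, end) to provenance envelopes."""

    def __init__(self):
        self._starts = []   # range start offsets, kept in ascending order
        self._entries = []  # (start, end, envelope) tuples in the same order

    def add(self, start: int, end: int, envelope: dict) -> None:
        # Assumption: chunks are registered in prompt order and never overlap.
        if start >= end:
            raise ValueError("range must be non-empty")
        if self._entries and start < self._entries[-1][1]:
            raise ValueError("ranges must be added in order without overlap")
        self._starts.append(start)
        self._entries.append((start, end, envelope))

    def lookup(self, token_index: int):
        """Return the envelope covering token_index, or None if uncovered."""
        i = bisect_right(self._starts, token_index) - 1
        if i < 0:
            return None
        _, end, envelope = self._entries[i]
        return envelope if token_index < end else None
```

Tokens that fall outside every registered range, such as system instructions or the user's own message, resolve to `None` in this sketch, which keeps them from being attributed to either source.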

Provenance in Response Attribution

After the language model generates a response, the attribution engine analyzes which context chunks most likely influenced each span of the output. The engine produces citation records that link response text back to the originating source with the full provenance chain.

Each citation preserves:

  • The original provenance envelope of the source chunk.
  • The text span in the response that the citation supports.
  • A confidence score indicating how strongly the response text correlates with the source chunk.

This means you can trace any part of a response back to either a specific KB asset version or a specific file and symbol in a codebase.
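To make the shape of a citation concrete, here is a minimal sketch of a record holding the three items listed above, plus a helper that renders the origin differently for each source. The `Citation` class, its attribute names, and the output format of `describe_origin` are assumptions for illustration; the page does not specify the stored schema.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    envelope: dict     # original provenance envelope of the source chunk
    span_start: int    # start offset of the supported response text
    span_end: int      # end offset (exclusive)
    confidence: float  # correlation between response text and source chunk


def describe_origin(citation: Citation) -> str:
    """Render a short origin label from the envelope's source-specific fields."""
    env = citation.envelope
    if env["source"] == "knowledge_base":
        return f"KB asset {env['asset_id']} v{env['asset_version']}"
    if env["source"] == "codebase_context_fabric":
        symbol = env.get("symbol")
        suffix = f"#{symbol}" if symbol else ""
        return f"{env['workspace_id']}:{env['file_path']}{suffix}"
    raise ValueError(f"unknown source: {env['source']!r}")
```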

Audit Trail Implications

Provenance separation enables several audit capabilities:

Compliance Auditing

You can query the event log to answer questions like:

  • "Which responses used KB asset X version 2 before it was updated to version 3?"
  • "How many responses referenced code from workspace Y in the last 30 days?"
  • "Did any response combine a draft KB asset with production code context?"
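The third question above can be sketched as a filter over exported events. This assumes each event is a dictionary with a `response_id` and a list of `envelopes`; that event shape and both function names are hypothetical, since the event log query interface is not described on this page. The envelope has no field marking code as "production", so the sketch treats any fabric envelope as code context.

```python
def mixes_draft_kb_with_code(envelopes: list) -> bool:
    """True if the envelopes include both a draft KB asset and fabric context."""
    has_draft_kb = any(
        e.get("source") == "knowledge_base"
        and e.get("promotion_status") == "draft"
        for e in envelopes
    )
    has_code = any(e.get("source") == "codebase_context_fabric" for e in envelopes)
    return has_draft_kb and has_code


def flagged_response_ids(events: list) -> list:
    """Return response IDs whose context combined a draft KB asset with code."""
    return [
        ev["response_id"]
        for ev in events
        if mixes_draft_kb_with_code(ev["envelopes"])
    ]
```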

Version Tracking

When a KB asset is updated and a new version is promoted, you can identify all prior responses that used the old version. This supports impact analysis when policies or standards change.
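An impact analysis of this kind reduces to matching on `asset_id` and `asset_version`. The sketch below again assumes a hypothetical exported event shape (`response_id` plus a list of `envelopes`); `responses_using_version` is an illustrative name, not a Keeptrusts function.

```python
def responses_using_version(events: list, asset_id: str, asset_version: int) -> list:
    """Return IDs of responses whose context included the given KB asset version."""
    matches = []
    for ev in events:
        for env in ev["envelopes"]:
            if (
                env.get("source") == "knowledge_base"
                and env.get("asset_id") == asset_id
                and env.get("asset_version") == asset_version
            ):
                matches.append(ev["response_id"])
                break  # count each response once, even with several matching chunks
    return matches
```

The comparison uses the integer `asset_version` field rather than the string `version` field of the base envelope, which avoids mixing `"2"` and `2`.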

Staleness Detection

Fabric provenance includes last_indexed_at, letting you identify responses where the code context may have been stale relative to the actual codebase state at response time.
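One simple staleness signal is the gap between `last_indexed_at` and `retrieved_at`. The sketch below flags envelopes whose index was older than a threshold at retrieval time; the one-hour default and the function names are assumptions, and index age alone does not prove the file changed (comparing `commit_sha` against the repository would be a stronger check).

```python
from datetime import datetime, timedelta


def parse_ts(value: str) -> datetime:
    # Replace the trailing "Z" so older Python versions can parse it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def index_age(envelope: dict) -> timedelta:
    """Time between the last index run and the moment the chunk was retrieved."""
    return parse_ts(envelope["retrieved_at"]) - parse_ts(envelope["last_indexed_at"])


def is_possibly_stale(envelope: dict, max_age: timedelta = timedelta(hours=1)) -> bool:
    """True if the index was older than max_age when the chunk was retrieved."""
    return index_age(envelope) > max_age
```

For the fabric envelope shown earlier, the index was 37 minutes old at retrieval, so it passes a one-hour threshold but would be flagged under a 30-minute one.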

Configuration

Provenance tracking is enabled by default. You can control the level of detail stored in the event log:

provenance:
  store_full_envelope: true
  include_in_response_metadata: true
  redact_file_paths: false
  retention_days: 90

| Field | Default | Description |
| --- | --- | --- |
| store_full_envelope | true | Store complete provenance envelopes in the event log |
| include_in_response_metadata | true | Include provenance data in API response metadata |
| redact_file_paths | false | Replace file paths with hashes in stored provenance |
| retention_days | 90 | How long to retain provenance records |
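To illustrate what `redact_file_paths` implies for a stored record, the sketch below replaces `file_path` with a hash. The hash algorithm (SHA-256), the `sha256:` prefix, and the function name are assumptions; the page does not say how Keeptrusts computes the hash. Note that in the fabric example above, `source_id` and `cache_key` also embed the path, and this sketch leaves them untouched.

```python
import hashlib


def redact_file_path(envelope: dict) -> dict:
    """Return a copy of the envelope with file_path replaced by a SHA-256 digest."""
    redacted = dict(envelope)  # shallow copy; the input envelope is not modified
    path = redacted.get("file_path")
    if path is not None:
        digest = hashlib.sha256(path.encode("utf-8")).hexdigest()
        redacted["file_path"] = f"sha256:{digest}"
    return redacted
```

Hashing is deterministic, so two records for the same file still share a value and can be grouped in audit queries without exposing the path itself.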

Next steps

For AI systems

  • Canonical terms: Keeptrusts, provenance separation, provenance envelope, knowledge_base source, codebase_context_fabric source, audit trail, context markers.
  • Exact feature/config names: provenance.store_full_envelope, provenance.include_in_response_metadata, provenance.redact_file_paths, provenance.retention_days, [CONTEXT_START]/[CONTEXT_END] markers, KB fields (asset_id, asset_version, promotion_status, promoted_by), fabric fields (workspace_id, cache_key, file_path, symbol, commit_sha).
  • Best next pages: Citation and Provenance Display, Cache Keys with Mixed Context.

For engineers

  • Every context chunk carries a provenance envelope with source, source_id, version, retrieved_at, relevance_score, and token_count.
  • KB envelopes extend with: asset_id, asset_version, promotion_status, binding_id, asset_title, promoted_at, promoted_by.
  • Fabric envelopes extend with: workspace_id, cache_key, file_path, symbol, last_indexed_at, commit_sha, language.
  • Provenance is enabled by default; configure store_full_envelope, include_in_response_metadata, redact_file_paths, and retention_days in YAML.
  • Use provenance for impact analysis: when a KB policy changes, identify all prior responses that used the old version.

For leaders

  • Provenance separation provides a complete audit trail: trace any AI response back to the exact policy documents and code files that informed it.
  • Supports compliance requirements: prove which version of a policy was active when a response was generated.
  • Staleness detection via fabric last_indexed_at identifies responses where code context may not have reflected the actual codebase state.
  • Configurable retention (default 90 days) balances compliance needs with storage costs.