
Advanced Chat Patterns

As your organization's AI chat usage matures, you encounter scenarios that require more sophisticated governance patterns. This guide covers advanced techniques for managing complex conversations, chain-of-thought reasoning, context windows, and conversation branching within the Keeptrusts policy framework.

Use this page when

  • You need to govern chain-of-thought reasoning steps produced by LLMs in your chat sessions.
  • You are designing multi-turn conversation policies that accumulate context across turns.
  • You want to manage context window growth, summarize prior turns, or enforce turn limits.
  • You are implementing conversation branching for exploratory analysis workflows.

Primary audience

  • Primary: AI Engineers building multi-turn chat applications, Platform Engineers configuring advanced policies
  • Secondary: Technical Leaders evaluating governance complexity for mature chat deployments

Chain-of-Thought Governance

Chain-of-thought (CoT) prompting encourages the LLM to show its reasoning step by step. In a governed environment, each reasoning step is subject to policy evaluation.

How CoT Interacts with Policies

When the LLM produces chain-of-thought reasoning:

  1. The full output — including intermediate reasoning steps — passes through output policies.
  2. Policies evaluate both the reasoning chain and the final answer.
  3. If an intermediate step contains policy-violating content, the entire response may be blocked or redacted.

Configuring CoT-Aware Policies

To support chain-of-thought while maintaining governance:

pack:
  name: advanced-chat-patterns-example-1
  version: 1.0.0
  enabled: true
  policies:
    chain:
      - output
    policy:
      output: {}

This configuration allows policies to redact sensitive content from reasoning steps while preserving the final answer, provided the answer itself passes policy evaluation.

Best Practices for CoT Under Governance

| Practice | Rationale |
| --- | --- |
| Use structured CoT prompts | Separates reasoning from conclusions for targeted policy evaluation |
| Request reasoning in a specific format | Makes policy pattern matching more reliable |
| Test CoT with your active policies | Ensures reasoning steps do not trigger false positives |
| Monitor token usage | CoT responses use significantly more tokens |

Example: Structured CoT Prompt

Analyze the following contract clause for compliance risks.

## Reasoning
Walk through your analysis step by step, identifying specific
risk factors and citing relevant regulations.

## Conclusion
Provide a risk rating (Low/Medium/High) and a one-paragraph summary.

Structuring the output this way allows policies to evaluate the reasoning and conclusion sections independently.
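A minimal sketch of how a response in this format could be split before evaluation. The section names mirror the prompt above, but the parsing function itself is illustrative, not a Keeptrusts API:

```python
import re

def split_sections(output: str) -> dict:
    """Split a structured CoT response into its markdown sections.

    Assumes the model followed the prompt's "## Reasoning" /
    "## Conclusion" headings; falls back to treating the whole
    output as the conclusion when the format is absent.
    """
    sections = {}
    matches = list(re.finditer(r"^## (\w+)\s*$", output, re.MULTILINE))
    if not matches:
        return {"conclusion": output.strip()}
    for i, m in enumerate(matches):
        start = m.end()
        end = matches[i + 1].start() if i + 1 < len(matches) else len(output)
        sections[m.group(1).lower()] = output[start:end].strip()
    return sections

response = """## Reasoning
Clause 4.2 lacks a data-retention limit, which conflicts with GDPR Art. 5.

## Conclusion
High risk: missing retention limits and audit provisions."""

parts = split_sections(response)
# parts["reasoning"] and parts["conclusion"] can now be routed to
# separate policy evaluations.
```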

Multi-Turn Conversation Policies

Multi-turn policies evaluate the full conversation context, not just individual messages. This enables governance patterns that account for conversational dynamics.

Conversation-Level Policy Types

| Policy Type | Description |
| --- | --- |
| Topic drift detection | Flags when conversation strays from the initial topic into restricted areas |
| Cumulative PII detection | Detects PII spread across multiple messages that individually seem benign |
| Escalation patterns | Triggers escalation when repeated policy-adjacent prompts suggest probing |
| Context poisoning detection | Identifies attempts to gradually shift the assistant's behavior |
| Turn-count limits | Restricts conversation length to manage cost and context quality |

Configuring Multi-Turn Policies

pack:
  name: advanced-chat-patterns-example-2
  version: 1.0.0
  enabled: true
  policies:
    chain:
      - conversation
    policy:
      conversation: {}

Topic Drift Detection

Topic drift policies monitor the semantic similarity between the conversation's initial topic and subsequent messages:

  1. The first few messages establish the topic baseline.
  2. Each subsequent message is compared against the baseline.
  3. If similarity drops below the threshold, the policy intervenes.

This prevents users from starting an approved conversation and gradually steering it toward restricted content.
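The three steps above can be sketched as follows. This is an illustration only: it uses bag-of-words cosine similarity as a self-contained stand-in for the semantic embeddings a real implementation would use, and the threshold value is an arbitrary assumption:

```python
from collections import Counter
from math import sqrt

def bag(text: str) -> Counter:
    """Toy 'embedding': word-count vector of the message."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class TopicDriftMonitor:
    def __init__(self, threshold: float = 0.2, baseline_turns: int = 3):
        self.threshold = threshold
        self.baseline_turns = baseline_turns
        self.baseline = Counter()
        self.turns = 0

    def check(self, message: str) -> bool:
        """Return True while the message stays on topic."""
        self.turns += 1
        if self.turns <= self.baseline_turns:
            self.baseline += bag(message)   # still building the baseline
            return True
        return cosine(self.baseline, bag(message)) >= self.threshold
```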

Cumulative PII Detection

Individual messages may not contain PII, but across multiple turns, a user might reveal:

  • A name in turn 1
  • An email address in turn 3
  • A phone number in turn 5

Conversation-scoped PII detection aggregates entities across turns and triggers when the cumulative PII exceeds the configured threshold.
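The aggregation logic can be sketched as a detector that accumulates entities across turns. The two regexes here are illustrative only; production PII detection would use a dedicated entity-recognition model rather than pattern matching:

```python
import re

# Illustrative patterns only, standing in for a real PII detector.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

class CumulativePIIDetector:
    def __init__(self, max_entities: int = 2):
        self.max_entities = max_entities
        self.found: set[tuple[str, str]] = set()

    def scan(self, message: str) -> bool:
        """Accumulate PII across turns; True once the cumulative
        entity count exceeds the configured threshold."""
        for kind, pattern in PII_PATTERNS.items():
            for match in pattern.findall(message):
                self.found.add((kind, match))
        return len(self.found) > self.max_entities
```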

Context Window Management

LLMs have finite context windows. As conversations grow, managing what stays in context becomes critical for both quality and cost.

The Context Window Challenge

Each message in a multi-turn conversation consumes context space:

| Component | Token Impact |
| --- | --- |
| System prompt | Fixed overhead per message |
| Knowledge assets | Variable, depends on recall |
| Conversation history | Grows with each turn |
| Current prompt | Variable per message |
| Reserved for response | Set by max_tokens parameter |

When the total exceeds the model's context window, older messages must be truncated or summarized.
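The budget arithmetic behind that decision can be sketched as follows. This is illustrative only; in practice the per-component token counts would come from the model's tokenizer:

```python
def context_budget(model_window: int, system_tokens: int,
                   knowledge_tokens: int, history_tokens: list[int],
                   prompt_tokens: int, max_tokens: int) -> int:
    """Return how many of the oldest history turns must be dropped
    (or summarized) so the request fits the model's context window."""
    # Everything except history is fixed for this request.
    fixed = system_tokens + knowledge_tokens + prompt_tokens + max_tokens
    budget = model_window - fixed
    used = sum(history_tokens)
    dropped = 0
    # Drop oldest turns until the remaining history fits the budget.
    while dropped < len(history_tokens) and used > budget:
        used -= history_tokens[dropped]
        dropped += 1
    return dropped
```

For example, with an 8,192-token window, 3,300 tokens of fixed overhead, and 5,500 tokens of history, dropping the single oldest turn brings the request back under budget.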

Context Management Strategies

Sliding Window

Keep only the most recent N turns in context:

chat:
  context:
    strategy: sliding_window
    window_size: 20

Pros: Simple, predictable token usage. Cons: Loses early conversation context.
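In application code the same strategy is a single slice (a sketch; the dict-based turn format is an assumption):

```python
def sliding_window(history: list[dict], window_size: int = 20) -> list[dict]:
    """Keep only the most recent window_size turns."""
    return history[-window_size:]
```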

Summarization

Periodically summarize older turns into a condensed representation:

chat:
  context:
    strategy: summarize
    summarize_after: 10
    summary_model: gpt-4o-mini

Pros: Preserves key information from earlier turns. Cons: Adds latency and cost for the summarization step.
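A sketch of the mechanics, with `summarizer` standing in for the call to the cheap summary model configured above (any callable that maps older turns to a string):

```python
def compress_history(history: list[dict], summarize_after: int,
                     summarizer) -> list[dict]:
    """Replace turns older than the last summarize_after with a
    single summary turn; recent turns are kept verbatim."""
    if len(history) <= summarize_after:
        return history
    old, recent = history[:-summarize_after], history[-summarize_after:]
    summary_turn = {"role": "system",
                    "content": "Summary of earlier turns: " + summarizer(old)}
    return [summary_turn] + recent
```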

Selective Retention

Keep specific high-value turns (e.g., initial instructions, key decisions) while dropping routine exchanges:

chat:
  context:
    strategy: selective
    always_retain:
      - first_turn
      - pinned_turns
    drop_after: 15

Pros: Retains the most important context. Cons: Requires configuration to identify high-value turns.
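A sketch of the selective strategy, mirroring the `first_turn` and `pinned_turns` retention rules above. The `pinned` flag on a turn is an assumption about how the application marks high-value turns:

```python
def selective_retention(history: list[dict], drop_after: int = 15) -> list[dict]:
    """Always keep the first turn and any pinned turn; drop other
    turns that have aged past the drop_after window."""
    if len(history) <= drop_after:
        return history
    recent = history[-drop_after:]
    older = history[:-drop_after]
    kept = [t for i, t in enumerate(older) if i == 0 or t.get("pinned")]
    return kept + recent
```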

Monitoring Context Usage

Track context window utilization in the console:

  1. Navigate to Events and select a chat event.
  2. Review the context_tokens field to see how much context was used.
  3. Compare against the model's maximum context window.
  4. Set alerts for conversations approaching context limits.
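The alert condition in step 4 reduces to a simple utilization check. The field name mirrors the console's `context_tokens` field; the 80% threshold is an assumption, not a product default:

```python
def context_alert(context_tokens: int, model_window: int,
                  threshold: float = 0.8) -> bool:
    """Flag conversations approaching the model's context limit."""
    return context_tokens / model_window >= threshold
```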

Conversation Branching

Conversation branching allows users to explore alternative paths from a specific point in a conversation without losing the original thread.

How Branching Works

  1. A user reaches a point in the conversation where they want to explore an alternative approach.
  2. They create a branch from that message.
  3. The branch starts a new conversation thread that shares history up to the branch point.
  4. The original conversation remains unchanged.
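The four steps above can be sketched with a minimal conversation object. Real implementations would persist conversations and record the parent reference in decision events; this only illustrates the shared-history semantics:

```python
import copy

class Conversation:
    def __init__(self, history=None, parent=None, branch_point=None):
        self.history = history or []
        self.parent = parent
        self.branch_point = branch_point

    def add(self, role: str, content: str):
        self.history.append({"role": role, "content": content})

    def branch(self, at_index: int) -> "Conversation":
        """New thread sharing history up to and including at_index;
        the original conversation is left unchanged."""
        shared = copy.deepcopy(self.history[: at_index + 1])
        return Conversation(history=shared, parent=self, branch_point=at_index)
```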

Use Cases for Branching

| Scenario | Benefit |
| --- | --- |
| Exploring different analysis approaches | Compare outcomes without losing the original |
| Testing policy boundaries | See how different phrasings affect governance |
| A/B testing prompts | Evaluate which prompt produces better results |
| Iterative refinement | Branch to try variations while keeping the best path |

Creating a Branch

In the Chat Workbench:

  1. Hover over the message you want to branch from.
  2. Click the Branch icon.
  3. A new conversation tab opens with the shared history.
  4. Continue the conversation in the new branch.

Branch Governance

Branches inherit the governance context of the parent conversation:

  • Policy evaluations consider the full history up to the branch point.
  • Each branch is an independent conversation from the governance perspective after the branch point.
  • Decision events for branches include a reference to the parent conversation.

Combining Advanced Patterns

These patterns work together for sophisticated use cases:

Governed Research Workflow

  1. Start a conversation with a broad research question.
  2. Use chain-of-thought prompting for detailed analysis.
  3. Branch the conversation to explore alternative hypotheses.
  4. Multi-turn policies prevent topic drift into restricted areas.
  5. Context management keeps the conversation focused as it grows.
  6. Export the full conversation tree for documentation.

Compliance Review Workflow

  1. Begin with a structured compliance question referencing bound knowledge assets.
  2. The system prompt enforces formal output formatting.
  3. Multi-turn policies detect cumulative PII across review turns.
  4. Branch to explore different regulatory interpretations.
  5. Export branches and their policy evaluations as compliance evidence.

Performance Considerations

Advanced patterns add overhead:

| Pattern | Overhead | Mitigation |
| --- | --- | --- |
| CoT governance | Longer outputs = more policy evaluation time | Use structured formats for efficient evaluation |
| Multi-turn policies | Full conversation re-evaluation per turn | Cache policy results for unchanged turns |
| Context summarization | Extra LLM call for summarization | Use a fast, cheap model for summaries |
| Branching | Multiple conversation histories to maintain | Set branch limits per conversation |
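The caching mitigation for multi-turn policies can be sketched as a per-turn result cache keyed by a content hash, so unchanged turns are not re-evaluated. Here `evaluate` stands in for the real policy engine:

```python
import hashlib

class TurnPolicyCache:
    def __init__(self, evaluate):
        self.evaluate = evaluate          # callable: turn content -> bool
        self.cache: dict[str, bool] = {}

    def check_turn(self, content: str) -> bool:
        """Evaluate a turn once; reuse the cached verdict on replays."""
        key = hashlib.sha256(content.encode()).hexdigest()
        if key not in self.cache:
            self.cache[key] = self.evaluate(content)
        return self.cache[key]
```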

Best Practices

| Practice | Why It Matters |
| --- | --- |
| Start with simple patterns and add complexity | Avoid over-engineering governance |
| Monitor token costs for advanced patterns | CoT and branching increase costs significantly |
| Test multi-turn policies with realistic conversations | Synthetic tests miss real-world conversation dynamics |
| Set conversation turn limits | Prevents unbounded context growth and cost |
| Document branching decisions | Creates an audit trail for exploratory analysis |
| Review context management strategy quarterly | Adjust as models offer larger context windows |

Next steps

For engineers

  • Configure evaluate_reasoning_steps: true in output policy YAML to enable per-step CoT evaluation.
  • Test multi-turn policies with realistic 10+ turn conversations — synthetic 2-turn tests miss accumulation effects.
  • Set explicit max_turns in gateway config to prevent unbounded context growth.
  • Monitor token usage when enabling CoT — expect 3-5x token consumption compared to direct answers.
  • Validate context summarization by comparing gateway decision events before and after summarization triggers.

For leaders

  • Chain-of-thought and branching patterns significantly increase token costs (3-5x per interaction).
  • Multi-turn policies require ongoing tuning as conversation patterns evolve — plan for quarterly policy reviews.
  • Conversation branching creates audit trail complexity; define retention and export policies before enabling.
  • Advanced patterns increase policy evaluation latency — factor this into SLA commitments for interactive chat.