Advanced Chat Patterns
As your organization's AI chat usage matures, you encounter scenarios that require more sophisticated governance patterns. This guide covers advanced techniques for managing complex conversations, chain-of-thought reasoning, context windows, and conversation branching within the Keeptrusts policy framework.
Use this page when
- You need to govern chain-of-thought reasoning steps produced by LLMs in your chat sessions.
- You are designing multi-turn conversation policies that accumulate context across turns.
- You want to manage context window growth, summarize prior turns, or enforce turn limits.
- You are implementing conversation branching for exploratory analysis workflows.
Primary audience
- Primary: AI Engineers building multi-turn chat applications, Platform Engineers configuring advanced policies
- Secondary: Technical Leaders evaluating governance complexity for mature chat deployments
Chain-of-Thought Governance
Chain-of-thought (CoT) prompting encourages the LLM to show its reasoning step by step. In a governed environment, each reasoning step is subject to policy evaluation.
How CoT Interacts with Policies
When the LLM produces chain-of-thought reasoning:
- The full output — including intermediate reasoning steps — passes through output policies.
- Policies evaluate both the reasoning chain and the final answer.
- If an intermediate step contains policy-violating content, the entire response may be blocked or redacted.
Configuring CoT-Aware Policies
To support chain-of-thought while maintaining governance:
```yaml
pack:
  name: advanced-chat-patterns-example-1
  version: 1.0.0
  enabled: true
policies:
  chain:
    - output
  policy:
    output: {}
```
This configuration allows policies to redact sensitive content from reasoning steps while preserving the final answer, provided the answer itself passes policy evaluation.
Best Practices for CoT Under Governance
| Practice | Rationale |
|---|---|
| Use structured CoT prompts | Separates reasoning from conclusions for targeted policy evaluation |
| Request reasoning in a specific format | Makes policy pattern matching more reliable |
| Test CoT with your active policies | Ensures reasoning steps do not trigger false positives |
| Monitor token usage | CoT responses use significantly more tokens |
Example: Structured CoT Prompt
```
Analyze the following contract clause for compliance risks.

## Reasoning
Walk through your analysis step by step, identifying specific
risk factors and citing relevant regulations.

## Conclusion
Provide a risk rating (Low/Medium/High) and a one-paragraph summary.
```
Structuring the output this way allows policies to evaluate the reasoning and conclusion sections independently.
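One way to make that independent evaluation concrete is to split the model's output on its section headings before passing each part to policy evaluation. The sketch below is an assumption about how a pre-processing step might look, not the platform's actual implementation; `split_cot_sections` is a hypothetical helper.

```python
import re

def split_cot_sections(output: str) -> dict:
    """Split a structured CoT response into its markdown sections.

    Assumes the prompt requested '## Reasoning' and '## Conclusion'
    headings, as in the example above. Returns a mapping from section
    name to section body so each part can be evaluated independently.
    """
    sections = {}
    # Capture every '## <name>' heading and the text up to the next heading.
    for match in re.finditer(r"^## (\w+)\n(.*?)(?=^## |\Z)",
                             output, re.S | re.M):
        sections[match.group(1).lower()] = match.group(2).strip()
    return sections

response = """## Reasoning
Step 1: the clause lacks a data-retention limit.
Step 2: this conflicts with the retention policy.

## Conclusion
Risk rating: High. The clause should specify a retention period."""

parts = split_cot_sections(response)
```

With the sections separated, a redaction policy can run over `parts["reasoning"]` without touching `parts["conclusion"]`, and vice versa.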
Multi-Turn Conversation Policies
Multi-turn policies evaluate the full conversation context, not just individual messages. This enables governance patterns that account for conversational dynamics.
Conversation-Level Policy Types
| Policy Type | Description |
|---|---|
| Topic drift detection | Flags when conversation strays from the initial topic into restricted areas |
| Cumulative PII detection | Detects PII spread across multiple messages that individually seem benign |
| Escalation patterns | Triggers escalation when repeated policy-adjacent prompts suggest probing |
| Context poisoning detection | Identifies attempts to gradually shift the assistant's behavior |
| Turn-count limits | Restricts conversation length to manage cost and context quality |
Configuring Multi-Turn Policies
```yaml
pack:
  name: advanced-chat-patterns-example-2
  version: 1.0.0
  enabled: true
policies:
  chain:
    - conversation
  policy:
    conversation: {}
```
Topic Drift Detection
Topic drift policies monitor the semantic similarity between the conversation's initial topic and subsequent messages:
- The first few messages establish the topic baseline.
- Each subsequent message is compared against the baseline.
- If similarity drops below the threshold, the policy intervenes.
This prevents users from starting an approved conversation and gradually steering it toward restricted content.
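The baseline-comparison loop can be sketched in a few lines. Production drift detectors compare embedding vectors; here a simple word-overlap (Jaccard) similarity stands in for that, and `check_drift`, `baseline_turns`, and the `0.1` threshold are illustrative assumptions rather than platform settings.

```python
def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity; a cheap stand-in for embedding cosine similarity."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def check_drift(messages: list[str], baseline_turns: int = 2,
                threshold: float = 0.1) -> list[int]:
    """Return the indexes of messages whose similarity to the topic
    baseline (the first `baseline_turns` messages) falls below the
    threshold -- the point where a drift policy would intervene."""
    baseline = " ".join(messages[:baseline_turns])
    return [i for i, msg in enumerate(messages[baseline_turns:],
                                      start=baseline_turns)
            if jaccard(baseline, msg) < threshold]

conversation = [
    "Analyze this vendor contract for compliance risks",
    "Focus on the data retention clause of the contract",
    "What are the contract termination compliance terms",
    "Tell me a joke about penguins",
]
drifted = check_drift(conversation)  # flags the off-topic turn
```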
Cumulative PII Detection
Individual messages may not contain PII, but across multiple turns, a user might reveal:
- A name in turn 1
- An email address in turn 3
- A phone number in turn 5
Conversation-scoped PII detection aggregates entities across turns and triggers when the cumulative PII exceeds the configured threshold.
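The aggregation logic itself is small. In this sketch, `cumulative_pii_trigger` and the threshold of three distinct entity types are assumptions for illustration; the per-turn entity lists stand in for whatever a turn-level PII detector produces.

```python
PII_THRESHOLD = 3  # hypothetical setting: intervene at 3 distinct PII types

def cumulative_pii_trigger(detections_per_turn, threshold=PII_THRESHOLD):
    """Aggregate PII entity types across turns and return the 1-based
    turn number at which the cumulative count reaches the threshold,
    or None if it never does."""
    seen = set()
    for turn, entities in enumerate(detections_per_turn, start=1):
        seen.update(entities)
        if len(seen) >= threshold:
            return turn
    return None

# The scenario above: a name in turn 1, an email in turn 3, a phone in turn 5.
turns = [["NAME"], [], ["EMAIL"], [], ["PHONE"]]
triggered_at = cumulative_pii_trigger(turns)
```

No single turn trips the detector, but the policy fires on turn 5 once the third distinct entity type appears.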
Context Window Management
LLMs have finite context windows. As conversations grow, managing what stays in context becomes critical for both quality and cost.
The Context Window Challenge
Each message in a multi-turn conversation consumes context space:
| Component | Token Impact |
|---|---|
| System prompt | Fixed overhead per message |
| Knowledge assets | Variable, depends on recall |
| Conversation history | Grows with each turn |
| Current prompt | Variable per message |
| Reserved for response | Set by max_tokens parameter |
When the total exceeds the model's context window, older messages must be truncated or summarized.
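The budget check behind that decision is straightforward arithmetic over the components in the table. This helper and its parameter names are illustrative, not a platform API.

```python
def context_budget(model_window: int, system: int, knowledge: int,
                   history: int, prompt: int, max_tokens: int) -> int:
    """Return the remaining context headroom in tokens. A negative
    result means the conversation history must be truncated or
    summarized before the request can be sent."""
    return model_window - (system + knowledge + history + prompt + max_tokens)

# Hypothetical numbers for an 8,192-token model window:
headroom = context_budget(model_window=8_192, system=500, knowledge=2_000,
                          history=4_500, prompt=300, max_tokens=1_024)
```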
Context Management Strategies
Sliding Window
Keep only the most recent N turns in context:
```yaml
chat:
  context:
    strategy: sliding_window
    window_size: 20
```
Pros: Simple, predictable token usage. Cons: Loses early conversation context.
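The strategy amounts to a single list slice. A minimal sketch, assuming history is a list of turn records:

```python
def sliding_window(history, window_size=20):
    """Keep only the most recent `window_size` turns, mirroring the
    `window_size: 20` setting above. Older turns are simply dropped."""
    return history[-window_size:]

history = [{"turn": i} for i in range(30)]
kept = sliding_window(history)  # turns 0-9 fall out of context
```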
Summarization
Periodically summarize older turns into a condensed representation:
```yaml
chat:
  context:
    strategy: summarize
    summarize_after: 10
    summary_model: gpt-4o-mini
```
Pros: Preserves key information from earlier turns. Cons: Adds latency and cost for the summarization step.
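The shape of this strategy can be sketched as follows. In practice `summarize` would call the configured summary model; here it is any callable taking a list of turns and returning a string, and `summarized_context` is a hypothetical helper, not a platform API.

```python
def summarized_context(history, summarize, summarize_after=10):
    """Once the history exceeds `summarize_after` turns, fold the older
    turns into a single summary turn and keep the recent ones verbatim."""
    if len(history) <= summarize_after:
        return list(history)
    older, recent = history[:-summarize_after], history[-summarize_after:]
    return [{"role": "system", "content": summarize(older)}] + recent

history = [{"role": "user", "content": f"turn {i}"} for i in range(15)]
condensed = summarized_context(
    history, summarize=lambda turns: f"Summary of {len(turns)} earlier turns")
```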
Selective Retention
Keep specific high-value turns (e.g., initial instructions, key decisions) while dropping routine exchanges:
```yaml
chat:
  context:
    strategy: selective
    always_retain:
      - first_turn
      - pinned_turns
    drop_after: 15
```
Pros: Retains the most important context. Cons: Requires configuration to identify high-value turns.
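Selective retention combines a recency cutoff with an always-keep set. A minimal sketch, assuming `pinned` holds the indexes of pinned turns (mirroring `always_retain` above):

```python
def selective_retention(history, drop_after=15, pinned=()):
    """Retain the first turn and any pinned turn indexes unconditionally,
    and drop all other turns older than the most recent `drop_after`."""
    if len(history) <= drop_after:
        return list(history)
    cutoff = len(history) - drop_after
    return [turn for i, turn in enumerate(history)
            if i == 0 or i in pinned or i >= cutoff]

history = list(range(20))                      # stand-in turns 0..19
kept = selective_retention(history, pinned={3})
```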
Monitoring Context Usage
Track context window utilization in the console:
- Navigate to Events and select a chat event.
- Review the `context_tokens` field to see how much context was used.
- Compare against the model's maximum context window.
- Set alerts for conversations approaching context limits.
Conversation Branching
Conversation branching allows users to explore alternative paths from a specific point in a conversation without losing the original thread.
How Branching Works
- A user reaches a point in the conversation where they want to explore an alternative approach.
- They create a branch from that message.
- The branch starts a new conversation thread that shares history up to the branch point.
- The original conversation remains unchanged.
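The data model behind those steps can be sketched as two small operations: copy the shared history up to the branch point, and record the parent thread. This `Conversation` class is an illustrative assumption, not the platform's implementation.

```python
class Conversation:
    """Minimal sketch of branchable conversation threads. A branch copies
    the shared history up to the branch point and records its parent,
    matching the parent-conversation reference in branch decision events."""

    def __init__(self, messages=None, parent=None, branch_point=None):
        self.messages = list(messages or [])
        self.parent = parent
        self.branch_point = branch_point

    def add(self, message):
        self.messages.append(message)

    def branch(self, at_index):
        """Start a new thread sharing history up to and including
        `at_index`; the original thread is left untouched."""
        return Conversation(self.messages[:at_index + 1],
                            parent=self, branch_point=at_index)

original = Conversation()
for msg in ["q1", "a1", "q2", "a2"]:
    original.add(msg)
alt = original.branch(1)        # share history through "a1"
alt.add("q2-alternative")       # explore a different second question
```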
Use Cases for Branching
| Scenario | Benefit |
|---|---|
| Exploring different analysis approaches | Compare outcomes without losing the original |
| Testing policy boundaries | See how different phrasings affect governance |
| A/B testing prompts | Evaluate which prompt produces better results |
| Iterative refinement | Branch to try variations while keeping the best path |
Creating a Branch
In the Chat Workbench:
- Hover over the message you want to branch from.
- Click the Branch icon.
- A new conversation tab opens with the shared history.
- Continue the conversation in the new branch.
Branch Governance
Branches inherit the governance context of the parent conversation:
- Policy evaluations consider the full history up to the branch point.
- Each branch is an independent conversation from the governance perspective after the branch point.
- Decision events for branches include a reference to the parent conversation.
Combining Advanced Patterns
These patterns work together for sophisticated use cases:
Governed Research Workflow
- Start a conversation with a broad research question.
- Use chain-of-thought prompting for detailed analysis.
- Branch the conversation to explore alternative hypotheses.
- Multi-turn policies prevent topic drift into restricted areas.
- Context management keeps the conversation focused as it grows.
- Export the full conversation tree for documentation.
Compliance Review Workflow
- Begin with a structured compliance question referencing bound knowledge assets.
- The system prompt enforces formal output formatting.
- Multi-turn policies detect cumulative PII across review turns.
- Branch to explore different regulatory interpretations.
- Export branches and their policy evaluations as compliance evidence.
Performance Considerations
Advanced patterns add overhead:
| Pattern | Overhead | Mitigation |
|---|---|---|
| CoT governance | Longer outputs = more policy evaluation time | Use structured formats for efficient evaluation |
| Multi-turn policies | Full conversation re-evaluation per turn | Cache policy results for unchanged turns |
| Context summarization | Extra LLM call for summarization | Use a fast, cheap model for summaries |
| Branching | Multiple conversation histories to maintain | Set branch limits per conversation |
Best Practices
| Practice | Why It Matters |
|---|---|
| Start with simple patterns and add complexity | Avoid over-engineering governance |
| Monitor token costs for advanced patterns | CoT and branching increase costs significantly |
| Test multi-turn policies with realistic conversations | Synthetic tests miss real-world conversation dynamics |
| Set conversation turn limits | Prevents unbounded context growth and cost |
| Document branching decisions | Creates an audit trail for exploratory analysis |
| Review context management strategy quarterly | Adjust as models offer larger context windows |
Next steps
- Return to the basics in Getting Started with the Chat Workbench.
- Explore how these patterns interact with knowledge assets in Knowledge-Grounded Chat Conversations.
- Configure the parameters that control these patterns in Customizing the Chat Experience.
- Track the impact of advanced patterns on usage in Chat Analytics & Usage Insights.
For AI systems
- Canonical terms: chain-of-thought governance, multi-turn policies, context window management, conversation branching, CoT-aware redaction, turn-level evaluation.
- Config names: `evaluate_reasoning_steps`, `redact_intermediate`, `preserve_final_answer`, `max_turns`, `context_summarization`.
- Best next pages: Getting Started with the Chat Workbench, Customizing the Chat Experience, Chat Analytics.
For engineers
- Configure `evaluate_reasoning_steps: true` in output policy YAML to enable per-step CoT evaluation.
- Test multi-turn policies with realistic 10+ turn conversations — synthetic 2-turn tests miss accumulation effects.
- Set explicit `max_turns` in gateway config to prevent unbounded context growth.
- Monitor token usage when enabling CoT — expect 3-5x token consumption compared to direct answers.
- Validate context summarization by comparing gateway decision events before and after summarization triggers.
For leaders
- Chain-of-thought and branching patterns significantly increase token costs (3-5x per interaction).
- Multi-turn policies require ongoing tuning as conversation patterns evolve — plan for quarterly policy reviews.
- Conversation branching creates audit trail complexity; define retention and export policies before enabling.
- Advanced patterns increase policy evaluation latency — factor this into SLA commitments for interactive chat.