Advanced Chat Patterns
As your organization's AI chat usage matures, you encounter scenarios that require more sophisticated governance patterns. This guide covers advanced techniques for managing complex conversations, chain-of-thought reasoning, context windows, and conversation branching within the Keeptrusts policy framework.
Use this page when
- You need to govern chain-of-thought reasoning steps produced by LLMs in your chat sessions.
- You are designing multi-turn conversation policies that accumulate context across turns.
- You want to manage context window growth, summarize prior turns, or enforce turn limits.
- You are implementing conversation branching for exploratory analysis workflows.
Primary audience
- Primary: AI Engineers building multi-turn chat applications, Platform Engineers configuring advanced policies
- Secondary: Technical Leaders evaluating governance complexity for mature chat deployments
Chain-of-Thought Governance
Chain-of-thought (CoT) prompting encourages the LLM to show its reasoning step by step. In a governed environment, each reasoning step is subject to policy evaluation.
How CoT Interacts with Policies
When the LLM produces chain-of-thought reasoning:
- The full output — including intermediate reasoning steps — passes through output policies.
- Policies evaluate both the reasoning chain and the final answer.
- If an intermediate step contains policy-violating content, the entire response may be blocked or redacted.
Configuring CoT-Aware Policies
To support chain-of-thought while maintaining governance:
```yaml
pack:
  name: advanced-chat-patterns-example-1
  version: 1.0.0
  enabled: true
policies:
  chain:
    - output
  policy:
    output: {}
```
This configuration allows policies to redact sensitive content from reasoning steps while preserving the final answer, provided the answer itself passes policy evaluation.
Best Practices for CoT Under Governance
| Practice | Rationale |
|---|---|
| Use structured CoT prompts | Separates reasoning from conclusions for targeted policy evaluation |
| Request reasoning in a specific format | Makes policy pattern matching more reliable |
| Test CoT with your active policies | Ensures reasoning steps do not trigger false positives |
| Monitor token usage | CoT responses use significantly more tokens |
Example: Structured CoT Prompt
```
Analyze the following contract clause for compliance risks.

## Reasoning
Walk through your analysis step by step, identifying specific
risk factors and citing relevant regulations.

## Conclusion
Provide a risk rating (Low/Medium/High) and a one-paragraph summary.
```
Structuring the output this way allows policies to evaluate the reasoning and conclusion sections independently.
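One way to make that independent evaluation concrete is to split the model's output on its section headings before passing each part to policy evaluation. The sketch below is an assumption about how a pre-processing step might look, not the platform's actual implementation; `split_cot_sections` is a hypothetical helper.

```python
import re

def split_cot_sections(output: str) -> dict:
    """Split a structured CoT response into its markdown sections.

    Assumes the prompt requested '## Reasoning' and '## Conclusion'
    headings, as in the example above. Returns a mapping from section
    name to section body so each part can be evaluated independently.
    """
    sections = {}
    # Capture every '## <name>' heading and the text up to the next heading.
    for match in re.finditer(r"^## (\w+)\n(.*?)(?=^## |\Z)",
                             output, re.S | re.M):
        sections[match.group(1).lower()] = match.group(2).strip()
    return sections

response = """## Reasoning
Step 1: the clause lacks a data-retention limit.
Step 2: this conflicts with the retention policy.

## Conclusion
Risk rating: High. The clause should specify a retention period."""

parts = split_cot_sections(response)
```

With the sections separated, a redaction policy can run over `parts["reasoning"]` without touching `parts["conclusion"]`, and vice versa.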
Multi-Turn Conversation Policies
Multi-turn policies evaluate the full conversation context, not just individual messages. This enables governance patterns that account for conversational dynamics.
Conversation-Level Policy Types
| Policy Type | Description |
|---|---|
| Topic drift detection | Flags when conversation strays from the initial topic into restricted areas |
| Cumulative PII detection | Detects PII spread across multiple messages that individually seem benign |
| Escalation patterns | Triggers escalation when repeated policy-adjacent prompts suggest probing |
| Context poisoning detection | Identifies attempts to gradually shift the assistant's behavior |
| Turn-count limits | Restricts conversation length to manage cost and context quality |
Configuring Multi-Turn Policies
```yaml
pack:
  name: advanced-chat-patterns-example-2
  version: 1.0.0
  enabled: true
policies:
  chain:
    - conversation
  policy:
    conversation: {}
```
Topic Drift Detection
Topic drift policies monitor the semantic similarity between the conversation's initial topic and subsequent messages:
- The first few messages establish the topic baseline.
- Each subsequent message is compared against the baseline.
- If similarity drops below the threshold, the policy intervenes.
This prevents users from starting an approved conversation and gradually steering it toward restricted content.
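The baseline-comparison loop can be sketched in a few lines. Production drift detectors compare embedding vectors; here a simple word-overlap (Jaccard) similarity stands in for that, and `check_drift`, `baseline_turns`, and the `0.1` threshold are illustrative assumptions rather than platform settings.

```python
def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity; a cheap stand-in for embedding cosine similarity."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def check_drift(messages: list[str], baseline_turns: int = 2,
                threshold: float = 0.1) -> list[int]:
    """Return the indexes of messages whose similarity to the topic
    baseline (the first `baseline_turns` messages) falls below the
    threshold -- the point where a drift policy would intervene."""
    baseline = " ".join(messages[:baseline_turns])
    return [i for i, msg in enumerate(messages[baseline_turns:],
                                      start=baseline_turns)
            if jaccard(baseline, msg) < threshold]

conversation = [
    "Analyze this vendor contract for compliance risks",
    "Focus on the data retention clause of the contract",
    "What are the contract termination compliance terms",
    "Tell me a joke about penguins",
]
drifted = check_drift(conversation)  # flags the off-topic turn
```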
Cumulative PII Detection
Individual messages may not contain PII, but across multiple turns, a user might reveal:
- A name in turn 1
- An email address in turn 3
- A phone number in turn 5
Conversation-scoped PII detection aggregates entities across turns and triggers when the cumulative PII exceeds the configured threshold.
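The aggregation logic itself is small. In this sketch, `cumulative_pii_trigger` and the threshold of three distinct entity types are assumptions for illustration; the per-turn entity lists stand in for whatever a turn-level PII detector produces.

```python
PII_THRESHOLD = 3  # hypothetical setting: intervene at 3 distinct PII types

def cumulative_pii_trigger(detections_per_turn, threshold=PII_THRESHOLD):
    """Aggregate PII entity types across turns and return the 1-based
    turn number at which the cumulative count reaches the threshold,
    or None if it never does."""
    seen = set()
    for turn, entities in enumerate(detections_per_turn, start=1):
        seen.update(entities)
        if len(seen) >= threshold:
            return turn
    return None

# The scenario above: a name in turn 1, an email in turn 3, a phone in turn 5.
turns = [["NAME"], [], ["EMAIL"], [], ["PHONE"]]
triggered_at = cumulative_pii_trigger(turns)
```

No single turn trips the detector, but the policy fires on turn 5 once the third distinct entity type appears.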
Context Window Management
LLMs have finite context windows. As conversations grow, managing what stays in context becomes critical for both quality and cost.
The Context Window Challenge
Each message in a multi-turn conversation consumes context space:
| Component | Token Impact |
|---|---|
| System prompt | Fixed overhead per message |
| Knowledge assets | Variable, depends on recall |
| Conversation history | Grows with each turn |
| Current prompt | Variable per message |
| Reserved for response | Set by max_tokens parameter |
When the total exceeds the model's context window, older messages must be truncated or summarized.
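The budget check behind that decision is straightforward arithmetic over the components in the table. This helper and its parameter names are illustrative, not a platform API.

```python
def context_budget(model_window: int, system: int, knowledge: int,
                   history: int, prompt: int, max_tokens: int) -> int:
    """Return the remaining context headroom in tokens. A negative
    result means the conversation history must be truncated or
    summarized before the request can be sent."""
    return model_window - (system + knowledge + history + prompt + max_tokens)

# Hypothetical numbers for an 8,192-token model window:
headroom = context_budget(model_window=8_192, system=500, knowledge=2_000,
                          history=4_500, prompt=300, max_tokens=1_024)
```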
Context Management Strategies
Sliding Window
Keep only the most recent N turns in context:
```yaml
chat:
  context:
    strategy: sliding_window
    window_size: 20
```
Pros: Simple, predictable token usage. Cons: Loses early conversation context.
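The strategy amounts to a single list slice. A minimal sketch, assuming history is a list of turn records:

```python
def sliding_window(history, window_size=20):
    """Keep only the most recent `window_size` turns, mirroring the
    `window_size: 20` setting above. Older turns are simply dropped."""
    return history[-window_size:]

history = [{"turn": i} for i in range(30)]
kept = sliding_window(history)  # turns 0-9 fall out of context
```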
Summarization
Periodically summarize older turns into a condensed representation:
```yaml
chat:
  context:
    strategy: summarize
    summarize_after: 10
    summary_model: gpt-4o-mini
```
Pros: Preserves key information from earlier turns. Cons: Adds latency and cost for the summarization step.
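The shape of this strategy can be sketched as follows. In practice `summarize` would call the configured summary model; here it is any callable taking a list of turns and returning a string, and `summarized_context` is a hypothetical helper, not a platform API.

```python
def summarized_context(history, summarize, summarize_after=10):
    """Once the history exceeds `summarize_after` turns, fold the older
    turns into a single summary turn and keep the recent ones verbatim."""
    if len(history) <= summarize_after:
        return list(history)
    older, recent = history[:-summarize_after], history[-summarize_after:]
    return [{"role": "system", "content": summarize(older)}] + recent

history = [{"role": "user", "content": f"turn {i}"} for i in range(15)]
condensed = summarized_context(
    history, summarize=lambda turns: f"Summary of {len(turns)} earlier turns")
```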
Selective Retention
Keep specific high-value turns (e.g., initial instructions, key decisions) while dropping routine exchanges:
```yaml
chat:
  context:
    strategy: selective
    always_retain:
      - first_turn
      - pinned_turns
    drop_after: 15
```
Pros: Retains the most important context. Cons: Requires configuration to identify high-value turns.
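Selective retention combines a recency cutoff with an always-keep set. A minimal sketch, assuming `pinned` holds the indexes of pinned turns (mirroring `always_retain` above):

```python
def selective_retention(history, drop_after=15, pinned=()):
    """Retain the first turn and any pinned turn indexes unconditionally,
    and drop all other turns older than the most recent `drop_after`."""
    if len(history) <= drop_after:
        return list(history)
    cutoff = len(history) - drop_after
    return [turn for i, turn in enumerate(history)
            if i == 0 or i in pinned or i >= cutoff]

history = list(range(20))                      # stand-in turns 0..19
kept = selective_retention(history, pinned={3})
```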
Monitoring Context Usage
Track context window utilization in the console:
- Navigate to Events and select a chat event.
- Review the `context_tokens` field to see how much context was used.
- Compare against the model's maximum context window.
- Set alerts for conversations approaching context limits.
Conversation Branching
Conversation branching allows users to explore alternative paths from a specific point in a conversation without losing the original thread.
How Branching Works
- A user reaches a point in the conversation where they want to explore an alternative approach.
- They create a branch from that message.
- The branch starts a new conversation thread that shares history up to the branch point.
- The original conversation remains unchanged.
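The data model behind those steps can be sketched as two small operations: copy the shared history up to the branch point, and record the parent thread. This `Conversation` class is an illustrative assumption, not the platform's implementation.

```python
class Conversation:
    """Minimal sketch of branchable conversation threads. A branch copies
    the shared history up to the branch point and records its parent,
    matching the parent-conversation reference in branch decision events."""

    def __init__(self, messages=None, parent=None, branch_point=None):
        self.messages = list(messages or [])
        self.parent = parent
        self.branch_point = branch_point

    def add(self, message):
        self.messages.append(message)

    def branch(self, at_index):
        """Start a new thread sharing history up to and including
        `at_index`; the original thread is left untouched."""
        return Conversation(self.messages[:at_index + 1],
                            parent=self, branch_point=at_index)

original = Conversation()
for msg in ["q1", "a1", "q2", "a2"]:
    original.add(msg)
alt = original.branch(1)        # share history through "a1"
alt.add("q2-alternative")       # explore a different second question
```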
Use Cases for Branching
| Scenario | Benefit |
|---|---|
| Exploring different analysis approaches | Compare outcomes without losing the original |
| Testing policy boundaries | See how different phrasings affect governance |
| A/B testing prompts | Evaluate which prompt produces better results |
| Iterative refinement | Branch to try variations while keeping the best path |
Creating a Branch
In the Chat Workbench:
- Hover over the message you want to branch from.
- Click the Branch icon.
- A new conversation tab opens with the shared history.
- Continue the conversation in the new branch.
Branch Governance
Branches inherit the governance context of the parent conversation:
- Policy evaluations consider the full history up to the branch point.
- Each branch is an independent conversation from the governance perspective after the branch point.
- Decision events for branches include a reference to the parent conversation.
Combining Advanced Patterns
These patterns work together for sophisticated use cases:
Governed Research Workflow
- Start a conversation with a broad research question.
- Use chain-of-thought prompting for detailed analysis.
- Branch the conversation to explore alternative hypotheses.
- Multi-turn policies prevent topic drift into restricted areas.
- Context management keeps the conversation focused as it grows.
- Export the full conversation tree for documentation.
Compliance Review Workflow
- Begin with a structured compliance question referencing bound knowledge assets.
- The system prompt enforces formal output formatting.
- Multi-turn policies detect cumulative PII across review turns.
- Branch to explore different regulatory interpretations.
- Export branches and their policy evaluations as compliance evidence.
Performance Considerations
Advanced patterns add overhead:
| Pattern | Overhead | Mitigation |
|---|---|---|
| CoT governance | Longer outputs = more policy evaluation time | Use structured formats for efficient evaluation |
| Multi-turn policies | Full conversation re-evaluation per turn | Cache policy results for unchanged turns |
| Context summarization | Extra LLM call for summarization | Use a fast, cheap model for summaries |
| Branching | Multiple conversation histories to maintain | Set branch limits per conversation |
Best Practices
| Practice | Why It Matters |
|---|---|
| Start with simple patterns and add complexity | Avoid over-engineering governance |
| Monitor token costs for advanced patterns | CoT and branching increase costs significantly |
| Test multi-turn policies with realistic conversations | Synthetic tests miss real-world conversation dynamics |
| Set conversation turn limits | Prevents unbounded context growth and cost |
| Document branching decisions | Creates an audit trail for exploratory analysis |
| Review context management strategy quarterly | Adjust as models offer larger context windows |
Next steps
- Return to the basics in Getting Started with the Chat Workbench.
- Explore how these patterns interact with knowledge assets in Knowledge-Grounded Chat Conversations.
- Configure the parameters that control these patterns in Customizing the Chat Experience.
- Track the impact of advanced patterns on usage in Chat Analytics & Usage Insights.
For AI systems
- Canonical terms: chain-of-thought governance, multi-turn policies, context window management, conversation branching, CoT-aware redaction, turn-level evaluation.
- Config names: `evaluate_reasoning_steps`, `redact_intermediate`, `preserve_final_answer`, `max_turns`, `context_summarization`.
- Best next pages: Getting Started with the Chat Workbench, Customizing the Chat Experience, Chat Analytics.
For engineers
- Configure `evaluate_reasoning_steps: true` in output policy YAML to enable per-step CoT evaluation.
- Test multi-turn policies with realistic 10+ turn conversations — synthetic 2-turn tests miss accumulation effects.
- Set explicit `max_turns` in gateway config to prevent unbounded context growth.
- Monitor token usage when enabling CoT — expect 3-5x token consumption compared to direct answers.
- Validate context summarization by comparing gateway decision events before and after summarization triggers.
For leaders
- Chain-of-thought and branching patterns significantly increase token costs (3-5x per interaction).
- Multi-turn policies require ongoing tuning as conversation patterns evolve — plan for quarterly policy reviews.
- Conversation branching creates audit trail complexity; define retention and export policies before enabling.
- Advanced patterns increase policy evaluation latency — factor this into SLA commitments for interactive chat.