Choosing & Switching AI Models
This tutorial covers how to browse available models in the Keeptrusts chat workbench, compare their capabilities, switch models during a conversation, and understand the cost associated with each model.
Use this page when
- You need to browse available models, compare their capabilities, and select the right one for your task.
- You want to switch models mid-conversation and understand how context history is preserved.
- You are evaluating cost, speed, and context window trade-offs between models.
Primary audience
- Primary: Technical Engineers (daily chat users choosing models)
- Secondary: Technical Leaders (model availability decisions), AI Agents (model routing)
Prerequisites
- Authenticated access to the Keeptrusts chat workbench
- A gateway with at least two model providers configured
- Familiarity with the first conversation tutorial
Step 1: Open the Model Picker
- In the chat workbench, locate the Model Selector at the top of the chat interface.
- Click the dropdown to expand the list of available models.
The models shown here are determined by your gateway's provider configuration. Only models that your administrator has enabled will appear.
Step 2: Understand Model Availability
Each model entry in the picker displays:
| Field | Description |
|---|---|
| Model name | The provider and model identifier (e.g., openai/gpt-4o) |
| Provider | The upstream LLM provider (OpenAI, Anthropic, Google, etc.) |
| Status | Whether the model is currently reachable |
Models are resolved through your gateway configuration. A typical provider block in your gateway's policy config looks like:
```yaml
pack:
  name: model-selection-providers-1
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: openai
      provider:
        secret_key_ref:
          store: OPENAI_API_KEY
    - id: anthropic
      provider:
        secret_key_ref:
          store: ANTHROPIC_API_KEY
policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
```
If a model you expect is missing from the picker, ask your administrator to verify the gateway's provider configuration.
Step 3: Compare Model Capabilities
Different models have different strengths. Here is a general comparison to guide your selection:
| Model | Best For | Context Window | Relative Speed |
|---|---|---|---|
| GPT-4o | Complex reasoning, code generation | 128K tokens | Medium |
| GPT-4o-mini | Quick tasks, lower cost | 128K tokens | Fast |
| Claude Sonnet | Long-form analysis, nuanced writing | 200K tokens | Medium |
| Claude Haiku | Fast responses, simple tasks | 200K tokens | Very fast |
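A quick way to apply the context-window column above is to estimate whether a long document fits a given model's window. This is a rough sketch, not a Keeptrusts feature: the ~4 characters-per-token ratio is an approximation (real tokenizers vary by model), and the window sizes are taken from the table.

```python
# Rough context-window fit check for the models in the table above.
# Token counts are estimated at ~4 characters per token, which is only
# an approximation; actual tokenization varies by model.

CONTEXT_WINDOWS = {
    "openai/gpt-4o": 128_000,
    "openai/gpt-4o-mini": 128_000,
    "anthropic/claude-sonnet": 200_000,
    "anthropic/claude-haiku": 200_000,
}

def estimate_tokens(text: str) -> int:
    """Crude token estimate: ~4 characters per token."""
    return max(1, len(text) // 4)

def fits_in_window(model: str, text: str, reserve_for_output: int = 4_000) -> bool:
    """Check whether the text plus an output reservation fits the model's window."""
    window = CONTEXT_WINDOWS[model]
    return estimate_tokens(text) + reserve_for_output <= window

document = "x" * 600_000  # ~150K estimated tokens
print(fits_in_window("openai/gpt-4o", document))           # False: exceeds 128K
print(fits_in_window("anthropic/claude-sonnet", document)) # True: fits in 200K
```

For documents near a model's limit, prefer the larger-window Claude models or summarize in chunks.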
Step 4: Select a Model
- Click the desired model in the dropdown.
- The selector updates to show your chosen model.
- Your next message will be routed to this model through the gateway.
Step 5: Switch Models Mid-Conversation
You can change models at any point during a conversation:
- Click the Model Selector dropdown.
- Choose a different model.
- Continue typing your next message.
The chat workbench sends the full conversation history to the newly selected model, so context is preserved across the switch.
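The mechanics of context preservation can be sketched as follows. The exact wire format is gateway-specific; this assumes an OpenAI-style chat-completions payload in which the full message history travels with every request, so `build_request` here is illustrative, not the actual Keeptrusts client API.

```python
# Sketch of what the workbench sends after a mid-conversation model switch.
# Because every request carries the full history, changing the model field
# loses no context.

history = [
    {"role": "user", "content": "Summarize our Q3 incident report."},
    {"role": "assistant", "content": "Here is a summary of the Q3 incidents..."},
]

def build_request(model: str, history: list, new_message: str) -> dict:
    """Bundle the prior turns plus the new message, addressed to any model."""
    return {
        "model": model,
        "messages": history + [{"role": "user", "content": new_message}],
    }

# The first turns went to one model; the follow-up is routed to Claude Sonnet,
# but both earlier messages are still included.
request = build_request("anthropic/claude-sonnet-4-20250514", history,
                        "Now rewrite that summary for executives.")
print(len(request["messages"]))  # 3: two prior turns plus the new one
```

Note that switching to a model with a smaller context window can truncate very long histories, so check window sizes before switching late in a long conversation.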
Step 6: Understand Cost Per Model
Keeptrusts tracks token usage and cost per model. Each model has different pricing based on input and output tokens.
To view cost information:
- Open the management console.
- Navigate to Cost Center or Spend in the sidebar.
- Filter by model to see per-model usage and costs.
Typical pricing tiers (actual prices depend on your provider agreements):
| Model | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) |
|---|---|---|
| GPT-4o | Higher | Higher |
| GPT-4o-mini | Lower | Lower |
| Claude Sonnet | Higher | Higher |
| Claude Haiku | Lower | Lower |
If your organization uses Keeptrusts wallet-based spend controls, each message deducts from your team or user wallet based on estimated token cost. The gateway reserves the estimated cost before forwarding the request and settles the actual cost when the response completes.
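The reserve-then-settle flow can be illustrated with simple arithmetic. The rates and the `Wallet` class below are illustrative assumptions, not the actual Keeptrusts implementation; only the sequence (estimate, reserve, settle at actual cost) mirrors the behavior described above.

```python
# Sketch of wallet-based spend control: reserve an estimated cost before the
# request, then settle the actual cost when the response completes.
# Rates are USD per 1M tokens and are placeholders, not real prices.

def cost_usd(input_tokens: int, output_tokens: int,
             input_rate: float, output_rate: float) -> float:
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

class Wallet:
    def __init__(self, balance: float):
        self.balance = balance
        self.reserved = 0.0

    def reserve(self, amount: float) -> None:
        if amount > self.balance - self.reserved:
            raise RuntimeError("insufficient funds")
        self.reserved += amount

    def settle(self, reserved: float, actual: float) -> None:
        # Release the reservation and deduct only what was actually used.
        self.reserved -= reserved
        self.balance -= actual

wallet = Wallet(balance=10.00)
estimate = cost_usd(8_000, 2_000, input_rate=2.50, output_rate=10.00)  # worst-case output guess
wallet.reserve(estimate)
actual = cost_usd(8_000, 1_200, input_rate=2.50, output_rate=10.00)    # real usage
wallet.settle(estimate, actual)
print(round(wallet.balance, 4))  # 9.968: reduced by the actual cost only
```

Because the reservation is based on an estimate, a wallet near its limit can reject requests even when the eventual actual cost would have fit.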
Step 7: Verify the Model Used in Events
After sending a message, you can confirm which model processed it:
- Go to the management console Events page.
- Open the event for your most recent message.
- The event detail shows the provider and model fields, confirming exactly which model was used.
This is useful for auditing and for verifying that model switching worked as expected.
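Programmatically, the check amounts to comparing the event's provider and model fields against what you selected. The event payload below is a hypothetical illustration; the real schema on the Events page may use different field names.

```python
# Hypothetical event payload for the verification in Step 7. The real
# Keeptrusts event schema may differ; the idea is simply to compare the
# recorded provider/model against the model you picked.

event = {
    "type": "chat.completion",
    "provider": "anthropic",
    "model": "claude-sonnet-4-20250514",
    "input_tokens": 1843,
    "output_tokens": 412,
}

def verify_model(event: dict, expected_provider: str, expected_model: str) -> bool:
    """Return True when the event matches the model you selected in the picker."""
    return (event.get("provider") == expected_provider
            and event.get("model") == expected_model)

print(verify_model(event, "anthropic", "claude-sonnet-4-20250514"))  # True if the switch worked
print(verify_model(event, "openai", "gpt-4o"))                       # False
```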
Step 8: Use Model Selection Strategically
Here are practical strategies for choosing models:
Cost optimization
Start conversations with a lower-cost model (e.g., GPT-4o-mini or Claude Haiku). If the response quality is insufficient, switch to a more capable model for the follow-up.
Task-specific selection
- Code review or generation: Use GPT-4o or Claude Sonnet for better reasoning.
- Summarization: Claude models with large context windows handle long documents well.
- Quick Q&A: GPT-4o-mini or Claude Haiku provide fast, cost-effective answers.
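The task-specific guidance above can be encoded as a minimal routing table. The mapping is a starting point drawn from this tutorial, not a built-in Keeptrusts feature; tune it to your own quality and cost observations.

```python
# Minimal task-to-model routing sketch based on the guidance above.
# The model choices are the tutorial's suggestions, not enforced defaults.

ROUTES = {
    "code": "openai/gpt-4o",                            # complex reasoning, code generation
    "analysis": "anthropic/claude-sonnet-4-20250514",   # long-form, nuanced writing
    "summarize": "anthropic/claude-sonnet-4-20250514",  # large context window
    "qa": "openai/gpt-4o-mini",                         # fast, cost-effective answers
}

def pick_model(task: str, default: str = "openai/gpt-4o-mini") -> str:
    """Fall back to a low-cost default for unrecognized tasks."""
    return ROUTES.get(task, default)

print(pick_model("code"))      # openai/gpt-4o
print(pick_model("chitchat"))  # falls back to openai/gpt-4o-mini
```

Defaulting to a low-cost model and escalating only when needed matches the cost-optimization strategy above.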
Policy interaction
Some policies may behave differently depending on the model's output characteristics. If a policy frequently modifies or blocks responses from one model, try a different model that produces outputs more aligned with your organization's governance rules.
Troubleshooting
| Problem | Solution |
|---|---|
| Only one model appears | Check the gateway provider config — additional providers may need API keys |
| Model shows as unreachable | The upstream provider may be down or the API key may be invalid |
| High costs on a specific model | Switch to a smaller model for routine tasks; reserve large models for complex work |
| Model switch did not take effect | Verify the selector shows the new model before sending your next message |
Next steps
- Using Knowledge Base in Chat — enrich model responses with organizational context.
- Streaming Responses & Real-Time Chat — understand how responses are displayed in real time.
- Exporting Conversations for Audit — export conversations with model metadata for compliance.
For AI systems
- Canonical terms: Keeptrusts chat workbench, model selector, model picker, gateway providers config, secret_key_ref, model switching, context preservation, provider block (YAML).
- Model identifiers: openai/gpt-4o, openai/gpt-4o-mini, anthropic/claude-sonnet-4-20250514, anthropic/claude-3-5-haiku-20241022.
- Config: providers.targets section in gateway policy YAML with id and secret_key_ref.store per provider.
- Best next pages: Knowledge Injection, Streaming Responses, Conversation Export.
For engineers
- Prerequisites: at least two model providers configured in your gateway; familiarity with provider YAML syntax (providers.*.secret_key_ref.store).
- Validation: Open model picker → verify expected models appear. Switch model mid-conversation → verify context preserved and new model generates the next response. Missing model = check gateway provider config and API key validity.
- Cost strategy: Start with GPT-4o-mini/Haiku for routine tasks; escalate to GPT-4o/Sonnet for complex reasoning.
For leaders
- Model selection directly impacts cost and quality — empowering users to choose appropriately reduces waste.
- Admin controls can restrict model availability per team to enforce budget boundaries.
- Policy behavior may vary by model output characteristics; monitor policy trigger rates per model in analytics.
- Consider setting cost-effective defaults while allowing power users to escalate to premium models.