Reading the Dashboard: AI Usage Patterns and Anomalies

Read the Keeptrusts dashboard as an operational triage surface, not a vanity report. Start with the selected time window, check whether queued escalations need human attention, and verify gateway health and period spend. Then use the quality trend, verdict breakdown, and recent events list to decide whether the change you are seeing is normal usage, a policy regression, a reviewer backlog, or a platform problem.

Use this page when

  • You need a practical method for interpreting the Overview dashboard.
  • You are trying to understand whether a spike in blocks, cost, or review load is expected.
  • You want to turn dashboard signals into concrete follow-up actions in Events, Escalations, Configurations, or spend workflows.

Primary audience

  • Primary: Technical Engineers and Platform Operators
  • Secondary: Technical Leaders and Governance reviewers

The problem

Most dashboards answer only one question well. Some tell you how much traffic occurred. Others tell you what it cost. Others show whether infrastructure is healthy. AI governance needs all three, plus reviewer state and policy outcomes. If you look at only one metric, it is easy to misdiagnose the issue.

Consider four common situations.

The first is a genuine traffic shift. Requests increase because a product team launched a new feature. In that case, higher event volume and spend may be normal.

The second is a policy regression. The total number of requests stays steady, but blocked or escalated outcomes jump right after a configuration rollout.

The third is a reviewer bottleneck. Policy behavior is stable, but queued escalations accumulate because nobody is claiming them fast enough.

The fourth is a platform or provider problem. Gateways go offline, or the spend mix changes sharply because requests are being routed differently than expected.

If you do not read the dashboard systematically, these can look similar. Teams often react to the loudest number on the page instead of following the evidence. That leads to unnecessary rollback, delayed incident response, or missed cost anomalies.

The solution

The Overview dashboard is useful because it puts related signals next to each other.

At the top, the time range selector sets the frame for everything else. That sounds basic, but it is often the difference between a real anomaly and a normal weekly cycle. A seven-day window may hide a bad hour. A one-hour window may exaggerate a routine daily peak.
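To make the window effect concrete, here is a minimal sketch using synthetic data. The event shape (`ts`, `verdict`) and the numbers are assumptions made for illustration, not the Keeptrusts event schema. It computes the blocked share of the same event stream over a one-hour and a seven-day window.

```python
from datetime import datetime, timedelta

def block_rate(events, now, window):
    """Fraction of events in [now - window, now] whose verdict is 'blocked'."""
    start = now - window
    in_window = [e for e in events if start <= e["ts"] <= now]
    if not in_window:
        return 0.0
    blocked = sum(1 for e in in_window if e["verdict"] == "blocked")
    return blocked / len(in_window)

# Synthetic stream (assumed, not real data): 7 days of hourly traffic,
# 100 events per hour with 2 blocked, except the most recent hour,
# which has 40 blocked.
now = datetime(2024, 1, 8, 12, 0)
events = []
for h in range(7 * 24):
    hour_start = now - timedelta(hours=h + 1)
    blocked_in_hour = 40 if h == 0 else 2
    for i in range(100):
        events.append({
            "ts": hour_start + timedelta(seconds=i * 30),
            "verdict": "blocked" if i < blocked_in_hour else "allowed",
        })

rate_1h = block_rate(events, now, timedelta(hours=1))
rate_7d = block_rate(events, now, timedelta(days=7))
print(f"1h block rate: {rate_1h:.1%}")
print(f"7d block rate: {rate_7d:.1%}")
```

With this data the last hour shows a 40% block rate, while the seven-day view stays close to 2%. Neither number is wrong; they answer different questions, which is why the window has to be chosen first.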

The page then raises urgent reviewer work directly. If there are queued escalations, the dashboard surfaces that fact so human review does not get buried under aggregate metrics.

From there, read the page in layers.

The KPI row gives a compact state summary. The console pairs event-driven summaries with gateway fleet health and period spend, which lets you quickly distinguish a policy issue from a gateway or cost issue.

The quality trend helps you see whether allowed traffic is still producing acceptable outcomes. This is different from pure enforcement. A system can allow most requests and still deliver low-quality results.

The verdict breakdown helps you understand the mix of allowed, blocked, redacted, and escalated traffic. That tells you whether the policy chain is acting differently than expected.
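One way to reason about a change in verdict mix is to compare shares rather than raw counts. The sketch below assumes you have copied per-verdict counts for two periods off the dashboard into plain dictionaries; the counts are invented for illustration.

```python
def verdict_mix(counts):
    """Convert raw verdict counts into shares of total traffic."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {verdict: n / total for verdict, n in counts.items()}

def mix_shift(before, after):
    """Change in share for each verdict, as a fraction of total traffic."""
    b, a = verdict_mix(before), verdict_mix(after)
    return {v: a.get(v, 0.0) - b.get(v, 0.0) for v in sorted(set(a) | set(b))}

# Illustrative counts only: total volume is nearly flat, escalations triple.
before = {"allowed": 900, "blocked": 50, "redacted": 30, "escalated": 20}
after = {"allowed": 880, "blocked": 50, "redacted": 30, "escalated": 60}

shift = mix_shift(before, after)
for verdict, delta in shift.items():
    print(f"{verdict:10s} {delta:+.2%}")
```

Here the escalated share rises by almost four percentage points while the blocked share barely moves, which points at a specific rule that escalates rather than a broad tightening of the policy chain.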

The recent events list gives you the fastest route from signal to evidence. Instead of guessing why a chart moved, you can open the last several events and inspect what really happened.

Implementation

Use the dashboard the same way every time. Repetition matters because you want the reading process to be reliable when something actually goes wrong.

  1. Pick the right time window before interpreting any number.
  2. Look for the review-queue alert. If queued escalations are visible, decide whether the operational priority is human review before deeper analytics.
  3. Check gateway status. If gateways are partially offline, downstream behavior may be skewed and policy conclusions will be premature.
  4. Check period spend and provider mix. If spend rose but request volume did not, inspect routing and provider distribution before touching policies.
  5. Review the quality trend. A quality drop without a matching rise in blocks may indicate poor grounding or model drift rather than stricter enforcement.
  6. Review the verdict breakdown. This tells you whether the platform is allowing, blocking, redacting, or escalating traffic differently than before.
  7. Open the recent events list and inspect a sample of representative requests.
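The checklist above can be sketched as a small function that walks the signals in the same order. Everything here is hypothetical: the `Snapshot` fields stand in for values you read off the Overview page, and the thresholds are arbitrary placeholders you would tune to your own traffic.

```python
from dataclasses import dataclass

@dataclass
class Snapshot:
    # Hypothetical fields; changes are fractions versus the previous period.
    queued_escalations: int
    gateways_online: int
    gateways_total: int
    volume_change: float
    spend_change: float
    quality_change: float
    blocked_share_change: float

def triage(s, threshold=0.2, mix_threshold=0.05):
    """Return follow-ups in the order the checklist reads the page."""
    findings = []
    if s.queued_escalations > 0:
        findings.append("Escalations: queue has pending items")
    if s.gateways_online < s.gateways_total:
        findings.append("Gateways: fleet partially offline, policy conclusions are premature")
    if s.spend_change > threshold and s.volume_change <= threshold:
        findings.append("Spend: rose without matching volume, inspect routing and provider mix")
    if s.quality_change < -threshold and s.blocked_share_change <= 0:
        findings.append("Quality: dropped without more blocks, check grounding or model drift")
    if abs(s.blocked_share_change) > mix_threshold:
        findings.append("Configurations: verdict mix shifted, compare config versions")
    # The last step always applies: confirm the pattern against real requests.
    findings.append("Events: inspect a sample of representative requests")
    return findings

snapshot = Snapshot(
    queued_escalations=0, gateways_online=4, gateways_total=4,
    volume_change=0.03, spend_change=0.45,
    quality_change=-0.01, blocked_share_change=0.0,
)
for item in triage(snapshot):
    print(item)
```

The point of the ordering is that earlier findings qualify later ones: an offline gateway or a spend shift is reported before any conclusion about policy behavior.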

When you need to verify what the UI is showing, use the CLI to look at the same recent decision stream directly:

kt events tail --limit 20 --json

That command is especially helpful when a chart suggests a change but you need request-level confirmation. The dashboard gives the pattern. Events give the facts.
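If you want a quick tally from that output, a short script can count verdicts. This is a sketch under stated assumptions: the source does not document the JSON shape, so the parser accepts either a JSON array or one object per line, and the `verdict` field name is a guess you should replace with whatever your events actually contain. The sample text below is invented.

```python
import json
from collections import Counter

def parse_events(text):
    """Accept either a JSON array or one JSON object per line."""
    text = text.strip()
    if not text:
        return []
    if text.startswith("["):
        return json.loads(text)
    return [json.loads(line) for line in text.splitlines() if line.strip()]

def verdict_counts(events, field="verdict"):
    """Tally events by the given field; `verdict` is an assumed field name."""
    return Counter(e.get(field, "unknown") for e in events)

# Invented sample standing in for captured CLI output.
sample = """
{"id": "evt_1", "verdict": "allowed"}
{"id": "evt_2", "verdict": "escalated"}
{"id": "evt_3", "verdict": "allowed"}
{"id": "evt_4", "verdict": "blocked"}
"""

counts = verdict_counts(parse_events(sample))
print(counts.most_common())
```

In real use you would replace `sample` with `sys.stdin.read()` and pipe the command's output into the script. Twenty events is a small sample, so treat the tally as a sanity check against the chart rather than a measurement.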

Here is a concrete investigation workflow for a common anomaly: a sudden rise in escalations after a policy rollout.

  1. Open Overview with a 24h or 7d view.
  2. Confirm whether queued escalations are elevated and whether gateways remain healthy.
  3. Review the verdict breakdown to see whether the rise is limited to escalations or also affects blocks.
  4. Open several recent escalated events and note the matched policies, request metadata, and configuration version.
  5. Compare that config version in Configurations. If the change is recent, inspect the YAML diff or version history.
  6. If the queue is operationally risky, use Escalations to claim and resolve items while the policy owner decides whether to tune or roll back the configuration.
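Step 4 of this workflow is easier when the sampled events are grouped by configuration version and matched policy. The following sketch assumes each event is a dictionary with `verdict`, `config_version`, and `matched_policies` keys; those names, the policy names, and the sample records are hypothetical and only illustrate the grouping.

```python
from collections import Counter

def escalations_by_config(events):
    """Count escalated events per (config version, matched policy) pair."""
    counts = Counter()
    for e in events:
        if e.get("verdict") != "escalated":
            continue
        version = e.get("config_version", "unknown")
        policies = e.get("matched_policies") or ["(none recorded)"]
        for policy in policies:
            counts[(version, policy)] += 1
    return counts

# Hypothetical sample of recent events noted during the investigation.
events = [
    {"verdict": "escalated", "config_version": "v42", "matched_policies": ["pii-review"]},
    {"verdict": "escalated", "config_version": "v42", "matched_policies": ["pii-review"]},
    {"verdict": "escalated", "config_version": "v42", "matched_policies": ["pii-review", "legal-terms"]},
    {"verdict": "escalated", "config_version": "v41", "matched_policies": ["legal-terms"]},
    {"verdict": "allowed", "config_version": "v42", "matched_policies": []},
]

grouped = escalations_by_config(events)
for (version, policy), n in grouped.most_common():
    print(f"{version}  {policy}  {n}")
```

If most escalations cluster on one policy in the newest version, that is the diff to inspect first in Configurations. If they are spread across versions and policies, the rollout is a less likely cause.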

Another common workflow is a spend anomaly without a large request increase.

  1. Open Overview and confirm period spend moved more than expected.
  2. Check the top provider and gateway status.
  3. Open the Usage or spend view for detail.
  4. Cross-check recent events for provider changes, larger prompts, or routing shifts.
  5. If the shift lines up with a new configuration, inspect routing or provider-related policy changes in Configurations.
  6. If wallet pressure follows, move to wallet or cost-ticket workflows before an insufficient balance causes traffic to be held.
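The core check in this workflow is whether spend moved faster than volume, and whether the provider mix explains it. A minimal sketch follows, assuming you have per-provider request counts and spend for two periods; the provider names and figures are made up.

```python
def summarize(rows):
    """Totals, blended cost per request, and request share per provider."""
    requests = sum(r["requests"] for r in rows)
    spend = sum(r["spend"] for r in rows)
    return {
        "requests": requests,
        "spend": spend,
        "cost_per_request": spend / requests if requests else 0.0,
        "share": {r["provider"]: r["requests"] / requests for r in rows} if requests else {},
    }

# Invented figures: volume is nearly flat, but traffic moved to provider-b.
previous = [
    {"provider": "provider-a", "requests": 9000, "spend": 90.0},
    {"provider": "provider-b", "requests": 1000, "spend": 40.0},
]
current = [
    {"provider": "provider-a", "requests": 5000, "spend": 50.0},
    {"provider": "provider-b", "requests": 5200, "spend": 208.0},
]

prev, curr = summarize(previous), summarize(current)
volume_change = curr["requests"] / prev["requests"] - 1
spend_change = curr["spend"] / prev["spend"] - 1

print(f"volume change: {volume_change:+.1%}")
print(f"spend change:  {spend_change:+.1%}")
print(f"cost/request:  {prev['cost_per_request']:.4f} -> {curr['cost_per_request']:.4f}")
print(f"provider-b share: {prev['share']['provider-b']:.0%} -> {curr['share']['provider-b']:.0%}")
```

In this example each provider's unit price is unchanged, yet blended cost per request nearly doubles because the request share shifted. That pattern points to routing, not to policy or abuse.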

The dashboard also helps with slow-moving problems, not just spikes. If queued escalations are modest every day but never decline, that is a reviewer-capacity issue. If blocks rise each week after template reuse by new teams, that is a rollout-pattern issue. If quality falls while Knowledge Base updates lag behind documentation changes, the dashboard signal may be pointing you toward a content maintenance problem instead of a policy one.
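Slow-moving problems are easy to miss by eye, so a simple numeric check can help. This sketch assumes you record one value per day (queued escalations) or per week (blocks) from the dashboard; the series are invented, and the two checks are deliberately crude heuristics rather than a statistical test.

```python
def never_drains(daily_queue, floor=1):
    """True when the queue stayed at or above `floor` on every recorded day."""
    return len(daily_queue) > 0 and min(daily_queue) >= floor

def slope(values):
    """Least-squares slope of the series per step (per day or per week)."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den

# Invented series for illustration.
daily_queue = [6, 7, 5, 6, 8, 7, 6]      # modest, but never reaches zero
weekly_blocks = [100, 120, 140, 160]     # steady week-over-week rise

print("queue never drains:", never_drains(daily_queue))
print("weekly block slope:", slope(weekly_blocks))
```

A queue that never drains suggests reviewer capacity is the constraint even though no single day looks alarming. A steady positive slope in blocks is worth correlating with rollout events, such as new teams adopting a template.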

The important habit is to treat each dashboard widget as a branching point. Gateway health branches to infrastructure review. Spend branches to wallet and provider analysis. Escalation banners branch to human review. Verdict mix branches to policy evaluation. Recent events branch to exact evidence.

Results and impact

Teams that read the dashboard this way get better at separating cause from correlation. A block spike is no longer assumed to be a bad policy. It might be a real increase in risky prompts. A spend jump is no longer assumed to be abuse. It might be a provider-routing change or a new workload launch. A growing escalation queue is no longer assumed to mean the model is failing. It might simply mean the reviewers are understaffed for the current volume.

That distinction matters because the follow-up actions are different. Policy tuning, reviewer staffing, gateway remediation, and wallet reallocation are not interchangeable responses. The Overview dashboard is valuable precisely because it does not force you to choose one lens too early.

It also improves communication between roles. Engineers can explain a change using event and config evidence. Leaders can watch trends without losing the ability to drill into the underlying requests. Reviewers can tie queue health back to real event volume. Finance owners can connect spend movement to governed traffic rather than reading invoices in isolation.

In practice, the dashboard is most useful when it becomes part of a routine. Daily checks catch reviewer backlog. Weekly checks catch trend shifts. Post-rollout checks catch regressions before they become incidents. Those are small habits, but they are what turn governance data into operational control.

Key takeaways

  • Read the Overview dashboard from top to bottom: time window, review queue, gateways, spend, quality, verdict mix, then recent events.
  • Do not interpret a single metric without checking the surrounding signals.
  • Use recent events and kt events tail to move from dashboard pattern to request-level evidence.
  • Treat queued escalations as an operational indicator, not just a compliance workflow.
  • Use the dashboard to choose the next surface: Events, Escalations, Configurations, or spend and wallet views.

Next steps