Security Monitoring: Building Dashboards from AI Governance Events

Most AI dashboards are built around latency, cost, and provider uptime. Those metrics matter, but they do not tell a security team whether governance is working. A useful security dashboard answers different questions: what is being blocked, what is being redacted, what is escalating to humans, which provider path is being used, and whether behavior changed after a new policy version rolled out. Keeptrusts gives you the raw material for that because every governed request can produce a decision event with reviewable metadata.

Use this page when

  • You want to build operational or compliance dashboards from the Keeptrusts decision stream.
  • You need a practical metric model for blocked, redacted, escalated, and allowed traffic.
  • You want dashboards that support both daily monitoring and evidence handoff.

Primary audience

  • Primary: Security analysts, platform engineers, and detection engineers
  • Secondary: Technical leaders reviewing governance posture

The problem

Security monitoring for AI often stalls because teams treat the model provider as the only telemetry source worth graphing. That gives you usage and latency, but it hides the governance decisions that matter most. If a prompt injection wave is getting blocked cleanly, provider telemetry alone will not tell you that. If secret patterns are being caught by content controls, the model provider should ideally never see them at all. The governance layer is where those decisions become visible.

The second problem is unstructured monitoring. Teams open a raw event export when an incident happens, but they do not define steady-state views ahead of time. That means nobody notices gradual drift: rising block rates after a policy change, increasing redactions from a specific workflow, or a sudden concentration of suspicious reason codes. Good dashboards do not replace investigations. They make investigations faster because the anomaly is already framed.

The solution

Build dashboards from decision events and keep the metric set deliberately small.

At minimum, track counts by verdict, reason_code, model, provider, and config_version. Those fields already tell a strong security story. Verdict distribution shows whether the governance layer is mostly passing or actively intervening. Reason-code concentration tells you which failure modes are dominant. Provider and model breakdowns show where risk is being exercised. Config-version grouping helps you answer the question every operator gets after rollout day: did behavior change because of the workload, or because of the policy?
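The metric model above can be sketched in a few lines of Python. This is a minimal illustration, assuming the export is a list of event objects carrying the five field names from the text. The sample values (verdict strings, reason codes, model and provider names) and the helper names `count_by` and `intervention_rate_by_config` are hypothetical, not documented Keeptrusts values.

```python
from collections import Counter

# Hypothetical sample events: field names follow the text, values are assumed.
SAMPLE_EVENTS = [
    {"verdict": "allowed", "reason_code": None, "model": "model-a",
     "provider": "provider-x", "config_version": "v1"},
    {"verdict": "allowed", "reason_code": None, "model": "model-a",
     "provider": "provider-x", "config_version": "v1"},
    {"verdict": "blocked", "reason_code": "prompt_injection", "model": "model-b",
     "provider": "provider-y", "config_version": "v1"},
    {"verdict": "allowed", "reason_code": None, "model": "model-a",
     "provider": "provider-x", "config_version": "v2"},
    {"verdict": "blocked", "reason_code": "prompt_injection", "model": "model-a",
     "provider": "provider-x", "config_version": "v2"},
    {"verdict": "redacted", "reason_code": "pii_detected", "model": "model-b",
     "provider": "provider-y", "config_version": "v2"},
]

DIMENSIONS = ("verdict", "reason_code", "model", "provider", "config_version")


def count_by(events, field):
    """Count events per value of one field; missing or null values become 'unknown'."""
    return Counter(e.get(field) or "unknown" for e in events)


def intervention_rate_by_config(events, passing=("allowed",)):
    """Per config_version, the share of events whose verdict is not a passing one."""
    totals, intervened = Counter(), Counter()
    for e in events:
        version = e.get("config_version") or "unknown"
        totals[version] += 1
        if e.get("verdict") not in passing:
            intervened[version] += 1
    return {v: intervened[v] / totals[v] for v in totals}


if __name__ == "__main__":
    for dim in DIMENSIONS:
        print(dim, dict(count_by(SAMPLE_EVENTS, dim)))
    print(intervention_rate_by_config(SAMPLE_EVENTS))
```

A jump in intervention rate between two config versions is a prompt to look closer, not proof that the policy caused it. Comparing the rate within the same model and provider mix helps separate workload shifts from policy effects.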

Then layer workflow metrics on top. Use “Reviewing Alerts and Evidence” and “How To: Resolve an Escalation” as the operational frame for what happens after a dashboard spike appears. Dashboards should not end at “there are more blocks today.” They should make it easy to pivot into evidence, time windows, and related event IDs.
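One way to picture that pivot is a small filter that turns a spike (a reason code plus a time window) into a list of event IDs to review. This is a sketch only: the `event_id` and `timestamp` field names, the ISO 8601 timestamp format, and the reason-code string are assumptions about the export, not confirmed parts of the event schema.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(value):
    """Parse an ISO 8601 timestamp; a trailing 'Z' is treated as UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def pivot_to_evidence(events, reason_code, start, end):
    """Return IDs of events with the given reason code inside [start, end)."""
    return [
        e["event_id"]
        for e in events
        if e.get("reason_code") == reason_code
        and start <= parse_ts(e["timestamp"]) < end
    ]


# Hypothetical events standing in for an exported window.
events = [
    {"event_id": "evt-1", "timestamp": "2024-05-01T10:15:00Z",
     "reason_code": "prompt_injection"},
    {"event_id": "evt-2", "timestamp": "2024-05-01T10:45:00Z",
     "reason_code": "pii_detected"},
    {"event_id": "evt-3", "timestamp": "2024-05-01T11:05:00Z",
     "reason_code": "prompt_injection"},
]

window_start = datetime(2024, 5, 1, 10, 0, tzinfo=timezone.utc)
window_end = window_start + timedelta(hours=1)
spike_ids = pivot_to_evidence(events, "prompt_injection", window_start, window_end)
```

The resulting IDs are what an analyst would carry into the evidence review or escalation workflow.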

For architecture, keep the source of truth simple: kt events for query and export, Event-Driven AI Architecture for downstream pipelines, and Security Analyst Guide: AI Threat Monitoring for detection workflows. The dashboard may live in your SIEM or analytics stack, but the governance meaning comes from the event stream.

Implementation

Start by ensuring your policy chain produces events worth monitoring. The minimal stack below covers several common security signals without making the dashboard dependent on hypothetical features.

pack:
  name: dashboard-signal-baseline
  version: 1.0.0
  enabled: true

policies:
  chain:
    - prompt-injection
    - pii-detector
    - bot-detector
    - data-routing-policy
    - audit-logger

policy:
  prompt-injection:
    use_embedding: true
    detection:
      embedding_threshold: 0.8
    encoding:
      decode_base64: true
      normalize_unicode: true
      detect_homoglyphs: true
    boundaries:
      enforce_delimiters: true
      reject_fake_boundaries: true

  pii-detector:
    action: redact
    redaction:
      marker_format: label
      include_metadata: true

  bot-detector:
    action: warn
    similarity_threshold: 0.9
    max_requests_per_window: 5

  data-routing-policy:
    require_zero_data_retention: true
    require_no_training: true
    max_retention_days: 0
    on_no_compliant_provider: block
    log_provider_selection: true

  audit-logger: {}

Once the gateway is emitting meaningful decisions, export a window and derive dashboard primitives from it:

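# Export the last 24 hours of decision events.
# The jq filters below assume the file holds a single JSON array of events.
kt events export --since 24h --format json --output dashboard-source.json

# Count events per verdict.
jq 'group_by(.verdict) | map({verdict: .[0].verdict, count: length})' dashboard-source.json

# Top 10 reason codes by count, highest first.
jq 'group_by(.reason_code) | map({reason_code: .[0].reason_code, count: length}) | sort_by(-.count) | .[0:10]' dashboard-source.json

# Count events per config version, for before/after rollout comparisons.
jq 'group_by(.config_version) | map({config_version: .[0].config_version, count: length})' dashboard-source.json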

That is enough to power useful daily views:

  • verdict mix by hour or day
  • top reason codes in the last 24 hours
  • model and provider distribution for blocked or redacted events
  • behavior changes before and after a config-version rollout
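Two of the views above, the hourly verdict mix and the provider and model breakdown for blocked or redacted events, can be derived along these lines. This is a sketch under assumptions: the `timestamp` field name, its ISO 8601 format, and the verdict strings `blocked` and `redacted` are illustrative and should be checked against a real export.

```python
from collections import Counter
from datetime import datetime


def hour_bucket(timestamp):
    """Truncate an ISO 8601 timestamp to its hour, keeping the timestamp's own offset."""
    ts = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    return ts.strftime("%Y-%m-%dT%H:00")


def verdict_mix_by_hour(events):
    """Map each hour bucket to a Counter of the verdicts seen in that hour."""
    mix = {}
    for e in events:
        bucket = hour_bucket(e["timestamp"])
        mix.setdefault(bucket, Counter())[e["verdict"]] += 1
    return mix


def provider_model_for(events, verdicts=("blocked", "redacted")):
    """Count (provider, model) pairs among events with one of the given verdicts."""
    return Counter(
        (e.get("provider"), e.get("model"))
        for e in events
        if e.get("verdict") in verdicts
    )
```

Each hour bucket's Counter feeds a stacked chart directly, and the pair counts show which provider path carries the most intervened traffic.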

Keep the first version of the dashboard operational, not decorative. The best first chart is usually the one that tells an on-call engineer where to investigate next.

Results and impact

When teams monitor governance events directly, they stop mistaking provider health for governance health. A system can be fast and available while still leaking secrets, over-blocking users, or sending the wrong workload to the wrong provider. Event-based dashboards expose those failures early.

They also improve communication. Security teams get concrete signals instead of anecdotes. Engineering leaders get trend lines they can connect to policy rollouts. Compliance stakeholders get a clearer path from metrics to evidence export. One event stream serves all three audiences if the dashboard is built on the right dimensions.

Key takeaways

  • Build AI security dashboards from decision events, not only from provider metrics.
  • Start with verdict, reason_code, model, provider, and config_version before adding anything fancier.
  • Use dashboards to point investigators toward Reviewing Alerts and Evidence, not to replace investigation.
  • kt events and exports are the right starting point for downstream analytics.
  • A small, interpretable dashboard is better than a broad one nobody uses during an incident.

Next steps