Skip to main content

Keyword and Pattern-Based Data Loss Prevention for AI

Keyword and pattern-based data loss prevention in Keeptrusts means putting dlp-filter near the front of policies.chain, defining the exact regexes and blocked terms your organization cares about, choosing whether matches should block or return a redact verdict, and then testing those rules against real prompts before rollout.

Use this page when

  • You need to stop prompts that contain secrets, restricted phrases, or organization-specific terms.
  • You want to understand what dlp-filter does today and what it does not do.
  • You need a practical pattern for combining DLP with other input controls such as prompt-injection defense and PII redaction.

Primary audience

  • Primary: Technical Engineers and Security-minded platform owners
  • Secondary: Technical Leaders and Governance reviewers

Start with the right mental model

Keeptrusts has more than one way to protect sensitive data, and they are not interchangeable.

dlp-filter is the custom pattern control. It evaluates only the regexes and literal terms you configure, plus the built-in classification-marker check that activates at higher sensitivity levels.

That means dlp-filter is the right tool for organization-specific secrets, project names, internal phrases, and exact patterns you need to own explicitly.

It is not the built-in personal-data catalog. If you need general identifiers such as payment-card data or personally identifiable information, pair it with the appropriate detectors rather than assuming the DLP filter ships with those libraries preloaded. The DLP docs are explicit about that limitation, and that is useful because it keeps the policy predictable.

In practice, a good chain often looks like this:

  1. Prompt Injection Detection first, so adversarial input is blocked before other policies spend time on it.
  2. dlp-filter next, so custom secrets and restricted phrases are caught early.
  3. Built-in personal-data controls afterward when you need broader redaction behavior.
  4. Output-side controls such as Financial Compliance or Human Oversight only if the use case also needs downstream controls.

What dlp-filter is good at

The main advantage of dlp-filter is precision. You choose the patterns and terms. The gateway does exactly that work and nothing more.

This is useful for:

  • Credential formats such as AWS access-key style strings or GitHub token patterns.
  • Restricted project names or codewords that should never leave the boundary.
  • Release language such as internal use only or do not distribute.
  • Classification-style markers when you want the gateway to react to labels such as secret, confidential, or noforn.

The sensitivity tier matters here. high and restricted do more than a label change. They enable additional context-sensitive marker detection for common classification phrases. That is often more useful than adding dozens of nearly identical blocked terms by hand.

Example: custom secrets and restricted phrases

This example focuses on custom secret formats and literal terms that are specific enough to warrant hard enforcement.

pack:
name: keyword-pattern-dlp
version: 1.0.0
enabled: true

policies:
chain:
- prompt-injection
- dlp-filter
- pii-detector
- audit-logger

policy:
prompt-injection:
use_embedding: false
detection:
attack_patterns:
- "ignore.*previous.*instructions"
- "reveal.*system.*prompt"
encoding:
decode_base64: true
normalize_unicode: true
detect_homoglyphs: true
boundaries:
enforce_delimiters: true
reject_fake_boundaries: true

dlp-filter:
detect_patterns:
- 'AKIA[0-9A-Z]{16}'
- 'ghp_[0-9A-Za-z]{36}'
blocked_terms:
- internal use only
- do not distribute
action: block
fuzzy_matching: true
max_distance: 1
sensitivity_level: high

pii-detector:
action: redact
redaction:
marker_format: label
include_metadata: true

audit-logger:
retention_days: 365

This chain reflects a realistic order of operations.

Prompt injection runs first because attackers often try to hide the very strings your DLP policy is supposed to catch. If the request is clearly adversarial, there is no reason to continue.

dlp-filter then applies the explicit organization-controlled patterns. Because action is set to block, a match stops the request rather than attempting downstream handling.

pii-detector still adds value afterward because not every sensitive item is a custom literal term or regex you maintain yourself.

Block or redact?

The most important authoring decision is whether a DLP hit should block immediately or return a redact verdict.

Choose block when the content should never cross the boundary. Credentials, release labels, internal-only project names, or specific restricted phrases usually belong here.

Choose redact when the business process can still continue after removing the matched content. That pattern is common when the goal is to preserve user intent while stripping sensitive fragments.

Do not default to redaction just because it feels gentler. If the matched phrase is itself the reason the request is unsafe, redaction can create false confidence. A blocked request is often the more honest result.

Fuzzy matching is useful, but only in small doses

fuzzy_matching: true is practical for catching trivial misspellings, spacing changes, or minor edits in blocked terms. It is especially useful for phrases users may paraphrase accidentally.

But fuzzy matching is also where noise enters quickly. Large term lists and high edit distances create false positives. The DLP docs are clear on the safe pattern: keep max_distance small. In most production chains, 1 or 2 is enough.

If you need broader semantic matching rather than literal or near-literal matching, DLP is probably the wrong tool. That is when another policy or an embedding-oriented control becomes the better fit.

Common deployment mistakes

The first mistake is assuming dlp-filter ships with a regulated-data library. It does not. The policy only evaluates your configured patterns and terms, plus the built-in classification markers for higher sensitivity levels.

The second mistake is putting it too late in the chain. By the time a request reaches the provider, the damage is already done.

The third mistake is using generic business words as blocked terms. If the phrase is common in normal work, you will create alert fatigue and policy churn.

The fourth mistake is forgetting the provider side of the boundary. DLP decides what content is allowed to continue. Data Routing Policy decides which provider targets may receive the content that remains.

Validation workflow

Treat DLP as testable configuration, not a static checklist.

kt policy lint --file policy-config.yaml

After lint passes, run representative prompts through the gateway and confirm at least four cases:

  1. A clean prompt that should be allowed.
  2. A secret pattern that should block.
  3. A near-match term that should only trigger because fuzzy matching is enabled.
  4. A classification-marker phrase that should trigger under high or restricted sensitivity.

If your team uses more than one provider, validate the DLP result together with routing. A blocked request should not reach any provider, resilient or otherwise. That sounds obvious, but it is a useful sanity check when you are also working on multi-provider resilience.

Key takeaways

  • dlp-filter is the custom pattern and blocked-term control, not a built-in universal data-classification engine.
  • Put it early in the chain, usually after prompt-injection defense and before broader redaction controls.
  • Use block for content that must never cross the boundary and redact only when downstream continuation is acceptable.
  • Keep fuzzy matching conservative so the policy stays credible.
  • Pair content controls with routing controls when provider eligibility depends on data-handling guarantees.

Next steps