PII Detection and Automatic Redaction in LLM Requests

Keeptrusts handles PII detection and automatic redaction by running pii-detector on request content before the upstream call. When the same policy is present in the chain, the gateway also uses it to redact buffered responses on the way back. In practical terms, sensitive identifiers can be replaced before they leave your boundary, and if the model mirrors those identifiers back in the response, the gateway can redact them there too.

Use this page when

You need to stop emails, SSNs, phone numbers, card data, or internal identifiers from reaching the provider in raw form.
You want one gateway policy to cover both request sanitization and response redaction behavior.
You are deciding between redaction and hard blocking.

Primary audience

Primary: Technical Engineers
Secondary: Technical Leaders, AI Agents

The problem

LLM traffic often includes more sensitive data than teams expect. A support prompt can include names and account numbers. A healthcare workflow can include MRNs or address fragments. An internal assistant can leak employee identifiers or private URLs. Once those values are sent upstream, later masking in logs does not undo the disclosure.

That is why PII handling has to be inline. The right question is not “Can we clean this up after the call?” It is “Should the provider ever receive this string in the first place?” For many workloads, the answer is no.

The other complication is that there is no single PII shape. General identifiers, PCI-style payment data, healthcare-style text, and company-specific IDs all need different treatment. A usable solution has to let you start with common built-in detectors and then extend the detection surface with custom regexes for the identifiers your business actually uses.

The solution

pii-detector is Keeptrusts’ shared redaction control. On the request path, it scans the joined request text and returns either a redaction verdict or a block verdict, depending on the action setting. On the response path, when the policy is active in the chain, the gateway can use the same redaction configuration to sanitize buffered output before returning it.
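The request-path decision described above can be sketched in a few lines of Python. This is a minimal illustration, not the Keeptrusts implementation: the DETECTORS table, the Verdict type, the "allow" decision name, and the bracketed marker text are all assumptions made for the example.

```python
import re
from dataclasses import dataclass, field

# Hypothetical detector table; the real built-in detectors are not listed here.
DETECTORS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


@dataclass
class Verdict:
    decision: str  # "allow", "redact", or "block" (names assumed for this sketch)
    text: str      # text to forward upstream; empty when the request is blocked
    matched: list = field(default_factory=list)


def evaluate_request(messages, action="redact"):
    # Join all message contents so each detector sees the whole request text.
    joined = "\n".join(m["content"] for m in messages)
    matched = [name for name, rx in DETECTORS.items() if rx.search(joined)]
    if not matched:
        return Verdict("allow", joined)
    if action == "block":
        # Hard block: nothing is forwarded upstream.
        return Verdict("block", "", matched)
    for name in matched:
        joined = DETECTORS[name].sub(f"[{name.upper()}]", joined)
    return Verdict("redact", joined, matched)
```

With action set to redact, a message such as "Contact john@example.com, SSN 123-45-6789." would be forwarded with both values replaced by markers. With action set to block, the same message produces a verdict with no forwardable text at all.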

The policy gives you a few important switches.

action controls whether matches are redacted or cause a hard request-phase block.
healthcare_mode adds HIPAA-style heuristics such as names, addresses, MRNs, and related medical identifiers.
pci_mode turns on payment-card-oriented detections.
detect_patterns adds your own regexes, and those custom matches are recorded as generic_id.
redaction.marker_format, include_metadata, and custom_markers control how the replacements appear and how much audit detail is retained.
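To make the interaction between these switches concrete, here is a rough Python sketch of how a detector set could be assembled from such a config. The BASE, PCI, and HEALTHCARE tables, their regexes, and the default "[LABEL]" marker shape are assumptions for illustration only. The one behavior taken from this page is that every custom pattern is recorded under the single label generic_id.

```python
import re

# Hypothetical detector groups; labels and regexes are illustrative only.
BASE = {"email": r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", "ssn": r"\b\d{3}-\d{2}-\d{4}\b"}
PCI = {"card_number": r"\b(?:\d{4}[ -]?){3}\d{4}\b"}
HEALTHCARE = {"mrn": r"\bMRN[: ]?\d{6,10}\b"}


def build_detectors(config):
    """Return (label, compiled regex) pairs enabled by a pii-detector config."""
    patterns = dict(BASE)
    if config.get("pci_mode"):
        patterns.update(PCI)
    if config.get("healthcare_mode"):
        patterns.update(HEALTHCARE)
    detectors = [(label, re.compile(rx)) for label, rx in patterns.items()]
    # Every custom pattern shares the single label generic_id.
    detectors += [("generic_id", re.compile(rx))
                  for rx in config.get("detect_patterns", [])]
    return detectors


def marker_for(label, config):
    # custom_markers overrides the (assumed) default marker for a label.
    custom = config.get("redaction", {}).get("custom_markers", {})
    return custom.get(label, f"[{label.upper()}]")


def redact(text, config):
    for label, rx in build_detectors(config):
        marker = marker_for(label, config)
        # A function replacement avoids backslash handling in marker text.
        text = rx.sub(lambda _m, mk=marker: mk, text)
    return text
```

Because all custom patterns collapse into one label in this model, one custom_markers entry for generic_id changes the replacement text for every pattern listed in detect_patterns.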

The useful point here is consistency. You do not need one mechanism for the request body and another for the returned response. You configure the redaction behavior once and apply it at the gateway boundary.
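One plausible reason response redaction operates on buffered output is that pattern matching over individual streamed chunks can miss a value that straddles a chunk boundary. The sketch below demonstrates that general point with a single illustrative SSN regex. It is not a description of Keeptrusts internals, and the function names are made up for the example.

```python
import re

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # illustrative pattern only


def redact_per_chunk(chunks):
    # Redacting each chunk on its own misses values split across chunks.
    return "".join(SSN.sub("[SSN]", chunk) for chunk in chunks)


def redact_buffered(chunks):
    # Joining the chunks first lets the pattern see the complete value.
    return SSN.sub("[SSN]", "".join(chunks))


# A model response that arrives in two pieces, splitting the SSN.
chunks = ["The SSN on file is 123-4", "5-6789, please confirm."]
leaky = redact_per_chunk(chunks)
clean = redact_buffered(chunks)
```

In this example the per-chunk result still contains the raw value, while the buffered result does not.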

Implementation

This is a solid starting point for a general business workflow: redact common identifiers, keep PCI checks on, add one custom business ID pattern, and retain metadata so operators can review what was changed.

pack:
  name: pii-protection
  version: "1.0.0"
  enabled: true

providers:
  targets:
    - id: openai-primary
      provider: openai
      model: gpt-5.4-mini-mini
      secret_key_ref:
        env: OPENAI_API_KEY

policies:
  chain:
    - prompt-injection
    - pii-detector
    - audit-logger

policy:
  prompt-injection:
    use_embedding: false
    detection:
      attack_patterns:
        - "ignore.*previous.*instructions"
    encoding:
      decode_base64: true
      normalize_unicode: true
      detect_homoglyphs: true
    boundaries:
      enforce_delimiters: true
      reject_fake_boundaries: true

  pii-detector:
    action: redact
    healthcare_mode: false
    pci_mode: true
    detect_patterns:
      - 'EMP-\d{6}'
    redaction:
      marker_format: label
      include_metadata: true
      custom_markers:
        generic_id: "[REDACTED-ID]"

  audit-logger:
    retention_days: 365

Validate and run it:

kt policy lint --file policy-config.yaml
kt gateway run --policy-config policy-config.yaml --listen 0.0.0.0:41002

Then send a request that includes both standard and custom identifiers:

curl -s http://localhost:41002/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4-mini-mini",
    "messages": [
      {
        "role": "user",
        "content": "Summarize this ticket: employee EMP-123456 reported that john@example.com used SSN 123-45-6789 in a support form."
      }
    ]
  }' | jq .

If you want to inspect what happened in more detail, tail the event stream after the request:

kt events tail --last 1 --verbose

That combination gives you both functional and operational evidence. Functionally, the request can be sanitized before the upstream call. Operationally, the gateway records which redaction rule fired and what kind of content was modified. If you use custom regexes, remember that the detector records them as generic_id, so the right place to customize the replacement label is redaction.custom_markers.generic_id.
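As a rough mental model of what retained metadata might contain, the sketch below redacts the sample prompt from the curl request and records one event per replacement. The event fields (rule, marker, length) are hypothetical and are not the gateway's actual event schema. The point is that an audit record can name the rule that fired without storing the raw value.

```python
import re

# Illustrative rules; EMP-\d{6} is the custom pattern from the config above.
RULES = [
    ("email", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("ssn", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("generic_id", re.compile(r"EMP-\d{6}")),
]
MARKERS = {"generic_id": "[REDACTED-ID]"}


def redact_with_metadata(text):
    events = []
    for label, rx in RULES:
        marker = MARKERS.get(label, f"[{label.upper()}]")

        def replace(match, label=label, marker=marker):
            # Record the rule and span length, never the matched value itself.
            events.append({"rule": label, "marker": marker,
                           "length": len(match.group())})
            return marker

        text = rx.sub(replace, text)
    return text, events


prompt = ("Summarize this ticket: employee EMP-123456 reported that "
          "john@example.com used SSN 123-45-6789 in a support form.")
clean, events = redact_with_metadata(prompt)
```

Here clean holds the sanitized prompt, and events lists three replacements in the order the rules ran: email, ssn, then generic_id.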

If a route should never pass matched PII upstream under any condition, change action from redact to block. That is a stricter posture, but it should be driven by workflow requirements rather than habit. Many teams start with redaction because it preserves utility while still removing the sensitive spans.

Results and impact

The immediate impact is simple: less raw sensitive data leaves your perimeter. That matters for privacy, compliance, and plain operational discipline. You are not asking downstream providers to be the first line of protection for information that should already have been sanitized.

The second impact is consistency. A shared redaction control at the gateway means every integrated app gets the same treatment. Support automation, internal tooling, chat workflows, and agent frameworks all inherit the same detector behavior when they route through the gateway.

The third impact is reviewability. Because Keeptrusts can retain redaction metadata, teams can verify that the policy is doing what they intended instead of relying on a black-box claim that “the data is protected.” That is much easier to defend in operations and audits.

Key takeaways

pii-detector is the main Keeptrusts control for request-side PII sanitization.
With action: redact, the gateway can replace sensitive spans before the provider sees them.
The same policy also powers buffered response redaction when it is present in the chain.
healthcare_mode, pci_mode, and detect_patterns let you tune the detector for real workloads.
Start with redaction and metadata, then move to hard blocking only where the workflow requires it.
