PII Detection and Automatic Redaction in LLM Requests
Keeptrusts handles PII detection and automatic redaction by running pii-detector on request content before the upstream call. When the same policy is present in the chain, the gateway also uses it to redact buffered responses on the way back. In practice, sensitive identifiers can be replaced before they leave your boundary, and if the model mirrors those identifiers back in the response, the gateway can redact them there too.
Use this page when
- You need to stop emails, SSNs, phone numbers, card data, or internal identifiers from reaching the provider in raw form.
- You want one gateway policy to cover both request sanitization and response redaction behavior.
- You are deciding between redaction and hard blocking.
Primary audience
- Primary: Technical Engineers
- Secondary: Technical Leaders, AI Agents
The problem
LLM traffic often includes more sensitive data than teams expect. A support prompt can include names and account numbers. A healthcare workflow can include MRNs or address fragments. An internal assistant can leak employee identifiers or private URLs. Once those values are sent upstream, later masking in logs does not undo the disclosure.
That is why PII handling has to be inline. The right question is not “Can we clean this up after the call?” It is “Should the provider ever receive this string in the first place?” For many workloads, the answer is no.
The other complication is that there is no single PII shape. General identifiers, PCI-style payment data, healthcare-style text, and company-specific IDs all need different treatment. A usable solution has to let you start with common built-in detectors and then extend the detection surface with custom regexes for the identifiers your business actually uses.
The solution
pii-detector is Keeptrusts’ shared redaction control. On the request path, it scans the joined request text and returns either a redaction verdict or a block verdict, depending on the action setting. On the response path, when the policy is active in the chain, the gateway can use the same redaction configuration to sanitize buffered output before returning it.
The policy gives you a few important switches.
- action controls whether matches are redacted or cause a hard request-phase block.
- healthcare_mode adds HIPAA-style heuristics such as names, addresses, MRNs, and related medical identifiers.
- pci_mode turns on payment-card-oriented detections.
- detect_patterns adds your own regexes, and those custom matches are recorded as generic_id.
- redaction.marker_format, include_metadata, and custom_markers control how the replacements appear and how much audit detail is retained.
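To make those switches concrete, here is a minimal sketch of how the detector might be tuned for a healthcare-style route. It uses only the keys named on this page. The chosen values (healthcare heuristics on, PCI checks off) are assumptions about a hypothetical workload, not a recommended default, and the fragment is meant to replace the pii-detector settings wherever they sit in your policy config.

```yaml
# Sketch only: a pii-detector fragment for a hypothetical healthcare route.
pii-detector:
  action: redact           # replace matched spans instead of blocking the request
  healthcare_mode: true    # adds HIPAA-style heuristics (names, addresses, MRNs)
  pci_mode: false          # assumption: this route carries no payment-card data
  redaction:
    marker_format: label   # the only marker_format value shown on this page
    include_metadata: true # keep audit detail about what was replaced
```

Keeping include_metadata on while you tune the modes makes it easier to review which detections actually fire on real traffic.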
The useful point here is consistency. You do not need one mechanism for the request body and another for the returned response. You configure the redaction behavior once and apply it at the gateway boundary.
Implementation
This is a solid starting point for a general business workflow: redact common identifiers, keep PCI checks on, add one custom business ID pattern, and retain metadata so operators can review what was changed.
pack:
  name: pii-protection
  version: "1.0.0"
  enabled: true
providers:
  targets:
    - id: openai-primary
      provider: openai
      model: gpt-5.4-mini-mini
      secret_key_ref:
        env: OPENAI_API_KEY
policies:
  chain:
    - prompt-injection
    - pii-detector
    - audit-logger
  policy:
    prompt-injection:
      use_embedding: false
      detection:
        attack_patterns:
          - "ignore.*previous.*instructions"
      encoding:
        decode_base64: true
        normalize_unicode: true
        detect_homoglyphs: true
      boundaries:
        enforce_delimiters: true
        reject_fake_boundaries: true
    pii-detector:
      action: redact
      healthcare_mode: false
      pci_mode: true
      detect_patterns:
        - 'EMP-\d{6}'
      redaction:
        marker_format: label
        include_metadata: true
        custom_markers:
          generic_id: "[REDACTED-ID]"
    audit-logger:
      retention_days: 365
Validate and run it:
kt policy lint --file policy-config.yaml
kt gateway run --policy-config policy-config.yaml --listen 0.0.0.0:41002
Then send a request that includes both standard and custom identifiers:
curl -s http://localhost:41002/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4-mini-mini",
    "messages": [
      {
        "role": "user",
        "content": "Summarize this ticket: employee EMP-123456 reported that john@example.com used SSN 123-45-6789 in a support form."
      }
    ]
  }' | jq .
If you want to inspect what happened in more detail, tail the event stream after the request:
kt events tail --last 1 --verbose
That combination gives you both functional and operational evidence. Functionally, the request can be sanitized before the upstream call. Operationally, the gateway records which redaction rule fired and what kind of content was modified. If you use custom regexes, remember that the detector records them as generic_id, so the right place to customize the replacement label is redaction.custom_markers.generic_id.
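As an illustration of the custom-regex behavior just described, the fragment below sketches a detector with several business identifiers. Only the EMP pattern comes from the starting configuration; the ACCT and TICKET formats are hypothetical placeholders for whatever identifiers your business uses. Because every custom match is recorded as generic_id, the sketch assumes a single custom_markers.generic_id entry labels all of them.

```yaml
# Sketch only: extending detect_patterns with additional business identifiers.
pii-detector:
  action: redact
  detect_patterns:
    - 'EMP-\d{6}'              # employee ID from the starting configuration
    - 'ACCT-\d{8}'             # hypothetical account number format
    - 'TICKET-[A-Z]{2}\d{5}'   # hypothetical ticket reference format
  redaction:
    marker_format: label
    include_metadata: true
    custom_markers:
      generic_id: "[REDACTED-ID]"  # shared label for all custom pattern matches
```

Run kt policy lint again after adding patterns, and send a test request that contains one sample of each identifier so you can confirm in the event stream that each pattern fires.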
If a route should never pass matched PII upstream under any condition, change action from redact to block. That is a stricter posture, but it should be driven by workflow requirements rather than habit. Many teams start with redaction because it preserves utility while still removing the sensitive spans.
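For a route that needs the stricter posture, the change is a single key. The fragment below is a minimal sketch that keeps the other settings from the starting configuration. The redaction settings are left out for brevity, and this page does not say how they interact with response handling when action is block, so treat that as something to verify on your own deployment.

```yaml
# Sketch only: the same detector switched from redaction to hard blocking.
pii-detector:
  action: block            # a match causes a hard request-phase block
  healthcare_mode: false
  pci_mode: true
  detect_patterns:
    - 'EMP-\d{6}'          # custom matches now block the request as well
```

Scoping this variant to the specific routes that require it, rather than applying it everywhere, keeps the remaining workloads usable under redaction.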
Results and impact
The immediate impact is simple: less raw sensitive data leaves your perimeter. That matters for privacy, compliance, and plain operational discipline. You are not asking downstream providers to be the first line of protection for information that should already have been sanitized.
The second impact is consistency. A shared redaction control at the gateway means every integrated app gets the same treatment. Support automation, internal tooling, chat workflows, and agent frameworks all inherit the same detector behavior when they route through the gateway.
The third impact is reviewability. Because Keeptrusts can retain redaction metadata, teams can verify that the policy is doing what they intended instead of relying on a black-box claim that “the data is protected.” That is much easier to defend in operations and audits.
Key takeaways
- pii-detector is the main Keeptrusts control for request-side PII sanitization.
- With action: redact, the gateway can replace sensitive spans before the provider sees them.
- The same policy also powers buffered response redaction when it is present in the chain.
- healthcare_mode, pci_mode, and detect_patterns let you tune the detector for real workloads.
- Start with redaction and metadata, then move to hard blocking only where the workflow requires it.