Tutorial: Setting Up PII Redaction

This tutorial shows you how to configure the Keeptrusts gateway to automatically detect and redact personally identifiable information (PII) before it reaches your LLM provider.

Use this page when

  • You are configuring the gateway to detect and redact PII (email, phone, SSN, credit card) from LLM traffic.
  • You want to control redaction format and audit metadata using the current pii-detector fields.
  • You need to extend built-in detection with healthcare, PCI, or custom regex patterns.
  • You are verifying that redaction markers replace real data before it reaches the provider.

Primary audience

  • Primary: Platform engineers and privacy teams implementing data protection at the AI gateway
  • Secondary: Compliance officers verifying PII handling; developers whose apps send user data through LLMs

Prerequisites

  • kt CLI installed (see the first-run tutorial)
  • An OpenAI-compatible API key exported as OPENAI_API_KEY
  • curl and jq installed

Step 1: Create the Policy Configuration

Create policy-config.yaml with a pii-detector policy set to redact mode:

policy-config.yaml
pack:
  name: pii-redaction
  version: 0.1.0
  enabled: true

providers:
  targets:
    - id: openai-primary
      provider: openai
      model: gpt-4o-mini
      base_url: https://api.openai.com
      secret_key_ref:
        env: OPENAI_API_KEY

policies:
  chain:
    - pii-detector
    - audit-logger

policy:
  pii-detector:
    action: redact
    healthcare_mode: false
    pci_mode: true
    detect_patterns: []
    redaction:
      marker_format: label
      include_metadata: true
      preserve_length: false
      custom_markers: {}

  audit-logger:
    retention_days: 30

This configuration redacts built-in PII categories such as email addresses, phone numbers, SSNs, and IP addresses. Because pci_mode is enabled, it also redacts credit card numbers and related PCI data.

Step 2: Validate and Start the Gateway

kt policy lint --file policy-config.yaml
kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml

Expected startup output:

INFO keeptrusts::gateway Loaded declarative config pii-redaction@0.1.0
INFO keeptrusts::gateway Gateway ready

Step 3: Test with PII-Containing Input

Open a new terminal and send a request that contains PII:

curl -s http://localhost:41002/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Draft an email to John Smith at john.smith@example.com about his account 4111-1111-1111-1111. His phone is 555-867-5309 and SSN is 123-45-6789."
      }
    ]
  }' | jq '.choices[0].message.content'

Before (what the user sent)

Draft an email to John Smith at john.smith@example.com about his account
4111-1111-1111-1111. His phone is 555-867-5309 and SSN is 123-45-6789.

After (what the LLM provider received)

Draft an email to John Smith at [EMAIL] about his account
[CREDIT_CARD]. His phone is [PHONE] and SSN is [SSN].

The gateway redacted the detected PII spans before forwarding the request upstream. The same redaction engine also sanitizes matching output content before it reaches the caller.
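To make the before/after transformation concrete, the following Python sketch mimics label-style redaction on the sample prompt. The regular expressions and the redact_labels helper are simplified stand-ins invented for illustration. They are not the gateway's detectors, which are likely to be more thorough.

```python
import re

# Simplified, illustrative patterns. These are NOT the gateway's real rules.
# Each entry maps a marker label to a regex for one PII category.
PATTERNS = [
    ("EMAIL", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("CREDIT_CARD", re.compile(r"\b\d{4}-\d{4}-\d{4}-\d{4}\b")),
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("PHONE", re.compile(r"\b\d{3}-\d{3}-\d{4}\b")),
]


def redact_labels(text: str) -> str:
    """Replace every match of each pattern with a [LABEL] marker."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text


prompt = (
    "Draft an email to John Smith at john.smith@example.com about his "
    "account 4111-1111-1111-1111. His phone is 555-867-5309 and SSN is "
    "123-45-6789."
)

print(redact_labels(prompt))
```

Note that the name "John Smith" passes through unchanged, both in this sketch and in the expected output above, because none of the listed categories covers personal names.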

Step 4: Inspect the Active Redaction Policy

Inspect the running config:

curl -s http://localhost:41002/keeptrusts/config | jq .

Look for:

  • pack.name: pii-redaction
  • a pii-detector policy block
  • the audit-logger policy in the chain
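
The three checks above can be scripted. This is a minimal sketch that assumes the /keeptrusts/config endpoint returns JSON whose keys mirror the YAML file (pack, policy, policies.chain). That response shape is an assumption, so adjust the lookups to whatever the endpoint actually returns. The check_config helper is hypothetical.

```python
from typing import Any


def check_config(cfg: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means all checks passed.

    Assumes the running config mirrors the layout of policy-config.yaml.
    """
    problems = []
    if cfg.get("pack", {}).get("name") != "pii-redaction":
        problems.append("pack.name is not pii-redaction")
    if "pii-detector" not in cfg.get("policy", {}):
        problems.append("no pii-detector policy block")
    if "audit-logger" not in cfg.get("policies", {}).get("chain", []):
        problems.append("audit-logger missing from policies.chain")
    return problems


# Hand-written sample that mirrors the YAML from Step 1.
sample = {
    "pack": {"name": "pii-redaction", "version": "0.1.0", "enabled": True},
    "policies": {"chain": ["pii-detector", "audit-logger"]},
    "policy": {
        "pii-detector": {"action": "redact", "pci_mode": True},
        "audit-logger": {"retention_days": 30},
    },
}

print(check_config(sample) or "all checks passed")
```

To run it against a live gateway, save the curl output to a file and load it with json.load before passing it to check_config.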

Step 5: Tune Detection Scope and Marker Style

The current pii-detector schema lets you tune behavior with explicit fields instead of legacy entities, apply_to, or sensitivity knobs.

Common tuning fields

| Field | What it changes | Typical use |
| --- | --- | --- |
| pci_mode | Enables credit card, CVV, and cardholder detection | Payment and checkout traffic |
| healthcare_mode | Enables MRN, insurance ID, and NPI detection | Clinical and healthcare workloads |
| detect_patterns | Adds custom regex-based identifiers | Employee IDs, customer codes, case numbers |
| redaction.marker_format | Changes replacement style | label, asterisk, or partial |

Example: extend detection for healthcare and internal employee IDs.

policy:
  pii-detector:
    action: redact
    healthcare_mode: true
    pci_mode: true
    detect_patterns:
      - '(?P<employee_id>EMP-\d{6})'
    redaction:
      marker_format: partial
      include_metadata: true
      preserve_length: true
      custom_markers:
        MRN: "[MEDICAL-RECORD-REDACTED]"
        employee_id: "[EMPLOYEE-ID]"
pack:
  name: pii-redaction-setup-example-2
  version: 1.0.0
  enabled: true
policies:
  chain:
    - pii-detector

With that configuration:

  • built-in healthcare identifiers are redacted inline
  • PCI data is still protected
  • EMP-123456 style identifiers are treated as redaction targets
  • partial masking preserves more visual context when appropriate
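
Before adding a pattern to detect_patterns, it can help to try it locally. The sketch below uses Python's re module, which accepts the same (?P<name>...) named-group syntax, to show what the employee ID pattern does and does not match. Mapping the group name to a custom_markers entry is one reading of the example config, and the redact_custom helper is hypothetical. The gateway's regex engine may differ from Python's in details such as \b support.

```python
import re

# Marker lookup keyed by capture-group name, mirroring custom_markers above.
CUSTOM_MARKERS = {"employee_id": "[EMPLOYEE-ID]"}

# The pattern from the example config, and a stricter variant with word
# boundaries (an illustrative addition, not part of the original config).
PATTERN = re.compile(r"(?P<employee_id>EMP-\d{6})")
STRICT_PATTERN = re.compile(r"(?P<employee_id>\bEMP-\d{6}\b)")


def redact_custom(text: str, pattern: re.Pattern = PATTERN) -> str:
    """Replace each match with the marker registered for its group name."""

    def replace(match: re.Match) -> str:
        name = match.lastgroup or "custom"
        return CUSTOM_MARKERS.get(name, f"[{name.upper()}]")

    return pattern.sub(replace, text)


print(redact_custom("Escalate to EMP-123456 today"))
print(redact_custom("EMP-12345 has only five digits"))
print(redact_custom("EMP-1234567"))
print(redact_custom("EMP-1234567", STRICT_PATTERN))
```

The third call shows a subtlety: without boundaries, the pattern matches the first six digits of a longer number and leaves a trailing digit behind. If identifiers are always exactly six digits, anchoring the pattern avoids partial matches.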

Step 6: Switch to Block Mode for Hard-Stop Traffic

If some traffic must never be forwarded when PII is present, switch the policy to action: block.

policy:
  pii-detector:
    action: block
    redaction:
      marker_format: label
      include_metadata: true
pack:
  name: pii-redaction-setup-example-3
  version: 1.0.0
  enabled: true
policies:
  chain:
    - pii-detector

When this mode is active, the gateway rejects the request with a policy-violation error instead of sanitizing and forwarding it.
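
Client code needs to tell a policy block apart from an ordinary failure. The following sketch uses only the Python standard library. It assumes a blocked request comes back as HTTP 409, which is inferred from the troubleshooting table on this page rather than from a documented contract, and it makes no assumption about the error body. The build_request, send, and outcome helpers are hypothetical names.

```python
import json
import urllib.error
import urllib.request

GATEWAY_URL = "http://localhost:41002/v1/chat/completions"
BLOCKED_STATUS = 409  # assumption: inferred from the troubleshooting table


def build_request(prompt: str, model: str = "gpt-4o-mini") -> urllib.request.Request:
    """Build the same POST request that the curl examples send."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    )
    return urllib.request.Request(
        GATEWAY_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(prompt: str) -> tuple[int, str]:
    """Send the prompt and return (status, raw body), even for HTTP errors."""
    try:
        with urllib.request.urlopen(build_request(prompt), timeout=30) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", errors="replace")


def outcome(status: int) -> str:
    """Classify a status code as blocked, forwarded, or some other error."""
    if status == BLOCKED_STATUS:
        return "blocked"
    if 200 <= status < 300:
        return "forwarded"
    return "error"
```

With the gateway running in block mode, calling send with a prompt that contains an SSN and passing the returned status to outcome should yield "blocked" if the 409 assumption holds.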

Step 7: Optionally Verify Redaction in Decision Events

If your gateway reports into a Keeptrusts control plane, tail recent decision events:

kt events tail --json --limit 5 --event-type decision

Look for decision data that confirms the request was modified by pii-detector and that redaction metadata was captured when include_metadata: true is enabled.
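
The event schema is not described on this page, so the sketch below avoids relying on field names. It assumes only that kt events tail --json prints one JSON object per line, and it keeps any event whose serialized form mentions pii-detector. The events_mentioning helper and the sample field names are invented for illustration.

```python
import json
from typing import Iterable, Iterator


def events_mentioning(
    lines: Iterable[str], policy: str = "pii-detector"
) -> Iterator[dict]:
    """Yield JSON-object events that mention the policy name anywhere."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON output such as log lines
        if isinstance(event, dict) and policy in json.dumps(event):
            yield event


# Hypothetical sample lines; real events will have a different shape.
sample_lines = [
    '{"policy": "pii-detector", "action": "redact"}',
    "INFO this line is not JSON",
    '{"policy": "audit-logger"}',
]

for event in events_mentioning(sample_lines):
    print(json.dumps(event, indent=2))
```

To use it on real output, pipe the kt events tail command into a script that passes sys.stdin to events_mentioning.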

Step 8: Test Output-Phase Redaction

The LLM might generate PII in its response. Verify output redaction works:

curl -s http://localhost:41002/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "Generate a sample customer support email with realistic contact details."}
    ]
  }' | jq '.choices[0].message.content'

Any PII the detector recognizes in the model's response is replaced with redaction markers before the response reaches the caller.
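
Because model output varies between runs, reading one response by eye is a weak check. A small scanner can flag responses that still contain raw-looking PII. The patterns below are illustrative approximations chosen for this sketch, not the gateway's detection rules, and find_leaks is a hypothetical helper.

```python
import re

# Rough patterns for raw PII that should no longer appear in a response.
RAW_PII = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "credit_card": re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def find_leaks(text: str) -> list[str]:
    """Return the sorted names of PII categories still visible in the text."""
    return sorted(name for name, rx in RAW_PII.items() if rx.search(text))


redacted = "Contact [EMAIL] or call [PHONE]."
unredacted = "Contact jane@example.com or call 555-867-5309."

print(find_leaks(redacted))
print(find_leaks(unredacted))
```

An empty list suggests the response is clean with respect to these rough patterns only. A non-empty list is a prompt to inspect the response and the policy configuration.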

For AI systems

  • Canonical terms: Keeptrusts gateway, pii-detector, redaction, healthcare_mode, pci_mode, detect_patterns, redaction.marker_format, custom_markers.
  • Config fields: policies.chain[], policy.pii-detector.action, policy.pii-detector.healthcare_mode, policy.pii-detector.pci_mode, policy.pii-detector.detect_patterns[], policy.pii-detector.redaction.*.
  • CLI commands: kt gateway run, kt policy lint --file policy-config.yaml, kt events tail --json --limit 5 --event-type decision.
  • Best next pages: Prompt Injection Defense, DLP & Data Classification, Custom Policy Chains.

For engineers

  • Prerequisites: kt CLI, OPENAI_API_KEY exported, curl and jq.
  • Validate: kt policy lint --file policy-config.yaml before starting the gateway.
  • Test: send a request with known PII and verify the content is replaced with labels such as [EMAIL], [PHONE], or [SSN].
  • Scope control: use pci_mode, healthcare_mode, and detect_patterns instead of legacy entity and sensitivity lists.
  • Redaction output: choose label, asterisk, or partial with redaction.marker_format.

For leaders

  • PII redaction prevents personal data from being sent to third-party LLM providers, supporting GDPR and privacy-by-design requirements.
  • Redaction (not blocking) keeps the workflow functional while removing sensitive tokens.
  • When include_metadata is enabled, event logs record which entities were detected and redacted, which supports compliance audits.
  • Can be combined with DLP policies for layered data protection across multiple data categories.

Troubleshooting

| Symptom | Cause | Fix |
| --- | --- | --- |
| Credit card data not detected | pci_mode disabled | Set pci_mode: true |
| MRNs or NPIs not detected | healthcare_mode disabled | Set healthcare_mode: true |
| Internal IDs not redacted | Missing custom regex | Add a named pattern to detect_patterns |
| Marker style is too noisy or too opaque | redaction.marker_format not tuned | Switch between label, asterisk, or partial |
| Gateway returns 409 unexpectedly | Block policy catching benign input | Check which policy triggered via kt events tail |