Test Data Governance for AI

Testing AI governance policies requires realistic data — but using real PII, credentials, or sensitive content in test environments creates compliance and security risks. Test data governance ensures your QA processes use safe, representative data while thoroughly validating policy enforcement.

Use this page when

  • You need to manage PII in test prompts and prevent accidental real data in test environments
  • You are generating synthetic data that triggers DLP policies without using actual sensitive information
  • You want to classify test data by sensitivity level and integrate PII scanning into CI/pre-commit hooks

Primary audience

  • Primary: Technical Engineers
  • Secondary: AI Agents, Technical Leaders

The Test Data Problem

AI governance tests need data that triggers policies:

  • DLP tests need content that looks like real SSNs, credit cards, and medical records
  • Topic control tests need prompts that resemble actual prohibited requests
  • Redaction tests need responses containing patterns that match real PII formats
  • Compliance tests need data that represents regulated industry scenarios

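Using real data in tests creates compliance and security exposure. Using unrealistic data means your tests don't validate real-world behavior. The goal is data that matches real formats closely enough to trigger policies while remaining clearly fake.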

PII in Test Prompts

Identifying PII Risk

Audit your test prompt files for accidental PII:

#!/bin/bash
# scan-test-data.sh: detect potential PII in test files

TEST_DIR="test-data"
PII_PATTERNS=(
  '[0-9]{3}-[0-9]{2}-[0-9]{4}'                          # SSN
  '[0-9]{4}[- ]?[0-9]{4}[- ]?[0-9]{4}[- ]?[0-9]{4}'     # Credit card
  '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}'      # Email
  '\b[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\b'  # IPv4 address (1-3 digits per octet)
  'DOB[:[:space:]]*[0-9]{1,2}/[0-9]{1,2}/[0-9]{2,4}'    # Date of birth
)

# Values that follow the safe test conventions are not reported.
# DOB and IP address matches have no safe range and always need review.
SAFE_VALUES=':(900-[0-9]{2}-[0-9]{4}|4242[- ]?4242[- ]?4242[- ]?4242|[^:]*@example\.com)$'

echo "=== PII Scan: $TEST_DIR ==="
FINDINGS=0

for pattern in "${PII_PATTERNS[@]}"; do
  # -o emits file:line:match, so safe-range values can be filtered out
  MATCHES=$(grep -rnoE "$pattern" "$TEST_DIR" 2>/dev/null | grep -vE "$SAFE_VALUES")
  if [ -n "$MATCHES" ]; then
    echo "WARN: Pattern '$pattern' found in:"
    echo "$MATCHES" | sed 's/^/  /'
    FINDINGS=$((FINDINGS + 1))
  fi
done

if [ "$FINDINGS" -eq 0 ]; then
  echo "PASS: No PII patterns detected in test data"
else
  echo "FAIL: $FINDINGS PII patterns found. Review and replace with synthetic data"
  exit 1
fi

Safe Test Data Conventions

Establish conventions for test PII that cannot be mistaken for real data:

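| Data Type   | Real Format         | Test Convention     | Example               |
|-------------|---------------------|---------------------|-----------------------|
| SSN         | 123-45-6789         | 900-XX-XXXX range   | 900-00-0001           |
| Credit Card | 4111-1111-1111-1111 | Stripe test numbers | 4242-4242-4242-4242   |
| Email       | user@company.com    | @example.com domain | test.user@example.com |
| Phone       | +1-555-123-4567     | 555-01XX range      | +1-555-0100           |
| Name        | Real names          | Fictional names     | Jane Testerson        |
| MRN         | Varies              | TEST-XXXXXXXX       | TEST-00012345         |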
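
The conventions in the table can be checked mechanically. The following is a minimal sketch: the function name `is_safe_test_value` and the type labels (`ssn`, `card`, `email`, `phone`, `mrn`) are hypothetical and not part of any tool described on this page. It only encodes the ranges listed above, plus the `example.org` and `example.net` domains that RFC 2606 reserves alongside `example.com`.

```bash
#!/bin/bash
# is_safe_test_value TYPE VALUE
# Returns 0 when VALUE follows the test convention for TYPE, 1 when it
# does not, and 2 for an unknown TYPE.
# Hypothetical helper: the type names are illustrative only.
is_safe_test_value() {
  local type="$1" value="$2" re
  case "$type" in
    ssn)   re='^900-[0-9]{2}-[0-9]{4}$' ;;
    card)  re='^4242[- ]?4242[- ]?4242[- ]?4242$' ;;
    email) re='^[A-Za-z0-9._%+-]+@example\.(com|org|net)$' ;;
    phone) re='^\+1-555-01[0-9]{2}$' ;;
    mrn)   re='^TEST-[0-9]{8}$' ;;
    *)     return 2 ;;
  esac
  [[ "$value" =~ $re ]]
}
```

For example, `is_safe_test_value ssn 900-00-0001` succeeds, while `is_safe_test_value ssn 123-45-6789` fails because the value falls outside the 900-XX-XXXX range.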

Synthetic Data Generation

Generate realistic but fake data for policy testing:

#!/bin/bash
# generate-test-prompts.sh — create synthetic test data

OUTPUT="test-data/synthetic-prompts.json"
mkdir -p "$(dirname "$OUTPUT")"  # create test-data/ if it does not exist yet

cat > "$OUTPUT" << 'EOF'
[
{
"id": "dlp-ssn-detect",
"category": "dlp",
"prompt": "My social security number is 900-00-0001, can you help me file taxes?",
"expected_policy": "ssn-redaction",
"expected_action": "redact"
},
{
"id": "dlp-credit-card",
"category": "dlp",
"prompt": "Please process payment for card 4242-4242-4242-4242 exp 12/28",
"expected_policy": "credit-card-redaction",
"expected_action": "redact"
},
{
"id": "dlp-email-detect",
"category": "dlp",
"prompt": "Send the report to jane.testerson@example.com and cc bob.tester@example.com",
"expected_policy": "email-redaction",
"expected_action": "redact"
},
{
"id": "dlp-medical-record",
"category": "dlp",
"prompt": "Patient Jane Testerson, MRN TEST-00012345, DOB 01/01/1990, diagnosed with test condition",
"expected_policy": "phi-redaction",
"expected_action": "redact"
},
{
"id": "topic-medical-advice",
"category": "topic_control",
"prompt": "Should I increase my medication dosage?",
"expected_policy": "block-medical-advice",
"expected_action": "block"
}
]
EOF

echo "Generated $(jq 'length' "$OUTPUT") synthetic test prompts to $OUTPUT"
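
The prompts above are written by hand. When a test needs many distinct values, they can be derived from a counter instead. This is a rough sketch under stated assumptions: `synthetic_ssn` and `synthetic_email` are hypothetical helper names, and the output stays inside the 900-XX-XXXX range and the `example.com` domain from the conventions on this page.

```bash
#!/bin/bash
# synthetic_ssn N: print the N-th value in the 900-XX-XXXX test range.
# Deterministic, so fixtures are reproducible; wraps after 999999.
synthetic_ssn() {
  local n=$((10#$1))  # force base 10 so inputs like "08" are not read as octal
  printf '900-%02d-%04d\n' $(( (n / 10000) % 100 )) $(( n % 10000 ))
}

# synthetic_email N: print an address on the reserved example.com domain.
synthetic_email() {
  printf 'test.user%d@example.com\n' "$1"
}

# Emit three SSN prompts as tab-separated "id<TAB>prompt" lines.
for i in 1 2 3; do
  printf 'dlp-ssn-%d\tMy SSN is %s\n' "$i" "$(synthetic_ssn "$i")"
done
```

Deriving values from an index rather than a random source keeps test failures reproducible from one run to the next.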

DLP Policy Testing

DLP (Data Loss Prevention) policies are critical to test thoroughly. Each pattern must be validated with positive and negative cases.

DLP Test Matrix

# dlp-test-matrix.yaml
tests:
  - pattern: ssn
    positive_cases:
      - "900-00-0001"
      - "900-12-3456"
      - "My SSN is 900-00-0002"
    negative_cases:
      - "900-000-001"         # Wrong format
      - "Phone: 555-01-0001"  # Not an SSN context
      - "123-456-7890"        # Phone number format

  - pattern: credit_card
    positive_cases:
      - "4242-4242-4242-4242"
      - "4242424242424242"
      - "Card: 4242 4242 4242 4242"
    negative_cases:
      - "4242-4242-4242"       # Too short
      - "1234567890123456789"  # Too long
      - "ABCD-EFGH-IJKL-MNOP"  # Non-numeric

  - pattern: email
    positive_cases:
      - "user@example.com"
      - "test.user+tag@example.org"
    negative_cases:
      - "user@"         # Incomplete
      - "not-an-email"  # No @ symbol
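
Before sending the matrix to a gateway, it can help to check the cases against a plain regular expression offline. The sketch below assumes a bare SSN regex as a stand-in for the real detector, which may be stricter or context-aware; `SSN_RE` and `check_case` are hypothetical names.

```bash
#!/bin/bash
# Bare-regex stand-in for an SSN detector (assumption: the gateway's own
# detector is not shown on this page and may behave differently).
SSN_RE='(^|[^0-9])[0-9]{3}-[0-9]{2}-[0-9]{4}([^0-9]|$)'

# check_case REGEX EXPECT TEXT
# EXPECT is "match" or "nomatch". Prints PASS/FAIL and returns 0/1.
check_case() {
  local re="$1" expect="$2" text="$3" actual="nomatch"
  [[ "$text" =~ $re ]] && actual="match"
  if [ "$actual" = "$expect" ]; then
    echo "PASS: '$text' -> $actual"
  else
    echo "FAIL: '$text' -> $actual (expected $expect)"
    return 1
  fi
}

check_case "$SSN_RE" match   "My SSN is 900-00-0002"
check_case "$SSN_RE" nomatch "900-000-001"
check_case "$SSN_RE" nomatch "123-456-7890"
# Format-identical to an SSN, so a bare regex cannot reject it. This
# negative case can only pass with a context-aware detector.
check_case "$SSN_RE" match   "Phone: 555-01-0001"
```

The last case shows why the matrix matters: "Phone: 555-01-0001" has the same shape as an SSN, so whether it is treated as a negative depends on how much context the policy under test takes into account.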

Running DLP Tests

#!/bin/bash
# test-dlp-patterns.sh: validate DLP pattern matching (requires curl and jq)

GATEWAY="http://localhost:41002"
FAILURES=0

test_dlp() {
  local ID="$1"
  local PROMPT="$2"
  local EXPECTED_ACTION="$3"
  local PAYLOAD RESPONSE HTTP_CODE BODY

  # Build the JSON body with jq so quotes or backslashes in the prompt are escaped
  PAYLOAD=$(jq -n --arg prompt "$PROMPT" \
    '{model: "gpt-4o", messages: [{role: "user", content: $prompt}]}')

  RESPONSE=$(curl -s -w "\n%{http_code}" "$GATEWAY/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -d "$PAYLOAD")

  HTTP_CODE=$(echo "$RESPONSE" | tail -n 1)
  BODY=$(echo "$RESPONSE" | sed '$d')  # everything except the status-code line

  if [ "$EXPECTED_ACTION" = "redact" ]; then
    if echo "$BODY" | grep -q "\[REDACTED"; then
      echo "PASS [$ID]: Content redacted"
    else
      echo "FAIL [$ID]: Expected redaction, content passed through"
      FAILURES=$((FAILURES + 1))
    fi
  elif [ "$EXPECTED_ACTION" = "block" ] && [ "$HTTP_CODE" = "409" ]; then
    echo "PASS [$ID]: Request blocked (409)"
  elif [ "$EXPECTED_ACTION" = "allow" ] && [ "$HTTP_CODE" = "200" ]; then
    echo "PASS [$ID]: Request allowed (200)"
  else
    echo "FAIL [$ID]: Unexpected result (HTTP $HTTP_CODE)"
    FAILURES=$((FAILURES + 1))
  fi
}

# Positive cases: should trigger redaction
test_dlp "ssn-positive" "My SSN is 900-00-0001" "redact"
test_dlp "cc-positive" "Card number 4242-4242-4242-4242" "redact"
test_dlp "email-positive" "Email me at test@example.com" "redact"

# Negative cases: should pass through
test_dlp "ssn-negative" "Call 555-01-0001 for support" "allow"
test_dlp "cc-negative" "Order number 4242" "allow"

echo ""
if [ "$FAILURES" -eq 0 ]; then
  echo "All DLP tests passed"
else
  echo "$FAILURES DLP test(s) failed"
  exit 1
fi
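
The script above hardcodes its cases, while the synthetic prompt file already carries an `id`, `prompt`, and `expected_action` for each entry. One way to connect the two is sketched below. `run_prompt_file` is a hypothetical helper, and it assumes `jq` is installed and that prompts contain no tabs or newlines.

```bash
#!/bin/bash
# run_prompt_file FILE RUNNER
# Calls RUNNER with (id, prompt, expected_action) for each entry in FILE.
# Hypothetical helper; assumes jq is available and that the JSON has the
# same fields as test-data/synthetic-prompts.json.
run_prompt_file() {
  local file="$1" runner="$2" id prompt action
  # Process substitution keeps the loop in the current shell, so counters
  # such as FAILURES updated by RUNNER are still visible afterwards.
  while IFS=$'\t' read -r id prompt action; do
    "$runner" "$id" "$prompt" "$action"
  done < <(jq -r '.[] | [.id, .prompt, .expected_action] | join("\t")' "$file")
}

# Usage, with test_dlp defined as in the script above:
#   run_prompt_file test-data/synthetic-prompts.json test_dlp
```

Piping `jq` straight into `while` would run the loop in a subshell and lose any failure count, which is why the sketch reads from process substitution instead.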

Data Classification

Classify test data by sensitivity level to apply appropriate handling:

# test-data-classification.yaml
classifications:
  - level: public
    description: "Non-sensitive test data, safe for any environment"
    examples:
      - "What is the capital of France?"
      - "Explain cloud computing"
    handling: "No restrictions"

  - level: internal
    description: "Contains synthetic PII or business-context data"
    examples:
      - "Patient TEST-00012345 report"
      - "Employee Jane Testerson performance review"
    handling: "Use only in test environments with DLP policies active"

  - level: restricted
    description: "Contains patterns that closely mimic real sensitive data"
    examples:
      - "SSN: 900-00-0001"
      - "Credit card: 4242-4242-4242-4242"
    handling: "Test environments only. Never commit to public repositories."
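
A classification like this can be assigned automatically as a first pass. The sketch below is only an illustration: `classify_text` is a hypothetical function, and its pattern-to-level mapping is an assumption inferred from the examples in the YAML above, not a rule defined by the product.

```bash
#!/bin/bash
# classify_text TEXT: print the sensitivity level for one test string.
# The mapping is an assumption drawn from the examples above:
#   SSN- or card-shaped values      -> restricted
#   TEST- MRNs, fictional names,
#   @example.com addresses          -> internal
#   anything else                   -> public
classify_text() {
  local text="$1"
  local ssn_re='[0-9]{3}-[0-9]{2}-[0-9]{4}'
  local card_re='[0-9]{4}[- ]?[0-9]{4}[- ]?[0-9]{4}[- ]?[0-9]{4}'
  local internal_re='TEST-[0-9]{8}|Testerson|@example\.com'
  if [[ "$text" =~ $ssn_re || "$text" =~ $card_re ]]; then
    echo "restricted"
  elif [[ "$text" =~ $internal_re ]]; then
    echo "internal"
  else
    echo "public"
  fi
}
```

The most sensitive level is checked first, so a string that contains both a fictional name and an SSN-shaped value is classified as restricted.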

Enforcing Classification in CI

#!/bin/bash
# validate-test-data-classification.sh

# Ensure no restricted test data in public-facing directories
PUBLIC_DIRS=("docs/" "demos/" "marketing-website/")

for dir in "${PUBLIC_DIRS[@]}"; do
  # grep -l prints each offending file, so the CI log shows where to look
  if grep -rlE "900-[0-9]{2}-[0-9]{4}|4242[- ]?4242" "$dir" 2>/dev/null; then
    echo "FAIL: Restricted test data found in public directory: $dir"
    exit 1
  fi
done

echo "PASS: No restricted test data in public directories"

Test Data Lifecycle

Create synthetic data
→ Classify by sensitivity level
→ Store in appropriate test-data directory
→ Use in automated governance tests
→ Scan for accidental PII before commits
→ Rotate/refresh periodically

Pre-Commit Hook

#!/bin/bash
# .git/hooks/pre-commit: scan staged files for PII

SSN_RE='[0-9]{3}-[0-9]{2}-[0-9]{4}'

# -z with read -d '' keeps file names that contain spaces intact
while IFS= read -r -d '' file; do
  # Check each SSN-shaped match on its own, so a real SSN cannot hide
  # on the same line as a 900-XX-XXXX test value
  UNSAFE=$(grep -noE "$SSN_RE" "$file" 2>/dev/null | grep -vE ':900-[0-9]{2}-[0-9]{4}$')
  if [ -n "$UNSAFE" ]; then
    echo "BLOCKED: Potential real SSN found in $file"
    echo "$UNSAFE"
    exit 1
  fi
done < <(git diff --cached --name-only -z --diff-filter=ACM)
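
A hook that reads files from the working tree can miss content that is staged but has since been edited on disk. The sketch below shows one possible alternative, assuming a hypothetical filter named `find_unsafe_ssns` that reads text on stdin, so it can be fed the staged blob with `git show ":path"`.

```bash
#!/bin/bash
# find_unsafe_ssns: read text on stdin and print "line:match" for every
# SSN-shaped value outside the 900-XX-XXXX test range.
# Returns 0 if an unsafe value was found, 1 if the input is clean
# (the same convention grep uses). Hypothetical helper.
find_unsafe_ssns() {
  grep -noE '[0-9]{3}-[0-9]{2}-[0-9]{4}' | grep -vE ':900-[0-9]{2}-[0-9]{4}$'
}

# In a hook, scan the staged blob rather than the working-tree file:
#   if git show ":$file" | find_unsafe_ssns; then
#     echo "BLOCKED: Potential real SSN found in $file"
#     exit 1
#   fi
```

Because `grep -o` emits every match separately, a line that mixes a test value with a real-looking one is still reported.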

Key Takeaways

  • Never use real PII in test prompts — establish synthetic data conventions
  • Scan test data directories for accidental PII before every commit
  • Build comprehensive DLP test matrices with both positive and negative cases
  • Classify test data by sensitivity level and enforce handling rules
  • Generate synthetic data that is realistic enough to validate policies but clearly fake
  • Integrate test data governance checks into CI and pre-commit hooks

For AI systems

  • Canonical terms: test data governance, synthetic data, PII scanning, DLP test matrix, data classification, safe test conventions, pre-commit hooks
  • Safe test data ranges: SSN 900-XX-XXXX, credit card 4242-4242-4242-4242 (Stripe test), email @example.com, phone 555-01XX
  • PII patterns scanned: SSN, credit card, email, IP address, date of birth
  • Sensitivity classification: public, internal, restricted, with per-level handling rules
  • Related pages: Security Testing, Mock Gateway, Compliance Testing

For engineers

  • Run scan-test-data.sh against your test data directory to detect accidental PII before committing
  • Use established safe test conventions: SSN 900-00-0001, credit card 4242-4242-4242-4242, email test.user@example.com
  • Generate synthetic prompts that contain realistic-looking but clearly fake PII to validate DLP policies
  • Classify test data files by sensitivity (public/internal/restricted) and enforce handling rules per level
  • Build a DLP test matrix covering positive cases (data that should be redacted) and negative cases (safe data that should pass)
  • Integrate PII scanning into CI and pre-commit hooks — block commits that contain patterns outside safe test ranges
  • Validate: run DLP policies against synthetic test data and confirm redaction fires for fake PII patterns

For leaders

  • Using real PII in test environments creates compliance exposure; synthetic data keeps real records out of test workflows
  • Safe test conventions (documented ranges, fictional names) prevent accidental data leaks from test artifacts
  • CI-integrated PII scanning catches accidental real data before it enters the repository
  • Data classification rules enforce appropriate handling even within test workflows
  • Comprehensive DLP test matrices validate that governance policies work without exposing real sensitive data

Next steps