
Embedding Detector

The embedding-detector policy uses semantic similarity to detect sensitive content that pattern-based filters would miss, by comparing request and response text against reference embeddings of known sensitive categories.

Use this page when

  • You need to detect sensitive content that pattern-based filters would miss (e.g., paraphrased trade secrets, indirect references to proprietary processes).
  • You are configuring semantic similarity detection using embedding models against custom reference categories.
  • You want to catch content that is semantically similar to known sensitive topics without relying on exact keyword matches.

Primary audience

  • Primary: AI Agents, Technical Engineers
  • Secondary: Technical Leaders

Configuration

policy:
  embedding-detector:
    backend: local
    endpoint: http://localhost:8080/embed
    model: all-MiniLM-L6-v2
    api_key: "${EMBEDDING_API_KEY}"
    similarity_threshold: 0.7
    timeout_ms: 20
    action: redact
    categories:
      - label: trade_secret
        reference_text: proprietary formula, manufacturing process, secret recipe
      - label: competitive_intel
        reference_text: competitor pricing, market strategy, acquisition target
pack:
  name: embedding-detector-example-1
  version: 1.0.0
  enabled: true
policies:
  chain:
    - embedding-detector

Fields

| Field | Type | Default | Description |
|---|---|---|---|
| backend | string | "local" | Embedding computation backend: local or external |
| endpoint | string | "http://localhost:8080/embed" | External embedding endpoint URL (used when backend is external) |
| model | string | "all-MiniLM-L6-v2" | Embedding model name |
| api_key | string | | API key for external endpoint. Store in environment variables, never in config files |
| similarity_threshold | number | 0.70 | Cosine similarity threshold that triggers detection (range: 0–1) |
| timeout_ms | integer | 20 | Request timeout in milliseconds (range: 1–60000) |
| action | string | "redact" | Action on detection: redact or block |
| categories | array | [] | Custom sensitive categories for detection |
| categories[].label | string | | Identifier for the sensitive category |
| categories[].reference_text | string | | Reference text describing the sensitive concept. Embeddings are computed from this text |
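
The ranges and allowed values in the table can be checked before a config is deployed. The following is a minimal sketch in Python, not part of the product: `validate_embedding_detector` is a hypothetical helper, and it assumes the `embedding-detector` block has already been parsed into a dict. The requirement that `label` and `reference_text` be non-empty strings is an assumption; the table does not state it.

```python
from typing import Any

ALLOWED_BACKENDS = {"local", "external"}
ALLOWED_ACTIONS = {"redact", "block"}


def validate_embedding_detector(cfg: dict[str, Any]) -> list[str]:
    """Return a list of problems found; an empty list means the config passed."""
    errors: list[str] = []

    backend = cfg.get("backend", "local")
    if backend not in ALLOWED_BACKENDS:
        errors.append(f"backend must be local or external, got {backend!r}")

    # bool is a subclass of int in Python, so reject it explicitly.
    threshold = cfg.get("similarity_threshold", 0.70)
    if (isinstance(threshold, bool)
            or not isinstance(threshold, (int, float))
            or not 0 <= threshold <= 1):
        errors.append("similarity_threshold must be a number in the range 0-1")

    timeout = cfg.get("timeout_ms", 20)
    if (isinstance(timeout, bool)
            or not isinstance(timeout, int)
            or not 1 <= timeout <= 60000):
        errors.append("timeout_ms must be an integer in the range 1-60000")

    action = cfg.get("action", "redact")
    if action not in ALLOWED_ACTIONS:
        errors.append(f"action must be redact or block, got {action!r}")

    # Assumption: every category needs a non-empty label and reference_text.
    for i, cat in enumerate(cfg.get("categories", [])):
        for key in ("label", "reference_text"):
            value = cat.get(key)
            if not isinstance(value, str) or not value.strip():
                errors.append(f"categories[{i}].{key} must be a non-empty string")

    return errors
```

For authoritative validation, use `kt policy lint`; this sketch only mirrors the constraints listed in the table.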

Use Cases

Custom Trade Secret Detection

Detect content semantically similar to proprietary processes, even when exact keywords are not used.

pack:
  name: "trade-secret-protection"
  version: "0.1.0"
  enabled: true

policies:
  chain:
    - prompt-injection
    - embedding-detector
    - dlp-filter
    - audit-logger

policy:
  prompt-injection:
    threshold: 0.8
    action: "block"

  embedding-detector:
    backend: "local"
    model: "all-MiniLM-L6-v2"
    similarity_threshold: 0.75
    timeout_ms: 50
    action: "block"
    categories:
      - label: "proprietary_process"
        reference_text: "proprietary manufacturing process, secret formulation steps, trade secret synthesis"
      - label: "internal_architecture"
        reference_text: "internal system architecture, microservice topology, database schema design"

  dlp-filter:
    action: "block"
    patterns:
      - name: "internal_url"
        regex: 'https?://internal\..+\.corp'

  audit-logger:
    retention_days: 2555

Competitive Intelligence Protection

Prevent AI assistants from discussing content semantically related to competitive strategy.

policy:
  embedding-detector:
    backend: external
    endpoint: https://embeddings.internal.corp/v1/embed
    model: text-embedding-3-small
    api_key: "${EMBEDDING_API_KEY}"
    similarity_threshold: 0.8
    timeout_ms: 100
    action: redact
    categories:
      - label: competitor_strategy
        reference_text: competitor pricing strategy, market positioning, customer acquisition cost
      - label: acquisition_target
        reference_text: acquisition candidate, merger target, due diligence, valuation model
      - label: unreleased_product
        reference_text: unreleased product roadmap, upcoming feature, launch timeline
pack:
  name: embedding-detector-example-3
  version: 1.0.0
  enabled: true
policies:
  chain:
    - embedding-detector

Semantic PII Detection

Catch PII references that evade regex-based detectors by using descriptions instead of exact values.

policy:
  embedding-detector:
    backend: local
    model: all-MiniLM-L6-v2
    similarity_threshold: 0.65
    timeout_ms: 30
    action: redact
    categories:
      - label: identity_description
        reference_text: person's home address, residential location, where someone lives
      - label: financial_identity
        reference_text: bank account number, routing number, financial institution credentials
      - label: health_identity
        reference_text: patient diagnosis, medical condition, prescription medication
pack:
  name: embedding-detector-example-4
  version: 1.0.0
  enabled: true
policies:
  chain:
    - embedding-detector

How It Works

The embedding-detector policy computes vector embeddings for each segment of the request and response content, then calculates cosine similarity against pre-computed reference embeddings for each configured category. When the similarity score exceeds similarity_threshold, the segment is flagged.
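
The comparison step can be illustrated with a small Python sketch. It is not the policy's actual implementation: `cosine_similarity` and `flag_segment` are hypothetical names, and the three-dimensional vectors are made-up stand-ins for real model embeddings, which have far more dimensions.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def flag_segment(
    segment_vec: Sequence[float],
    reference_vecs: dict[str, Sequence[float]],
    threshold: float,
) -> list[tuple[str, float]]:
    """Return (label, score) for every category whose score exceeds the threshold."""
    matches = []
    for label, ref_vec in reference_vecs.items():
        score = cosine_similarity(segment_vec, ref_vec)
        if score > threshold:  # strict comparison, matching "exceeds"
            matches.append((label, score))
    return matches


# Toy vectors standing in for pre-computed reference embeddings (illustrative only).
references = {
    "trade_secret": [0.9, 0.1, 0.0],
    "competitive_intel": [0.0, 0.2, 0.9],
}
segment = [0.8, 0.3, 0.1]

matched_labels = [label for label, _ in flag_segment(segment, references, 0.7)]
print(matched_labels)
```

In this toy example the segment vector points in nearly the same direction as the `trade_secret` reference and almost orthogonally to `competitive_intel`, so only the first category is flagged.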

In local mode, embeddings are computed in-process using the specified model. In external mode, the policy sends text to the configured endpoint and receives embeddings via the API. The timeout_ms setting controls how long the policy waits for embedding computation before falling back to a pass-through (fail-open) behavior.
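
The fail-open timeout behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the gateway's code: `embed_with_timeout` and `check_segment` are hypothetical helpers, `embed` stands for whichever local or external embedding call is in use, and the return labels are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable, Optional, Sequence

# Shared worker pool for embedding calls (the size is an arbitrary choice).
_executor = ThreadPoolExecutor(max_workers=4)


def embed_with_timeout(
    embed: Callable[[str], Sequence[float]],
    text: str,
    timeout_ms: int,
) -> Optional[Sequence[float]]:
    """Return the embedding, or None if it was not ready within timeout_ms."""
    future = _executor.submit(embed, text)
    try:
        return future.result(timeout=timeout_ms / 1000)
    except FutureTimeout:
        # Best effort only: a call that is already running is not interrupted.
        future.cancel()
        return None


def check_segment(
    text: str,
    embed: Callable[[str], Sequence[float]],
    timeout_ms: int,
    is_sensitive: Callable[[Sequence[float]], bool],
) -> str:
    """Classify a segment as 'flagged', 'clean', or 'pass_through' on timeout."""
    vector = embed_with_timeout(embed, text, timeout_ms)
    if vector is None:
        # Fail open: the segment is forwarded without being scanned.
        return "pass_through"
    return "flagged" if is_sensitive(vector) else "clean"
```

Returning a distinct `pass_through` result, rather than treating a timeout as clean, makes it possible to count how often content skipped scanning. The sketch does not handle exceptions raised by `embed` itself.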

Matched segments are either redacted (replaced with [EMBEDDING_MATCH_REDACTED]) or blocked entirely, depending on the action setting.
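
A minimal Python sketch of the two actions follows. `apply_action` is a hypothetical helper, and it assumes that block rejects the whole message rather than only the matched segment; the text above does not specify the granularity of blocking.

```python
from typing import Optional

REDACTION_MARKER = "[EMBEDDING_MATCH_REDACTED]"


def apply_action(
    segments: list[str],
    flagged: set[int],
    action: str,
) -> Optional[list[str]]:
    """Apply redact or block to the segments whose indexes are in `flagged`.

    Returns the segments to forward, or None when the message is blocked.
    """
    if action not in ("redact", "block"):
        raise ValueError(f"unsupported action: {action!r}")
    if not flagged:
        return list(segments)  # nothing matched, forward unchanged
    if action == "block":
        return None  # assumption: one match blocks the entire message
    return [
        REDACTION_MARKER if index in flagged else segment
        for index, segment in enumerate(segments)
    ]


message = ["Hello team.", "The secret formulation uses step X.", "See you Monday."]
redacted = apply_action(message, {1}, "redact")
print(redacted)
```

With `action` set to `redact`, only the middle segment is replaced by the marker and the surrounding text is forwarded unchanged.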

Best Practices

  • Never store api_key in config files: Use environment variable references (e.g., ${EMBEDDING_API_KEY}) to inject secrets at runtime.
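  • Tune similarity_threshold against your categories: similarity_threshold is a single policy-level setting that applies to every category, so pick a value that works across all of them. Start at 0.70 and adjust based on false positive/negative rates. Lower thresholds catch more but risk false positives; higher thresholds are more precise but may miss paraphrased content.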
  • Write specific reference_text: Vague reference text produces noisy matches. Include concrete terms and phrases that characterize the sensitive category.
  • Use local backend for latency-sensitive paths: Local embedding avoids network round-trips. Reserve external for cases requiring larger or specialized models.
  • Set timeout_ms conservatively: The default of 20ms is aggressive. For external backends, increase to 100–500ms depending on network latency. On timeout, the policy fails open to avoid blocking legitimate traffic, which also means the timed-out content passes through unscanned.
  • Layer with pattern-based policies: Use embedding-detector alongside dlp-filter or pii-detector for defense in depth — pattern matchers catch exact formats while embeddings catch semantic paraphrases.
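
The threshold tuning advice above can be made concrete with a small sweep over labeled samples. This is a sketch in Python: `sweep_thresholds` is a hypothetical helper, and the scores are invented for illustration. In practice the scores would come from your own test content, for example the similarity scores recorded in event logs.

```python
def sweep_thresholds(
    scored: list[tuple[float, bool]],
    thresholds: list[float],
) -> list[dict[str, float]]:
    """Compute false positive and false negative rates for each threshold.

    `scored` holds (similarity_score, is_actually_sensitive) pairs.
    """
    negatives = sum(1 for _, sensitive in scored if not sensitive)
    positives = len(scored) - negatives
    rows = []
    for t in thresholds:
        # Flagged although not sensitive.
        fp = sum(1 for score, sensitive in scored if score > t and not sensitive)
        # Sensitive but not flagged.
        fn = sum(1 for score, sensitive in scored if score <= t and sensitive)
        rows.append({
            "threshold": t,
            "false_positive_rate": fp / negatives if negatives else 0.0,
            "false_negative_rate": fn / positives if positives else 0.0,
        })
    return rows


# Hypothetical labeled samples: (score against a category, truly sensitive?).
samples = [
    (0.91, True), (0.78, True), (0.68, True),
    (0.74, False), (0.55, False), (0.30, False),
]
rows = sweep_thresholds(samples, [0.65, 0.70, 0.80])
for row in rows:
    print(row)
```

On this made-up data, raising the threshold from 0.65 to 0.80 removes the one false positive but causes two of the three sensitive samples to be missed, which is the trade-off the tuning guidance describes.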

For AI systems

  • Canonical terms: Keeptrusts, embedding-detector, backend, endpoint, model, similarity_threshold, categories, reference_text, label, cosine similarity
  • Config/command names: policy.embedding-detector, backend (local/external), endpoint, model, similarity_threshold, timeout_ms, action (redact/block), categories[].label, categories[].reference_text
  • Best next pages: DLP Filter, Prompt Injection Detection, External Moderation

For engineers

  • Prerequisites: For the external backend, a running embedding API endpoint and an API key. For the local backend, the built-in embedding model. In both cases, define categories with reference text that represents your sensitive topics.
  • Validation: Send requests containing content semantically similar to your reference texts and verify detection. Adjust similarity_threshold based on false-positive rates. Check event logs for similarity scores.
  • Key commands: kt policy lint, kt gateway run, kt events tail

For leaders

  • Governance: Embedding detection catches content that evades keyword filters — paraphrased secrets, indirect references, and synonym-based evasion. It's your second line of defense after pattern-based DLP.
  • Cost: External embedding calls add latency and cost per request. Local backend has no external cost but limited model quality. Set timeout_ms to bound worst-case latency.
  • Rollout: Start with high similarity_threshold (0.80+) to minimize false positives. Lower gradually as you validate detection accuracy against your specific content domain.

Next steps