Embedding Detector

The embedding-detector policy uses semantic similarity to detect sensitive content that pattern-based filters would miss, by comparing request and response text against reference embeddings of known sensitive categories.

Use this page when

You need to detect sensitive content that pattern-based filters would miss (e.g., paraphrased trade secrets, indirect references to proprietary processes).
You are configuring semantic similarity detection using embedding models against custom reference categories.
You want to catch content that is semantically similar to known sensitive topics without relying on exact keyword matches.

Primary audience

Primary: AI Agents, Technical Engineers
Secondary: Technical Leaders

Configuration

policy:
  embedding-detector:
    backend: local
    endpoint: http://localhost:8080/embed
    model: all-MiniLM-L6-v2
    api_key: "${EMBEDDING_API_KEY}"
    similarity_threshold: 0.7
    timeout_ms: 20
    action: redact
    categories:
    - label: trade_secret
      reference_text: proprietary formula, manufacturing process, secret recipe
    - label: competitive_intel
      reference_text: competitor pricing, market strategy, acquisition target
pack:
  name: embedding-detector-example-1
  version: 1.0.0
  enabled: true
policies:
  chain:
  - embedding-detector

Fields

Field	Type	Default	Description
`backend`	string	`"local"`	Embedding computation backend: `local` or `external`
`endpoint`	string	`"http://localhost:8080/embed"`	External embedding endpoint URL (used when `backend` is `external`)
`model`	string	`"all-MiniLM-L6-v2"`	Embedding model name
`api_key`	string	—	API key for external endpoint. Store in environment variables, never in config files
`similarity_threshold`	number	`0.70`	Cosine similarity threshold that triggers detection (range: 0–1)
`timeout_ms`	integer	`20`	Request timeout in milliseconds (range: 1–60000)
`action`	string	`"redact"`	Action on detection: `redact` or `block`
`categories`	array	`[]`	Custom sensitive categories for detection
`categories[].label`	string	—	Identifier for the sensitive category
`categories[].reference_text`	string	—	Reference text describing the sensitive concept. Embeddings are computed from this text
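
The allowed values and ranges in the table can be checked before a config is deployed. The following is a minimal sketch, not part of Keeptrusts: `validate_embedding_detector` is a hypothetical helper, it assumes the `embedding-detector` block has already been parsed into a dict, and it assumes `label` and `reference_text` are required because the table lists no default for them.

```python
from typing import Any, Dict, List

ALLOWED_BACKENDS = {"local", "external"}
ALLOWED_ACTIONS = {"redact", "block"}


def validate_embedding_detector(cfg: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a parsed embedding-detector block."""
    errors: List[str] = []

    # Defaults below mirror the Default column of the fields table.
    if cfg.get("backend", "local") not in ALLOWED_BACKENDS:
        errors.append("backend must be 'local' or 'external'")

    if cfg.get("action", "redact") not in ALLOWED_ACTIONS:
        errors.append("action must be 'redact' or 'block'")

    threshold = cfg.get("similarity_threshold", 0.70)
    is_number = isinstance(threshold, (int, float)) and not isinstance(threshold, bool)
    if not is_number or not 0 <= threshold <= 1:
        errors.append("similarity_threshold must be a number in the range 0-1")

    timeout = cfg.get("timeout_ms", 20)
    is_int = isinstance(timeout, int) and not isinstance(timeout, bool)
    if not is_int or not 1 <= timeout <= 60000:
        errors.append("timeout_ms must be an integer in the range 1-60000")

    # Assumption: every category needs a non-empty label and reference_text.
    for i, category in enumerate(cfg.get("categories", [])):
        for key in ("label", "reference_text"):
            value = category.get(key)
            if not isinstance(value, str) or not value.strip():
                errors.append(f"categories[{i}].{key} must be a non-empty string")

    return errors


problems = validate_embedding_detector(
    {"backend": "local", "similarity_threshold": 1.5, "timeout_ms": 20}
)
print(problems)
```

A check like this only covers the constraints stated in the table; the page lists `kt policy lint` under Key commands as the product's own tooling.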

Use Cases

Custom Trade Secret Detection

Detect content semantically similar to proprietary processes, even when exact keywords are not used.

pack:
  name: "trade-secret-protection"
  version: "0.1.0"
  enabled: true

policies:
  chain:
    - prompt-injection
    - embedding-detector
    - dlp-filter
    - audit-logger

policy:
  prompt-injection:
    threshold: 0.8
    action: "block"

  embedding-detector:
    backend: "local"
    model: "all-MiniLM-L6-v2"
    similarity_threshold: 0.75
    timeout_ms: 50
    action: "block"
    categories:
      - label: "proprietary_process"
        reference_text: "proprietary manufacturing process, secret formulation steps, trade secret synthesis"
      - label: "internal_architecture"
        reference_text: "internal system architecture, microservice topology, database schema design"

  dlp-filter:
    action: "block"
    patterns:
      - name: "internal_url"
        regex: 'https?://internal\..+\.corp'

  audit-logger:
    retention_days: 2555

Competitive Intelligence Protection

Prevent AI assistants from discussing content semantically related to competitive strategy.

policy:
  embedding-detector:
    backend: external
    endpoint: https://embeddings.internal.corp/v1/embed
    model: text-embedding-3-small
    api_key: "${EMBEDDING_API_KEY}"
    similarity_threshold: 0.8
    timeout_ms: 100
    action: redact
    categories:
    - label: competitor_strategy
      reference_text: competitor pricing strategy, market positioning, customer acquisition cost
    - label: acquisition_target
      reference_text: acquisition candidate, merger target, due diligence, valuation model
    - label: unreleased_product
      reference_text: unreleased product roadmap, upcoming feature, launch timeline
pack:
  name: embedding-detector-example-3
  version: 1.0.0
  enabled: true
policies:
  chain:
  - embedding-detector

Semantic PII Detection

Catch PII references that evade regex-based detectors by using descriptions instead of exact values.

policy:
  embedding-detector:
    backend: local
    model: all-MiniLM-L6-v2
    similarity_threshold: 0.65
    timeout_ms: 30
    action: redact
    categories:
    - label: identity_description
      reference_text: person's home address, residential location, where someone lives
    - label: financial_identity
      reference_text: bank account number, routing number, financial institution credentials
    - label: health_identity
      reference_text: patient diagnosis, medical condition, prescription medication
pack:
  name: embedding-detector-example-4
  version: 1.0.0
  enabled: true
policies:
  chain:
  - embedding-detector

How It Works

The embedding-detector policy computes vector embeddings for each segment of the request and response content, then calculates cosine similarity against pre-computed reference embeddings for each configured category. When the similarity score exceeds similarity_threshold, the segment is flagged.
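
The comparison described above can be sketched in a few lines. This illustrates cosine similarity against a threshold and is not the policy's actual implementation: the three-dimensional vectors are made-up stand-ins for real model embeddings, and `flag_segment` is a hypothetical name.

```python
import math
from typing import Dict, List, Optional, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def flag_segment(
    segment_vec: List[float],
    reference_vecs: Dict[str, List[float]],
    threshold: float,
) -> Optional[Tuple[str, float]]:
    """Return (label, score) for the best category if its score exceeds threshold."""
    best_label, best_score = None, -1.0
    for label, ref in reference_vecs.items():
        score = cosine_similarity(segment_vec, ref)
        if score > best_score:
            best_label, best_score = label, score
    if best_label is not None and best_score > threshold:
        return best_label, best_score
    return None


# Toy reference embeddings, one per configured category (illustrative values).
references = {
    "trade_secret": [0.9, 0.1, 0.0],
    "competitive_intel": [0.0, 0.2, 0.9],
}

print(flag_segment([0.8, 0.2, 0.1], references, threshold=0.7))  # flagged as trade_secret
print(flag_segment([0.1, 0.9, 0.1], references, threshold=0.7))  # None: below threshold
```

Taking the best-scoring category is one possible design; a detector could equally report every category whose score exceeds the threshold.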

In local mode, embeddings are computed in-process using the specified model. In external mode, the policy sends text to the configured endpoint and receives embeddings in the response. The timeout_ms setting controls how long the policy waits for embedding computation. On timeout, the policy fails open and passes the content through.
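
The timeout and fail-open behavior can be sketched with a worker thread and a bounded wait. This is an assumption-laden illustration, not the policy's code: `embed_with_timeout`, `fast_embed`, and `slow_embed` are hypothetical names, and returning `None` is simply how this sketch signals "no detection ran, pass the content through".

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable, List, Optional

_executor = ThreadPoolExecutor(max_workers=4)


def embed_with_timeout(
    embed: Callable[[str], List[float]],
    text: str,
    timeout_ms: int,
) -> Optional[List[float]]:
    """Run embed(text); return None if it does not finish within timeout_ms."""
    future = _executor.submit(embed, text)
    try:
        return future.result(timeout=timeout_ms / 1000.0)
    except FutureTimeout:
        future.cancel()  # best effort: a call that is already running is not interrupted
        return None  # fail open: the caller skips detection for this text


def fast_embed(text: str) -> List[float]:
    """Stand-in for an embedding call that returns immediately."""
    return [1.0, 0.0]


def slow_embed(text: str) -> List[float]:
    """Stand-in for an embedding call that takes about 200 ms."""
    time.sleep(0.2)
    return [0.0, 1.0]


print(embed_with_timeout(fast_embed, "hello", timeout_ms=500))  # [1.0, 0.0]
print(embed_with_timeout(slow_embed, "hello", timeout_ms=20))   # None
```

Because a timed-out call means the content is not inspected, a timeout that is too short for the chosen backend silently reduces coverage rather than producing errors.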

Matched segments are either redacted (replaced with [EMBEDDING_MATCH_REDACTED]) or blocked entirely, depending on the action setting.
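
The two actions can be contrasted with a small sketch. How the policy splits content into segments and how it reports a block are not specified on this page, so the `(text, matched)` pairs, the space-joined output, and the `BlockedError` exception are all assumptions made for illustration; only the `[EMBEDDING_MATCH_REDACTED]` marker comes from the text above.

```python
from typing import List, Tuple

REDACTION_MARKER = "[EMBEDDING_MATCH_REDACTED]"


class BlockedError(Exception):
    """Hypothetical signal that the whole request or response was blocked."""


def apply_action(segments: List[Tuple[str, bool]], action: str) -> str:
    """Apply 'redact' or 'block' to (text, matched) segment pairs."""
    if action == "block":
        if any(matched for _, matched in segments):
            raise BlockedError("content matched a sensitive category")
        return " ".join(text for text, _ in segments)
    if action == "redact":
        # Replace only the matched segments and keep the rest intact.
        return " ".join(
            REDACTION_MARKER if matched else text for text, matched in segments
        )
    raise ValueError(f"unknown action: {action!r}")


segments = [
    ("Our quarterly picnic is on Friday.", False),
    ("The coating is cured at a temperature we never publish.", True),
]
print(apply_action(segments, "redact"))
```

With `redact`, the unmatched sentence survives and the matched one is replaced by the marker; with `block`, the same input raises `BlockedError` in this sketch.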

Best Practices

Never store a literal api_key value in config files: Use an environment variable reference (e.g., ${EMBEDDING_API_KEY}) so the secret is injected at runtime.
Tune similarity_threshold against your categories: Start at 0.70 and adjust based on false positive and false negative rates. The fields above define a single policy-level threshold, so the value you choose applies to every category. Lower thresholds catch more but risk false positives; higher thresholds are more precise but may miss paraphrased content.
Write specific reference_text: Vague reference text produces noisy matches. Include concrete terms and phrases that characterize the sensitive category.
Use local backend for latency-sensitive paths: Local embedding avoids network round-trips. Reserve external for cases requiring larger or specialized models.
Set timeout_ms conservatively: The default of 20ms is aggressive. For external backends, increase to 100–500ms depending on network latency. On timeout, the policy fails open to avoid blocking legitimate traffic.
Layer with pattern-based policies: Use embedding-detector alongside dlp-filter or pii-detector for defense in depth — pattern matchers catch exact formats while embeddings catch semantic paraphrases.
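
The threshold-tuning advice above can be made concrete by counting errors at several candidate thresholds over a hand-labeled sample. This is a sketch under stated assumptions: `sweep_thresholds` is a hypothetical helper, and the similarity scores are invented for illustration rather than taken from any real model or log.

```python
from typing import List, Tuple


def sweep_thresholds(
    scored: List[Tuple[float, bool]],
    thresholds: List[float],
) -> List[Tuple[float, int, int]]:
    """Count false positives and false negatives at each threshold.

    scored holds (similarity_score, is_actually_sensitive) pairs.
    A segment counts as flagged when its score exceeds the threshold.
    """
    rows = []
    for t in thresholds:
        false_pos = sum(1 for score, sensitive in scored if score > t and not sensitive)
        false_neg = sum(1 for score, sensitive in scored if score <= t and sensitive)
        rows.append((t, false_pos, false_neg))
    return rows


# Invented scores: (similarity to the closest category, human label).
sample = [
    (0.91, True),
    (0.78, True),
    (0.72, False),
    (0.66, True),
    (0.55, False),
    (0.40, False),
]

for threshold, false_pos, false_neg in sweep_thresholds(sample, [0.60, 0.70, 0.80]):
    print(f"threshold={threshold:.2f} false_positives={false_pos} false_negatives={false_neg}")
```

On this toy sample, raising the threshold from 0.60 to 0.80 removes the one false positive but misses two sensitive segments, which is the trade-off the tuning guidance describes.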

For AI systems

Canonical terms: Keeptrusts, embedding-detector, backend, endpoint, model, similarity_threshold, categories, reference_text, label, cosine similarity
Config/command names: policy.embedding-detector, backend (local/external), endpoint, model, similarity_threshold, timeout_ms, action (redact/block), categories[].label, categories[].reference_text
Best next pages: DLP Filter, Prompt Injection Detection, External Moderation

For engineers

Prerequisites: The external backend needs a running embedding API endpoint and an API key. The local backend uses the built-in embedding model. In both cases, define categories with reference text that represents your sensitive topics.
Validation: Send requests containing content semantically similar to your reference texts and verify detection. Adjust similarity_threshold based on false-positive rates. Check event logs for similarity scores.
Key commands: kt policy lint, kt gateway run, kt events tail

For leaders

Governance: Embedding detection catches content that evades keyword filters — paraphrased secrets, indirect references, and synonym-based evasion. It's your second line of defense after pattern-based DLP.
Cost: External embedding calls add latency and cost per request. The local backend has no external cost but offers more limited model quality. Set timeout_ms to bound worst-case latency.
Rollout: Start with high similarity_threshold (0.80+) to minimize false positives. Lower gradually as you validate detection accuracy against your specific content domain.

Next steps

DLP Filter — Pattern-based data loss prevention
Prompt Injection Detection — Embedding-based injection detection
External Moderation — Third-party content safety
Legal Privilege — Privileged communication detection
