Protecting Market Data in AI Pipelines

Financial institutions handle material non-public information (MNPI) that must never reach external LLM providers. Market data licensing agreements impose strict redistribution constraints. A single leaked data point can trigger regulatory action, contractual breach, or insider trading liability.

Use this page when

Trading desks or quant researchers use AI tools that might inadvertently include MNPI in prompts.
You must enforce data boundaries for licensed exchange data subject to redistribution agreements.
Regulation FD, MAR, or exchange data agreements require you to prevent market data leakage to third-party LLM providers.
You need multi-tier classification policies that block, redact, or log different categories of sensitive market data.

Keeptrusts enforces data boundaries at the gateway level, blocking or redacting sensitive market data before it leaves your infrastructure.

Primary audience

Primary: Technical Leaders
Secondary: Technical Engineers, AI Agents

The MNPI Threat Model

When trading desks and quant researchers use AI tools, prompts may inadvertently contain:

Pre-release earnings data — Analyst estimates, internal revenue projections
Order flow information — Pending large orders, client positions
Proprietary research — Internal ratings changes, target prices before publication
Licensed market data — Real-time quotes, Level 2 data subject to exchange agreements

Each category requires different policy enforcement: some must be blocked entirely, others redacted or logged.

Data Classification Policies

Define multi-tier classification policies in your gateway configuration:

# policy-config.yaml
version: "1"
policies:
  - name: block-mnpi-indicators
    description: Block prompts containing MNPI indicators
    enforcement: block
    rules:
      - type: keyword
        action: block
        keywords:
          - "pre-release earnings"
          - "pending acquisition"
          - "material non-public"
          - "insider information"
          - "restricted list"
          - "grey list"
          - "watch list security"
        message: "Blocked: Potential MNPI content detected. Contact Compliance."

  - name: redact-market-identifiers
    description: Redact specific security identifiers from prompts
    enforcement: redact
    rules:
      - type: regex
        action: redact
        patterns:
          # ISIN format
          - "[A-Z]{2}[A-Z0-9]{9}[0-9]"
          # CUSIP format
          - "[0-9]{3}[A-Z0-9]{5}[0-9]"
          # SEDOL format
          - "[B-DF-HJ-NP-TV-Z0-9]{6}[0-9]"
          # Bloomberg ticker with exchange
          - "[A-Z]{1,5}\\s+(US|LN|JP|GR|FP|IM)\\s+Equity"
        replacement: "[REDACTED-SECURITY-ID]"

  - name: log-financial-data-access
    description: Log all prompts containing financial data patterns
    enforcement: log
    rules:
      - type: regex
        action: log
        patterns:
          - "(?i)(revenue|earnings|EPS|EBITDA)\\s*[:=]?\\s*\\$?[0-9]"
          - "(?i)(price target|rating change|upgrade|downgrade)"
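
Before deploying, the redaction patterns can be exercised offline. The following is a minimal sketch using Python's `re` module. It assumes the gateway's regex engine interprets these patterns the same way Python does, which this page does not confirm. The `redact` helper is hypothetical and is not part of Keeptrusts.

```python
import re

# Patterns copied from the redact-market-identifiers policy above.
# YAML double-quoted "\\s" is the regex \s, written here as a raw string.
SECURITY_ID_PATTERNS = [
    r"[A-Z]{2}[A-Z0-9]{9}[0-9]",                  # ISIN
    r"[0-9]{3}[A-Z0-9]{5}[0-9]",                  # CUSIP
    r"[B-DF-HJ-NP-TV-Z0-9]{6}[0-9]",              # SEDOL
    r"[A-Z]{1,5}\s+(US|LN|JP|GR|FP|IM)\s+Equity",  # Bloomberg ticker with exchange
]
REPLACEMENT = "[REDACTED-SECURITY-ID]"

# Combine into one alternation so already-replaced text is never re-scanned.
_COMBINED = re.compile("|".join(f"(?:{p})" for p in SECURITY_ID_PATTERNS))


def redact(prompt: str) -> str:
    """Hypothetical local stand-in for the gateway's redact action."""
    return _COMBINED.sub(REPLACEMENT, prompt)


if __name__ == "__main__":
    print(redact("Compare US0378331005 with AAPL US Equity."))
    # The patterns carry no word boundaries, so a plain 7-digit number
    # also satisfies the SEDOL pattern and is redacted.
    print(redact("Ticket 1234567 is closed."))
```

Running samples like these helps surface over-matching (for example, ordinary 7-digit numbers) before the policy reaches production traffic.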

Gateway-Level Data Boundaries

Deploy separate gateways for different data sensitivity tiers:

# Public data gateway — relaxed policies
kt gateway run \
  --config policies/public-data.yaml \
  --port 41002 \
  --api-url https://keeptrusts-api.internal:8080 \
  --api-key "$KT_PUBLIC_KEY"

# Restricted data gateway — strict MNPI controls
kt gateway run \
  --config policies/restricted-data.yaml \
  --port 41003 \
  --api-url https://keeptrusts-api.internal:8080 \
  --api-key "$KT_RESTRICTED_KEY"

Route traffic based on the data classification tier of the requesting application:

import openai

# Application-level routing based on data classification
def get_ai_client(data_tier: str) -> openai.OpenAI:
    gateway_ports = {
        "public": 41002,
        "restricted": 41003,
    }
    port = gateway_ports[data_tier]
    return openai.OpenAI(
        base_url=f"http://localhost:{port}/v1",
        api_key="your-provider-key",
    )

# Public market analysis — uses relaxed gateway
public_client = get_ai_client("public")
response = public_client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Explain the VIX term structure."}],
)

# Research with restricted data — uses strict gateway
restricted_client = get_ai_client("restricted")
response = restricted_client.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "user",
            "content": "Summarize the risk factors for this portfolio allocation.",
        }
    ],
)

Monitoring Data Leakage Attempts

Track blocked and redacted events to identify data handling issues:

# Count blocked MNPI attempts by team over the last 30 days
kt events list \
  --filter "policy=block-mnpi-indicators" \
  --since 30d \
  --format json | python3 -c "
import json, sys, collections
events = json.load(sys.stdin)
by_team = collections.Counter(e.get('team_id', 'unknown') for e in events)
for team, count in by_team.most_common():
    print(f'{team}: {count} blocked attempts')
"

Use the console Events page to review individual blocked requests and identify teams that need additional training on data handling procedures.

Exchange Data License Compliance

Exchanges and market data licensors such as NYSE, CME, ICE, and LSEG typically prohibit redistribution of their data to third parties. Sending licensed data to an external LLM provider constitutes redistribution under most agreements.

Configure policies that match exchange-specific data patterns:

  - name: block-exchange-data-redistribution
    description: Prevent licensed exchange data from reaching external LLMs
    enforcement: block
    rules:
      - type: regex
        action: block
        patterns:
          # Real-time quote patterns
          - "(?i)(last|bid|ask|close)\\s*[:=]\\s*\\$?[0-9]+\\.[0-9]{2,4}"
          # Level 2 / depth of book
          - "(?i)(depth|book|level\\s*2|order\\s*book)"
          # Exchange-specific identifiers
          - "(?i)(NYSE|NASDAQ|CME|ICE|LSEG|Eurex)\\s+(feed|data|quote)"
        message: "Blocked: Licensed exchange data must not be sent to external AI providers"
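
A rough offline check of these block patterns can show what they catch and what they catch by accident. This sketch assumes Python `re` semantics match the gateway's regex engine, which is not confirmed here. `is_blocked` is a hypothetical helper, not a Keeptrusts API.

```python
import re

# Patterns copied from the block-exchange-data-redistribution policy above.
EXCHANGE_PATTERNS = [
    re.compile(r"(?i)(last|bid|ask|close)\s*[:=]\s*\$?[0-9]+\.[0-9]{2,4}"),
    re.compile(r"(?i)(depth|book|level\s*2|order\s*book)"),
    re.compile(r"(?i)(NYSE|NASDAQ|CME|ICE|LSEG|Eurex)\s+(feed|data|quote)"),
]


def is_blocked(prompt: str) -> bool:
    """Return True if any pattern matches anywhere in the prompt."""
    return any(p.search(prompt) for p in EXCHANGE_PATTERNS)


if __name__ == "__main__":
    samples = [
        "bid: 101.25 ask: 101.27",
        "Summarize the NYSE feed outage",
        "Explain Level 2 data",
        "Explain the VIX term structure.",
        # Also matches: the bare word "book" is unanchored in the pattern.
        "Recommend a book on options pricing",
    ]
    for sample in samples:
        print(f"{is_blocked(sample)!s:5}  {sample}")
```

The last sample illustrates why broad alternatives such as `book` or `depth` may need tightening if they block too many legitimate prompts in your environment.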

Regulatory References

Regulation	Requirement	Keeptrusts Control
SEC Regulation FD	Fair disclosure of material information	MNPI block policies
MAR (EU)	Market abuse prevention	Keyword and pattern detection
Exchange data agreements	No redistribution to third parties	Data boundary enforcement
SOX Section 302	Internal controls over financial reporting	Audit trail + event logging
FINRA Rule 2241	Research analyst conflicts of interest	Research data isolation

Next steps

Governing AI in Trading Systems — Comprehensive trading AI controls
Quant Research Data Isolation — Chinese wall enforcement
Automated Regulatory Reporting — Evidence export for compliance

For AI systems

Canonical terms: Keeptrusts gateway, MNPI protection, market data DLP, data classification policies, information barrier, licensed data boundary enforcement.
Key config/commands: block-mnpi-indicators policy (block pre-release earnings, pending acquisitions); keyword and regex policies for real-time quotes, Level 2 data, exchange-specific identifiers; team-scoped gateway keys for desk isolation.
Best next pages: Governing AI in Trading Systems, Quant Research Data Isolation, Automated Regulatory Reporting.

For engineers

Prerequisites: Gateway deployed with market data classification policies; separate gateway keys per trading desk.
Configure multi-tier policies: Block for MNPI indicators and licensed exchange data; Redact for security identifiers (ISIN, CUSIP, SEDOL, Bloomberg tickers); Log for financial metrics and rating-change language.
Validate with: send synthetic prompts containing exchange data patterns ("NYSE feed", "Level 2 data") through the gateway and confirm block responses; review Events page for policy trigger counts.
Adjust regex patterns for new exchange feed formats or data vendor naming conventions as they change.
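
The validation step can be scripted. The sketch below is one possible shape under stated assumptions: `send_prompt` is a hypothetical callable you supply that sends one prompt through the gateway and returns True when the gateway blocked it. How a block is surfaced (HTTP status, error body) is not specified on this page, so that detail is left to your implementation.

```python
from typing import Callable, Iterable

# Synthetic prompts that should trip the block policies; they contain no real data.
SYNTHETIC_PROMPTS = [
    "Summarize the NYSE feed for me",
    "Explain this Level 2 data",
    "Draft a note on the pending acquisition",
]


def find_unblocked(
    send_prompt: Callable[[str], bool],
    prompts: Iterable[str] = SYNTHETIC_PROMPTS,
) -> list[str]:
    """Return the prompts that were NOT blocked (an empty list means all were blocked)."""
    return [p for p in prompts if not send_prompt(p)]


if __name__ == "__main__":
    # Stand-in sender for a dry run; replace with a real call through your gateway.
    def fake_send(prompt: str) -> bool:
        triggers = ("NYSE feed", "Level 2", "pending acquisition")
        return any(t in prompt for t in triggers)

    leaks = find_unblocked(fake_send)
    print("all blocked" if not leaks else f"not blocked: {leaks}")
```

Keeping the sender injectable lets the same check run against a test gateway or in CI with a stub.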

For leaders

A single leaked MNPI data point can trigger SEC Regulation FD violations, insider trading liability, or exchange agreement termination.
Gateway-level enforcement reduces exposure to human error: controls no longer depend on traders or researchers remembering not to paste sensitive data.
Supports SOX Section 302 internal controls over financial reporting with full audit trail of blocked and redacted content.
Licensing cost risk: exchange data agreement breaches can result in seven-figure penalties and feed termination.
