Protecting Market Data in AI Pipelines

Financial institutions handle material non-public information (MNPI) that must never reach external LLM providers. Market data licensing agreements impose strict redistribution constraints. A single leaked data point can trigger regulatory action, contractual breach, or insider trading liability.

Use this page when

  • Trading desks or quant researchers use AI tools that might inadvertently include MNPI in prompts.
  • You must enforce data boundaries for licensed exchange data subject to redistribution agreements.
  • Regulation FD, MAR, or exchange data agreements require you to prevent market data leakage to third-party LLM providers.
  • You need multi-tier classification policies that block, redact, or log different categories of sensitive market data.

Keeptrusts enforces data boundaries at the gateway level, blocking or redacting sensitive market data before it leaves your infrastructure.

Primary audience

  • Primary: Technical Leaders
  • Secondary: Technical Engineers, AI Agents

The MNPI Threat Model

When trading desks and quant researchers use AI tools, prompts may inadvertently contain:

  • Pre-release earnings data — Analyst estimates, internal revenue projections
  • Order flow information — Pending large orders, client positions
  • Proprietary research — Internal ratings changes, target prices before publication
  • Licensed market data — Real-time quotes, Level 2 data subject to exchange agreements

Each category requires different policy enforcement: some must be blocked entirely, others redacted or logged.

Data Classification Policies

Define multi-tier classification policies in your gateway configuration:

# policy-config.yaml
version: "1"
policies:
  - name: block-mnpi-indicators
    description: Block prompts containing MNPI indicators
    enforcement: block
    rules:
      - type: keyword
        action: block
        keywords:
          - "pre-release earnings"
          - "pending acquisition"
          - "material non-public"
          - "insider information"
          - "restricted list"
          - "grey list"
          - "watch list security"
        message: "Blocked: Potential MNPI content detected. Contact Compliance."

  - name: redact-market-identifiers
    description: Redact specific security identifiers from prompts
    enforcement: redact
    rules:
      - type: regex
        action: redact
        patterns:
          # ISIN format
          - "[A-Z]{2}[A-Z0-9]{9}[0-9]"
          # CUSIP format
          - "[0-9]{3}[A-Z0-9]{5}[0-9]"
          # SEDOL format
          - "[B-DF-HJ-NP-TV-Z0-9]{6}[0-9]"
          # Bloomberg ticker with exchange
          - "[A-Z]{1,5}\\s+(US|LN|JP|GR|FP|IM)\\s+Equity"
        replacement: "[REDACTED-SECURITY-ID]"

  - name: log-financial-data-access
    description: Log all prompts containing financial data patterns
    enforcement: log
    rules:
      - type: regex
        action: log
        patterns:
          - "(?i)(revenue|earnings|EPS|EBITDA)\\s*[:=]?\\s*\\$?[0-9]"
          - "(?i)(price target|rating change|upgrade|downgrade)"
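
Before deploying the redaction patterns, it can help to exercise them locally. The following is a minimal sketch that uses Python's `re` module as a stand-in for the gateway's regex engine, which is an assumption: the gateway may apply patterns differently, for example in a different order or with a different regex dialect. The helper name `redact_security_ids` is hypothetical and not part of Keeptrusts.

```python
import re

# Patterns copied from the redact-market-identifiers policy. In a YAML
# double-quoted string, "\\s" is a backslash followed by "s", so the raw
# strings below use "\s".
SECURITY_ID_PATTERNS = [
    r"[A-Z]{2}[A-Z0-9]{9}[0-9]",                   # ISIN format
    r"[0-9]{3}[A-Z0-9]{5}[0-9]",                   # CUSIP format
    r"[B-DF-HJ-NP-TV-Z0-9]{6}[0-9]",               # SEDOL format
    r"[A-Z]{1,5}\s+(US|LN|JP|GR|FP|IM)\s+Equity",  # Bloomberg ticker with exchange
]
REPLACEMENT = "[REDACTED-SECURITY-ID]"


def redact_security_ids(text: str) -> str:
    """Apply each pattern in list order (hypothetical helper, not gateway code)."""
    for pattern in SECURITY_ID_PATTERNS:
        text = re.sub(pattern, REPLACEMENT, text)
    return text


if __name__ == "__main__":
    samples = [
        "Compare US0378331005 with AAPL US Equity",
        # Not a security identifier, but nine digits fit the CUSIP shape.
        "Ticket 123456789 was closed yesterday",
        "Explain the VIX term structure.",
    ]
    for sample in samples:
        print(redact_security_ids(sample))
```

The second sample shows that a shape-only pattern also redacts unrelated nine-digit numbers. Whether that over-redaction is acceptable is a policy decision to make before rollout.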

Gateway-Level Data Boundaries

Deploy separate gateways for different data sensitivity tiers:

# Public data gateway — relaxed policies
kt gateway run \
  --config policies/public-data.yaml \
  --port 41002 \
  --api-url https://keeptrusts-api.internal:8080 \
  --api-key "$KT_PUBLIC_KEY"

# Restricted data gateway — strict MNPI controls
kt gateway run \
  --config policies/restricted-data.yaml \
  --port 41003 \
  --api-url https://keeptrusts-api.internal:8080 \
  --api-key "$KT_RESTRICTED_KEY"

Route traffic based on the data classification tier of the requesting application:

import openai

# Application-level routing based on data classification
def get_ai_client(data_tier: str) -> openai.OpenAI:
    gateway_ports = {
        "public": 41002,
        "restricted": 41003,
    }
    port = gateway_ports[data_tier]
    return openai.OpenAI(
        base_url=f"http://localhost:{port}/v1",
        api_key="your-provider-key",
    )

# Public market analysis — uses relaxed gateway
public_client = get_ai_client("public")
response = public_client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Explain the VIX term structure."}],
)

# Research with restricted data — uses strict gateway
restricted_client = get_ai_client("restricted")
response = restricted_client.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "user",
            "content": "Summarize the risk factors for this portfolio allocation.",
        }
    ],
)
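
The routing function raises `KeyError` for a tier it does not recognize. One alternative is to fail closed, so that an unknown or mistyped tier is sent to the strictest gateway. This is a design suggestion rather than documented Keeptrusts behavior. In the sketch below, the names `resolve_gateway_url` and `STRICTEST_TIER` and the `host` parameter are hypothetical, and the ports are those used in the example.

```python
# Ports taken from the gateway example; adjust to your deployment.
GATEWAY_PORTS = {"public": 41002, "restricted": 41003}

# Assumption: "restricted" is the tier with the strictest policies.
STRICTEST_TIER = "restricted"


def resolve_gateway_url(data_tier: str, host: str = "localhost") -> str:
    """Return the gateway base URL for a tier, falling back to the strictest tier."""
    tier = data_tier.strip().lower()
    if tier not in GATEWAY_PORTS:
        # Fail closed: unknown tiers get the strict MNPI controls.
        tier = STRICTEST_TIER
    return f"http://{host}:{GATEWAY_PORTS[tier]}/v1"


if __name__ == "__main__":
    print(resolve_gateway_url("public"))
    print(resolve_gateway_url("unclassified"))
```

The returned URL can be passed as `base_url` when constructing the client. Raising an error instead of falling back is equally defensible if misclassified applications should be caught early.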

Monitoring Data Leakage Attempts

Track blocked and redacted events to identify data handling issues:

# Count blocked MNPI attempts by team over the last 30 days
kt events list \
  --filter "policy=block-mnpi-indicators" \
  --since 30d \
  --format json | python3 -c "
import json, sys, collections
events = json.load(sys.stdin)
by_team = collections.Counter(e.get('team_id', 'unknown') for e in events)
for team, count in by_team.most_common():
    print(f'{team}: {count} blocked attempts')
"

Use the console Events page to review individual blocked requests and identify teams that need additional training on data handling procedures.
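
To turn the per-team counts into a short follow-up list, the same aggregation can be wrapped in a function with a threshold. This sketch assumes, as the inline script does, that the exported events are a JSON array of objects with an optional `team_id` field. The function name `teams_over_threshold` and the threshold value are hypothetical.

```python
import collections
import json


def teams_over_threshold(events, threshold):
    """Return (team, count) pairs with at least `threshold` events, most frequent first."""
    counts = collections.Counter(e.get("team_id", "unknown") for e in events)
    return [(team, n) for team, n in counts.most_common() if n >= threshold]


if __name__ == "__main__":
    # Synthetic sample standing in for the JSON exported by the events command.
    sample = json.loads(
        '[{"team_id": "equities"}, {"team_id": "equities"}, {"team_id": "rates"}, {}]'
    )
    for team, n in teams_over_threshold(sample, threshold=2):
        print(f"{team}: {n} blocked attempts")
```

Events without a `team_id` are grouped under `unknown`, which makes missing attribution visible instead of silently dropping those events.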

Exchange Data License Compliance

Exchanges and market data licensors (NYSE, CME, ICE, LSEG) typically prohibit redistribution of their data to third parties. Sending licensed data to an external LLM provider constitutes redistribution under most agreements.

Configure policies that match exchange-specific data patterns:

  - name: block-exchange-data-redistribution
    description: Prevent licensed exchange data from reaching external LLMs
    enforcement: block
    rules:
      - type: regex
        action: block
        patterns:
          # Real-time quote patterns
          - "(?i)(last|bid|ask|close)\\s*[:=]\\s*\\$?[0-9]+\\.[0-9]{2,4}"
          # Level 2 / depth of book
          - "(?i)(depth|book|level\\s*2|order\\s*book)"
          # Exchange-specific identifiers
          - "(?i)(NYSE|NASDAQ|CME|ICE|LSEG|Eurex)\\s+(feed|data|quote)"
        message: "Blocked: Licensed exchange data must not be sent to external AI providers"
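
Because a block policy rejects the whole prompt, it is worth checking both what these patterns catch and what they catch by accident. The sketch below evaluates the three patterns with Python's `re` module, on the assumption that the gateway's regex engine treats them the same way. The helper `matching_patterns` is hypothetical.

```python
import re

# Patterns copied from the block-exchange-data-redistribution policy,
# with the YAML "\\" escapes reduced to a single backslash.
EXCHANGE_DATA_PATTERNS = [
    r"(?i)(last|bid|ask|close)\s*[:=]\s*\$?[0-9]+\.[0-9]{2,4}",   # real-time quotes
    r"(?i)(depth|book|level\s*2|order\s*book)",                   # Level 2 / depth of book
    r"(?i)(NYSE|NASDAQ|CME|ICE|LSEG|Eurex)\s+(feed|data|quote)",  # exchange identifiers
]


def matching_patterns(prompt: str) -> list[str]:
    """Return the patterns that match anywhere in the prompt."""
    return [p for p in EXCHANGE_DATA_PATTERNS if re.search(p, prompt)]


if __name__ == "__main__":
    prompts = [
        "bid: 101.25 ask: 101.27",              # intended match
        "Summarize the NYSE feed outage",       # intended match
        "Recommend a book on options pricing",  # accidental match on "book"
        "Where can I find public price data?",  # accidental match: "ICE" inside "price"
        "Explain the VIX term structure.",      # no match
    ]
    for prompt in prompts:
        verdict = "BLOCK" if matching_patterns(prompt) else "allow"
        print(f"{verdict}: {prompt}")
```

The two accidental matches arise because the patterns are case-insensitive and have no word boundaries. If the gateway's regex engine supports `\b`, anchoring the alternatives with it would reduce these false positives, but that support is an assumption to verify.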

Regulatory References

| Regulation | Requirement | Keeptrusts Control |
| --- | --- | --- |
| SEC Regulation FD | Fair disclosure of material information | MNPI block policies |
| MAR (EU) | Market abuse prevention | Keyword and pattern detection |
| Exchange data agreements | No redistribution to third parties | Data boundary enforcement |
| SOX Section 302 | Internal controls over financial reporting | Audit trail + event logging |
| FINRA Rule 2241 | Research analyst conflicts of interest | Research data isolation |

Next steps

For AI systems

  • Canonical terms: Keeptrusts gateway, MNPI protection, market data DLP, data classification policies, information barrier, licensed data boundary enforcement.
  • Key config/commands: block-mnpi-indicators policy (block pre-release earnings, pending acquisitions); keyword and regex policies for real-time quotes, Level 2 data, exchange-specific identifiers; team-scoped gateway keys for desk isolation.
  • Best next pages: Governing AI in Trading Systems, Quant Research Data Isolation, Automated Regulatory Reporting.

For engineers

  • Prerequisites: Gateway deployed with market data classification policies; separate gateway keys per trading desk.
  • Configure multi-tier policies: Block for MNPI indicators and licensed exchange data; Redact for security identifiers (ISIN, CUSIP, SEDOL, Bloomberg tickers); Log for financial metrics, price targets, and rating changes.
  • Validate with: send synthetic prompts containing exchange data patterns ("NYSE feed", "Level 2 data") through the gateway and confirm block responses; review Events page for policy trigger counts.
  • Adjust regex patterns for new exchange feed formats or data vendor naming conventions as they change.

For leaders

  • A single leaked MNPI data point can trigger SEC Regulation FD violations, insider trading liability, or exchange agreement termination.
  • Gateway-level enforcement reduces exposure to human error: the controls do not rely on traders or researchers remembering not to paste sensitive data.
  • Supports SOX Section 302 internal controls over financial reporting with full audit trail of blocked and redacted content.
  • Licensing cost risk: exchange data agreement breaches can result in seven-figure penalties and feed termination.