Patent Research AI: Protecting Invention Details During Prior Art Search
Patent research is one of the highest-value legal AI workflows because it turns large patent and literature sets into something a practitioner can triage quickly. It is also one of the easiest ways to leak the very thing you are trying to protect. Teams paste unpublished invention summaries, draft claim language, prototype codenames, inventor details, and outside-counsel notes into a general-purpose assistant, then ask for novelty analysis or prior art comparisons. That gives the model enough context to help, but it also expands the exposure surface for confidential invention strategy.
Keeptrusts gives patent teams a better pattern: minimize invention identifiers before the provider sees them, route only to approved targets with strict declared handling guarantees, require grounded citations in the response, and preserve reviewable evidence when counsel needs to show how the search was conducted. The important detail is that no single policy solves this alone. Input protection comes from pii-detector, dlp-filter, and Data Routing Policy. Output quality comes from Citation Verifier. A narrow legal backstop comes from Legal Privilege when the assistant starts echoing privileged markers.
Use this page when
- You are deploying AI for prior art search, patentability triage, invention disclosure review, or claim-chart drafting.
- You need to protect unpublished invention details while still allowing researchers and counsel to use AI for search acceleration.
- You want a practical pattern that connects Legal, Legal Technology, Data Routing Policy, and Citation Verifier.
Primary audience
- Primary: Technical Leaders
- Secondary: Technical Engineers, AI Agents
The problem
Prior art search looks harmless on the surface because the output is usually external material: existing patents, papers, standards, and public technical disclosures. The risky part is the prompt. Researchers usually do not ask the model a generic question such as "find prior art for a fluid valve." They paste the unpublished embodiment description, the working title from the invention disclosure, the feature list they think is novel, and often the draft claim they are trying to defend. Once that happens, the system is no longer only searching public art. It is also processing the applicant's confidential strategy.
That exposure creates three operational problems. The first is confidentiality. Prototype names, claim labels, project codes, and inventor details do not belong in an unconstrained upstream request. The second is quality. Patent teams do not just need plausible prior art. They need a response they can trace back to actual sources instead of a polished synthesis that mixes real references with invented citations. The third is reviewability. When a search result influences filing scope, claim amendments, or abandonment decisions, counsel eventually needs to reconstruct how the assistant was used.
Legal teams also need to be precise about what each control actually does. Legal Privilege is useful, but it is an output-phase marker check, not an inbound invention scrubber. It can stop the assistant from returning phrases such as "privileged and confidential" or "for legal review only," but it does not sanitize your prompt before the provider sees it. If you want to protect invention details during search, the primary controls have to sit earlier in the flow.
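To make the output-phase distinction concrete, here is a minimal sketch of what a marker check of this kind amounts to. It is an illustration under stated assumptions, not the Legal Privilege implementation: the function names are hypothetical, and the marker list is copied from the example pack on this page.

```python
# Illustrative only: a response-side marker check, not the product's code.
PRIVILEGE_MARKERS = [
    "attorney-client privilege",
    "privileged and confidential",
    "for legal review only",
]


def response_has_privilege_marker(response_text: str) -> bool:
    """Return True if the response echoes any privileged marker (case-insensitive)."""
    lowered = response_text.lower()
    return any(marker in lowered for marker in PRIVILEGE_MARKERS)


def output_phase_check(prompt: str, response_text: str) -> str:
    """Decide 'allow' or 'block' by looking at the response only."""
    # The prompt is deliberately ignored: an output-phase check runs after the
    # provider has already received the request, so it cannot sanitize it.
    del prompt
    return "block" if response_has_privilege_marker(response_text) else "allow"
```

Note that a prompt containing "privileged and confidential" passes this check untouched as long as the response does not repeat the phrase, which is why inbound protection has to come from other controls.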
The solution
The practical pattern is to split patent research into a dedicated lane with aggressive input minimization and grounded output requirements. Start with pii-detector for inventor names, email addresses, reference numbers, and any structured identifiers that appear in invention disclosure templates. Then use dlp-filter for organization-specific content such as prototype codenames, internal claim labels, lab system identifiers, and matter numbers that only mean something inside your patent program.
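The two input steps can be sketched as follows. This is a hedged illustration of redact-then-block behavior, not how pii-detector or dlp-filter are implemented. The regular expressions and blocked terms are copied from the example pack on this page; the label names (`INVENTION_ID`, `MATTER_ID`) and function names are hypothetical.

```python
import re
from typing import Optional

# Patterns taken from the example pack; the labels are made up for this sketch.
PII_PATTERNS = {
    "INVENTION_ID": re.compile(r"INV-[0-9]{6}"),
    "MATTER_ID": re.compile(r"PATMAT-[A-Z]{3}-[0-9]{5}"),
}
DLP_PATTERNS = [
    re.compile(r"PROTO-[A-Z]{2}[0-9]{4}"),
    re.compile(r"CLAIM-[0-9]{2}"),
]
DLP_BLOCKED_TERMS = ["project helix", "claim set a", "invention disclosure draft"]


def redact_identifiers(prompt: str) -> str:
    """Replace structured identifiers with a label so the provider never sees them."""
    for label, pattern in PII_PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt


def violates_dlp(prompt: str) -> bool:
    """Return True if the prompt contains a blocked term or a blocked pattern."""
    lowered = prompt.lower()
    if any(term in lowered for term in DLP_BLOCKED_TERMS):
        return True
    return any(pattern.search(prompt) for pattern in DLP_PATTERNS)


def minimize(prompt: str) -> Optional[str]:
    """Return the redacted prompt, or None when the request should be blocked."""
    redacted = redact_identifiers(prompt)
    return None if violates_dlp(redacted) else redacted
```

In this sketch, identifiers that can be safely replaced are redacted, while organization-specific terms stop the request outright, mirroring the `action: redact` and `action: block` settings in the pack.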
After the prompt is minimized, use Data Routing Policy to restrict provider choice to targets that declare the retention and handling guarantees your legal team can accept. This is where teams make the common mistake of assuming a prompt rewrite is enough. It is not. The route itself should refuse non-compliant targets. For cross-border patent practice, this also supports jurisdiction-specific search lanes backed by the data-handling guarantees documented in Data Residency.
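A minimal sketch of the fail-closed routing idea, assuming each target declares its handling guarantees and the route compares them against requirements. The class and function names are hypothetical; only the field names echo the example pack, and the real Data Routing Policy may evaluate more fields than the three shown here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Target:
    """Declared handling guarantees for one provider target (subset of the pack fields)."""
    id: str
    zero_data_retention: bool
    training_opt_out: bool
    retention_days: int


@dataclass(frozen=True)
class Requirements:
    """Routing requirements, mirroring three keys from the example policy."""
    require_zero_data_retention: bool = True
    require_no_training: bool = True
    max_retention_days: int = 0


def is_compliant(target: Target, req: Requirements) -> bool:
    """A target is compliant only if every required guarantee is declared."""
    if req.require_zero_data_retention and not target.zero_data_retention:
        return False
    if req.require_no_training and not target.training_opt_out:
        return False
    return target.retention_days <= req.max_retention_days


def select_target(targets: List[Target], req: Requirements) -> Optional[Target]:
    """Return the first compliant target, or None so the caller blocks the request."""
    for target in targets:
        if is_compliant(target, req):
            return target
    return None
```

The design point is the `None` branch: when nothing qualifies, the request is refused rather than sent to the least-bad target.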
On the response side, use Citation Verifier so the assistant cannot hand back a tidy novelty narrative without tying it to real patents, publications, or supplied search context. That matters because patent teams often act on the first coherent explanation they receive. A grounded, source-matched result is much safer than an elegant but unverified one. Add legal-privilege as a narrow backstop for privileged markers, and keep audit-logger in the chain so the route clearly advertises that reviewable evidence is part of the control set.
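The grounding requirement can be illustrated with a deliberately simple check: every citation in the response must also appear in the supplied search context, and a response with no citations at all is rejected. This is a sketch under stated assumptions, not Citation Verifier's algorithm. The citation shape (US patent numbers such as "US 7,654,321") and all function names are hypothetical, and the real policy also scores confidence and groundedness, which this sketch does not attempt.

```python
import re
from typing import List

# Hypothetical citation shape for illustration: "US 7,654,321".
CITATION_RE = re.compile(r"US\s?\d{1,2},\d{3},\d{3}")


def normalize(citation: str) -> str:
    """Drop whitespace and commas so 'US 7,654,321' and 'US7,654,321' compare equal."""
    return re.sub(r"[\s,]", "", citation)


def ungrounded_citations(response: str, context_docs: List[str]) -> List[str]:
    """Return citations in the response that do not appear in any context document."""
    known = {normalize(c) for doc in context_docs for c in CITATION_RE.findall(doc)}
    cited = [normalize(c) for c in CITATION_RE.findall(response)]
    return [c for c in cited if c not in known]


def verify(response: str, context_docs: List[str]) -> str:
    """Block responses that cite nothing or cite sources missing from the context."""
    if not CITATION_RE.findall(response):
        return "block"  # sources are required
    return "block" if ungrounded_citations(response, context_docs) else "allow"
```

A novelty narrative that cites a plausible-looking but absent patent number is blocked here for the same reason as one that cites nothing: neither can be traced back to a source the team supplied.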
Implementation
The following pack is a good starting point for a patent-search route used by prosecution counsel or an internal IP team. It protects invention identifiers on input, forces provider-side handling guarantees, and blocks ungrounded citation-heavy output.
pack:
  name: patent-research-zdr-lane
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: patent-search-zdr
      provider: openai
      model: gpt-5.4-mini-mini
      secret_key_ref:
        env: OPENAI_API_KEY
      data_policy:
        zero_data_retention: true
        training_opt_out: true
        retention_days: 0
        in_memory_only: true
        sanitized: true
        accepts_tokenized_input: true
        allow_internet_egress: false
        local_only_processing: true
policies:
  chain:
    - pii-detector
    - dlp-filter
    - data-routing-policy
    - citation-verifier
    - legal-privilege
    - audit-logger
  policy:
    pii-detector:
      action: redact
      detect_patterns:
        - 'INV-[0-9]{6}'
        - 'PATMAT-[A-Z]{3}-[0-9]{5}'
      redaction:
        marker_format: label
        include_metadata: true
    dlp-filter:
      blocked_terms:
        - Project Helix
        - claim set A
        - invention disclosure draft
      detect_patterns:
        - 'PROTO-[A-Z]{2}[0-9]{4}'
        - 'CLAIM-[0-9]{2}'
      action: block
      fuzzy_matching: true
      max_distance: 1
    data-routing-policy:
      require_zero_data_retention: true
      require_no_training: true
      max_retention_days: 0
      require_in_memory_only: true
      sanitize_before_provider: true
      tokenize_sensitive_fields: true
      allow_internet_egress: false
      local_only_processing: true
      on_no_compliant_provider: block
      log_provider_selection: true
    citation-verifier:
      require_sources: true
      require_source_match: true
      min_confidence: 0.8
      min_groundedness: 0.8
      rag_context:
        verify_against_context: true
        min_context_overlap: 0.7
      output_action:
        unverified_action: block
    legal-privilege:
      privilege_markers:
        - attorney-client privilege
        - privileged and confidential
        - for legal review only
    audit-logger: {}
This configuration does not replace patent counsel judgment. It narrows the lane so the assistant can help with search and comparison without becoming a leak path for your unpublished filing strategy. If no provider target satisfies the routing requirements, the route fails closed. If the assistant cannot ground its citations, the route blocks the response instead of delivering a convincing hallucination.
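The pack sets fuzzy_matching: true with max_distance: 1 on dlp-filter. The source does not say how that distance is measured. Assuming it means Levenshtein edit distance (an assumption, not confirmed here), the following sketch shows why a near-miss spelling of a blocked term would still be caught. The function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free when characters match
            ))
        prev = curr
    return prev[-1]


def fuzzy_contains(text: str, term: str, max_distance: int = 1) -> bool:
    """Return True if any window of text is within max_distance edits of term."""
    text, term = text.lower(), term.lower()
    n = len(term)
    # Candidate windows may be slightly shorter or longer than the term.
    for width in range(max(1, n - max_distance), n + max_distance + 1):
        for start in range(len(text) - width + 1):
            if edit_distance(text[start:start + width], term) <= max_distance:
                return True
    return False
```

Under this assumption, a typo such as "Projct Helix" is still treated as the blocked codename, while unrelated text about a helical spring is not. The trade-off is that short blocked terms become prone to false positives, so fuzzy matching is best reserved for distinctive codenames.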
Results and impact
The immediate effect is lower exposure per search session. Researchers can still ask useful questions, but the route strips or blocks the identifiers and codenames that should never leave the team boundary. Provider filtering removes silent routing drift. Citation verification reduces the chance that a practitioner works from invented authority. Evidence packaging is also simpler because the team can use Reviewing Alerts and Evidence and Export Evidence for a Review to hand off a scoped record of what happened.
For IP leaders, that means a cleaner separation between acceptable AI acceleration and disclosure risk. For engineers, it means a route you can validate and operate. For counsel, it means novelty analysis stays reviewable instead of becoming an opaque side conversation that no one can reconstruct later.
Key takeaways
- Treat patent research as an input-confidentiality problem first and a search problem second.
- Use pii-detector and dlp-filter to minimize unpublished invention details before the provider sees them.
- Use Data Routing Policy so non-compliant provider targets are excluded instead of merely discouraged.
- Use Citation Verifier to block polished but ungrounded prior art summaries.
- Keep evidence export operationally simple with Reviewing Alerts and Evidence and Export Evidence for a Review.