Patent Research AI: Protecting Invention Details During Prior Art Search

Patent research is one of the highest-value legal AI workflows because it turns large patent and literature sets into something a practitioner can triage quickly. It is also one of the easiest ways to leak the very thing you are trying to protect. Teams paste unpublished invention summaries, draft claim language, prototype codenames, inventor details, and outside-counsel notes into a general-purpose assistant, then ask for novelty analysis or prior art comparisons. That gives the model enough context to help, but it also expands the exposure surface for confidential invention strategy.

Keeptrusts gives patent teams a better pattern: minimize invention identifiers before the provider sees them, route only to approved targets with strict declared handling guarantees, require grounded citations in the response, and preserve reviewable evidence when counsel needs to show how the search was conducted. The important detail is that no single policy solves this alone. Input protection comes from pii-detector, dlp-filter, and Data Routing Policy. Output quality comes from Citation Verifier. A narrow legal backstop comes from Legal Privilege when the assistant starts echoing privileged markers.

Use this page when

You are deploying AI for prior art search, patentability triage, invention disclosure review, or claim-chart drafting.
You need to protect unpublished invention details while still allowing researchers and counsel to use AI for search acceleration.
You want a practical pattern that connects Legal, Legal Technology, Data Routing Policy, and Citation Verifier.

Primary audience

Primary: Technical Leaders
Secondary: Technical Engineers, AI Agents

The problem

Prior art search looks harmless on the surface because the output is usually external material: existing patents, papers, standards, and public technical disclosures. The risky part is the prompt. Researchers usually do not ask the model a generic question such as "find prior art for a fluid valve." They paste the unpublished embodiment description, the working title from the invention disclosure, the feature list they think is novel, and often the draft claim they are trying to defend. Once that happens, the system is no longer only searching public art. It is also processing the applicant's confidential strategy.

That exposure creates three operational problems. The first is confidentiality. Prototype names, claim labels, project codes, and inventor details do not belong in an unconstrained upstream request. The second is quality. Patent teams do not just need plausible prior art. They need a response they can trace back to actual sources instead of a polished synthesis that mixes real references with invented citations. The third is reviewability. When a search result influences filing scope, claim amendments, or abandonment decisions, counsel eventually needs to reconstruct how the assistant was used.

Legal teams also need to be precise about what each control actually does. Legal Privilege is useful, but it is an output-phase marker check, not an inbound invention scrubber. It can stop the assistant from returning phrases such as "privileged and confidential" or "for legal review only," but it does not sanitize your prompt before the provider sees it. If you want to protect invention details during search, the primary controls have to sit earlier in the flow.
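
To make that distinction concrete, here is a minimal Python sketch of what an output-phase marker check amounts to. The function name `contains_privilege_marker` and the case-insensitive substring matching are illustrative assumptions, not the Legal Privilege implementation; only the marker phrases come from this page.

```python
# Hypothetical illustration of an output-phase marker check.
PRIVILEGE_MARKERS = [
    "attorney-client privilege",
    "privileged and confidential",
    "for legal review only",
]


def contains_privilege_marker(response_text: str, markers=PRIVILEGE_MARKERS) -> bool:
    """Return True if the model response contains any marker (case-insensitive)."""
    lowered = response_text.lower()
    return any(marker in lowered for marker in markers)


# Made-up prompt and response, for illustration only.
prompt = "PRIVILEGED AND CONFIDENTIAL: unpublished embodiment with a helical valve seat."
response = "The closest public reference describes a valve with a conical seat."

# The check inspects the response only. By the time it runs, the prompt has
# already been sent upstream, so it cannot protect anything in the prompt.
print(contains_privilege_marker(response))
```

The prompt above carries a privilege marker and an unpublished detail, yet a response-side check has nothing to say about it. That is why the input-side controls have to come first.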

The solution

The practical pattern is to split patent research into a dedicated lane with aggressive input minimization and grounded output requirements. Start with pii-detector for inventor names, email addresses, reference numbers, and any structured identifiers that appear in invention disclosure templates. Then use dlp-filter for organization-specific content such as prototype codenames, internal claim labels, lab system identifiers, and matter numbers that only mean something inside your patent program.
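
The two input steps can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not how pii-detector or dlp-filter are implemented: the identifier formats and blocked terms are the ones used in the example pack on this page, while the function names, the `[LABEL]` replacement format, and the exact matching behavior are hypothetical.

```python
import re

# Identifier formats and terms copied from the example pack on this page.
# Labels and function names are hypothetical.
REDACT_PATTERNS = {
    "INVENTION_ID": re.compile(r"INV-[0-9]{6}"),
    "MATTER_ID": re.compile(r"PATMAT-[A-Z]{3}-[0-9]{5}"),
}
BLOCKED_TERMS = ["project helix", "claim set a", "invention disclosure draft"]
BLOCKED_PATTERNS = [
    re.compile(r"PROTO-[A-Z]{2}[0-9]{4}"),
    re.compile(r"CLAIM-[0-9]{2}"),
]


def redact_identifiers(prompt: str) -> str:
    """Replace structured identifiers with a label (redact-style step)."""
    for label, pattern in REDACT_PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt


def find_blocked_content(prompt: str) -> list[str]:
    """Return every blocked term or pattern match found in the prompt."""
    lowered = prompt.lower()
    hits = [term for term in BLOCKED_TERMS if term in lowered]
    for pattern in BLOCKED_PATTERNS:
        hits.extend(match.group(0) for match in pattern.finditer(prompt))
    return hits


def minimize(prompt: str) -> str:
    """Redact first, then refuse the request if blocked content remains."""
    redacted = redact_identifiers(prompt)
    hits = find_blocked_content(redacted)
    if hits:
        raise ValueError(f"prompt blocked, matched: {hits}")
    return redacted


print(minimize("Compare INV-123456 (matter PATMAT-ABC-00042) against known valve seats."))
```

The design point is the split in behavior: structured identifiers are replaced so the question can still go through, while codenames and claim labels stop the request outright.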

After the prompt is minimized, use Data Routing Policy to restrict provider choice to targets that declare the retention and handling guarantees your legal team can accept. This is where teams make the common mistake of assuming a prompt rewrite is enough. It is not. The route itself should refuse non-compliant targets. For cross-border patent practice, this also supports jurisdiction-specific search lanes backed by the data-handling guarantees documented in Data Residency.
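
A minimal Python sketch of that fail-closed selection follows. It assumes a target is compliant only if every declared guarantee meets the corresponding requirement; the class names, the `select_target` function, and the comparison logic are hypothetical, and the field names simply mirror a subset of the keys used in the example pack on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetPolicy:
    """Handling guarantees a provider target declares (subset, illustrative)."""
    target_id: str
    zero_data_retention: bool
    training_opt_out: bool
    retention_days: int
    in_memory_only: bool


@dataclass(frozen=True)
class RoutingRequirements:
    """What the route demands before any target may receive a prompt."""
    require_zero_data_retention: bool = True
    require_no_training: bool = True
    max_retention_days: int = 0
    require_in_memory_only: bool = True


class NoCompliantProvider(RuntimeError):
    """Raised instead of falling back to a non-compliant target."""


def is_compliant(target: TargetPolicy, req: RoutingRequirements) -> bool:
    if req.require_zero_data_retention and not target.zero_data_retention:
        return False
    if req.require_no_training and not target.training_opt_out:
        return False
    if target.retention_days > req.max_retention_days:
        return False
    if req.require_in_memory_only and not target.in_memory_only:
        return False
    return True


def select_target(targets: list[TargetPolicy], req: RoutingRequirements) -> TargetPolicy:
    """Return the first compliant target, or fail closed if there is none."""
    for target in targets:
        if is_compliant(target, req):
            return target
    raise NoCompliantProvider("no target declares the required handling guarantees")


# "general-purpose" is a made-up target that retains data for 30 days.
targets = [
    TargetPolicy("general-purpose", False, True, 30, False),
    TargetPolicy("patent-search-zdr", True, True, 0, True),
]
print(select_target(targets, RoutingRequirements()).target_id)
```

Raising an error when nothing qualifies, rather than returning a best-effort default, is what separates excluding a target from merely discouraging it.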

On the response side, use Citation Verifier so the assistant cannot hand back a tidy novelty narrative without tying it to real patents, publications, or supplied search context. That matters because patent teams often act on the first coherent explanation they receive. A grounded, source-matched result is much safer than an elegant but unverified one. Add legal-privilege as a narrow backstop for privileged markers, and keep audit-logger in the chain so the route clearly advertises that reviewable evidence is part of the control set.
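
The grounding idea can be sketched in Python as follows. This is a simplified illustration, not the Citation Verifier algorithm: it assumes each statement in the response arrives paired with the identifier of the source it cites, and it uses plain token overlap as a stand-in for whatever groundedness measure the product applies. The function names, the `DOC-…` identifiers, and the sample texts are all made up.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def context_overlap(statement: str, source_text: str) -> float:
    """Fraction of the statement's tokens that also appear in the source."""
    statement_tokens = tokens(statement)
    if not statement_tokens:
        return 0.0
    return len(statement_tokens & tokens(source_text)) / len(statement_tokens)


def unverified_statements(
    cited: list[tuple[str, str]],
    context: dict[str, str],
    min_overlap: float = 0.7,
) -> list[str]:
    """Return statements whose cited source is missing or does not support them."""
    failures = []
    for statement, source_id in cited:
        source_text = context.get(source_id)
        if source_text is None or context_overlap(statement, source_text) < min_overlap:
            failures.append(statement)
    return failures


# Supplied search context, keyed by a hypothetical source identifier.
context = {"DOC-1": "A fluid valve with a conical seat and a spring-loaded poppet."}
cited = [
    ("fluid valve with a conical seat", "DOC-1"),   # source exists and supports it
    ("valve with a helical seat", "DOC-9"),         # cites a source that was never supplied
]
print(unverified_statements(cited, context))
```

A route in blocking mode would refuse to return the response whenever this list is non-empty, so a reference that was never in the search context cannot reach the practitioner as if it were real.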

Implementation

The following pack is a good starting point for a patent-search route used by prosecution counsel or an internal IP team. It protects invention identifiers on input, forces provider-side handling guarantees, and blocks ungrounded citation-heavy output.

pack:
  name: patent-research-zdr-lane
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: patent-search-zdr
      provider: openai
      model: gpt-5.4-mini
      secret_key_ref:
        env: OPENAI_API_KEY
      data_policy:
        zero_data_retention: true
        training_opt_out: true
        retention_days: 0
        in_memory_only: true
        sanitized: true
        accepts_tokenized_input: true
        allow_internet_egress: false
        local_only_processing: true

policies:
  chain:
    - pii-detector
    - dlp-filter
    - data-routing-policy
    - citation-verifier
    - legal-privilege
    - audit-logger

policy:
  pii-detector:
    action: redact
    detect_patterns:
      - 'INV-[0-9]{6}'
      - 'PATMAT-[A-Z]{3}-[0-9]{5}'
    redaction:
      marker_format: label
      include_metadata: true

  dlp-filter:
    blocked_terms:
      - Project Helix
      - claim set A
      - invention disclosure draft
    detect_patterns:
      - 'PROTO-[A-Z]{2}[0-9]{4}'
      - 'CLAIM-[0-9]{2}'
    action: block
    fuzzy_matching: true
    max_distance: 1

  data-routing-policy:
    require_zero_data_retention: true
    require_no_training: true
    max_retention_days: 0
    require_in_memory_only: true
    sanitize_before_provider: true
    tokenize_sensitive_fields: true
    allow_internet_egress: false
    local_only_processing: true
    on_no_compliant_provider: block
    log_provider_selection: true

  citation-verifier:
    require_sources: true
    require_source_match: true
    min_confidence: 0.8
    min_groundedness: 0.8
    rag_context:
      verify_against_context: true
      min_context_overlap: 0.7
    output_action:
      unverified_action: block

  legal-privilege:
    privilege_markers:
      - attorney-client privilege
      - privileged and confidential
      - for legal review only

  audit-logger: {}
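
The dlp-filter settings `fuzzy_matching: true` and `max_distance: 1` in the pack above are easier to reason about with an example. The Python sketch below assumes that `max_distance` is a character edit distance (Levenshtein) applied to word windows of the prompt. That interpretation, and the function names, are assumptions made for illustration; check the dlp-filter reference for the actual matching rules.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character inserts, deletes, substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def fuzzy_contains(text: str, term: str, max_distance: int = 1) -> bool:
    """True if any window of words in text is within max_distance edits of term."""
    words = text.lower().split()
    term_words = term.lower().split()
    target = " ".join(term_words)
    width = len(term_words)
    for start in range(len(words) - width + 1):
        window = " ".join(words[start:start + width])
        if edit_distance(window, target) <= max_distance:
            return True
    return False


# A one-character typo in the codename is still caught.
print(fuzzy_contains("is project helx novel over the art", "Project Helix"))
# An unrelated project name is not.
print(fuzzy_contains("is project apollo novel over the art", "Project Helix"))
```

Under this reading, a small distance catches typos and trailing punctuation on a codename without flagging unrelated phrases. Larger values trade that precision for recall and deserve testing against real prompts before rollout.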

This configuration does not replace patent counsel judgment. It narrows the lane so the assistant can help with search and comparison without becoming a leak path for your unpublished filing strategy. If no provider target satisfies the routing requirements, the route fails closed. If the model cannot ground its citations, the route blocks the response instead of delivering a convincing hallucination.

Results and impact

The immediate effect is lower exposure per search session. Researchers can still ask useful questions, but the route strips or blocks the identifiers and codenames that should never leave the team boundary. Provider filtering removes silent routing drift. Citation verification reduces the chance that a practitioner works from invented authority. Evidence packaging is also simpler because the team can use Reviewing Alerts and Evidence and Export Evidence for a Review to hand off a scoped record of what happened.

For IP leaders, that means a cleaner separation between acceptable AI acceleration and disclosure risk. For engineers, it means a route you can validate and operate. For counsel, it means novelty analysis stays reviewable instead of becoming an opaque side conversation that no one can reconstruct later.

Key takeaways

Treat patent research as an input-confidentiality problem first and a search problem second.
Use pii-detector and dlp-filter to minimize unpublished invention details before the provider sees them.
Use Data Routing Policy so non-compliant provider targets are excluded instead of merely discouraged.
Use Citation Verifier to block polished but ungrounded prior art summaries.
Keep evidence export operationally simple with Reviewing Alerts and Evidence and Export Evidence for a Review.
