Vendor Data Sharing Agreements: What to Require from AI Providers

Most AI vendor agreements promise the right things in broad language: limited retention, no model training, secure handling, regional processing, and clear deletion behavior. The operational problem is that those promises are often disconnected from the system that actually routes requests. Keeptrusts closes that gap by letting you declare provider handling metadata on each target and then enforce those conditions before a model call is made.

Use this page when

  • You are negotiating or reviewing AI provider terms for retention, training, residency, or restricted-processing requirements.
  • You need a practical list of agreement terms that can be enforced at runtime.
  • You want procurement and engineering to share the same handling vocabulary.

Primary audience

  • Primary: Technical Leaders
  • Secondary: Technical Engineers, procurement, legal, and security reviewers

The problem

The contract and the runtime are often managed by different teams. Procurement negotiates retention or training clauses. Platform engineering configures provider targets. Application teams then route traffic based on whatever target is available. If those layers do not share the same handling metadata, the contract is a document rather than a control.

There is also a false sense of safety around fallback. Teams frequently assume their primary provider is compliant and their backup provider is "close enough." That assumption breaks the moment a request routes through the backup path during an outage or cost event. If the backup target lacks the same handling guarantees, the agreement is effectively bypassed by normal operations.

Finally, many teams ask for vague assurances instead of precise, enforceable commitments. "We handle data securely" is far less useful than a declaration that traffic is not retained, not used for training, processed only in memory, compatible with tokenized inputs, and kept off internet egress when the workload requires stricter containment.

The solution

Write vendor requirements in the same categories Keeptrusts can enforce at runtime.

For each provider target, record whether it supports zero retention, training opt-out, a bounded retention window, in-memory handling, sanitized processing, tokenized inputs, internet-egress restrictions, and local-only processing. Then apply Data Routing Policy to remove any target that does not satisfy the agreed terms.

This is the core mindset change: a vendor agreement is not complete until the gateway can express the same requirement set as machine-readable target metadata. Once you do that, availability and cost logic continue to work, but only inside a provider pool that already satisfies the contract.
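The ordering described above (filter on contract terms first, then let availability and cost choose) can be sketched in a few lines of Python. This is an illustrative model only, not Keeptrusts code: the `Target` class, the guarantee labels, and the cost figures are all hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical provider target with declared handling guarantees."""
    id: str
    guarantees: frozenset          # handling terms the vendor has committed to
    cost_per_1k_tokens: float      # made-up cost figure for illustration
    healthy: bool = True


def select_target(targets, required):
    """Filter by contract terms first, then apply availability and cost."""
    # Step 1: the compliant pool is fixed by the agreement, not by load or price.
    pool = [t for t in targets if required <= t.guarantees]
    # Step 2: availability and cost logic run only inside that pool.
    candidates = [t for t in pool if t.healthy]
    if not candidates:
        return None  # caller blocks the request instead of widening the pool
    return min(candidates, key=lambda t: t.cost_per_1k_tokens)


required = frozenset({"zero_retention", "training_opt_out", "in_memory"})
targets = [
    Target("primary", required | {"tokenized_inputs"}, 0.40),
    Target("backup-compliant", required, 0.55),
    Target("backup-cheap", frozenset({"training_opt_out"}), 0.10),
]

chosen = select_target(targets, required)
print(chosen.id if chosen else "blocked")
```

The cheapest target overall (`backup-cheap`) is never considered, because it drops out before cost is evaluated. If every compliant target is unhealthy, the function returns `None` rather than falling through to a weaker lane.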

Implementation

The following configuration shows how to encode vendor requirements so the gateway can enforce them before routing:

pack:
  name: vendor-data-sharing-enforcement
  version: "1.0.0"
  enabled: true

providers:
  targets:
    - id: contracted-zdr
      provider: openai
      model: gpt-5.4-mini-mini
      secret_key_ref:
        env: OPENAI_API_KEY
      data_policy:
        zero_data_retention: true
        training_opt_out: true
        retention_days: 0
        in_memory_only: true
        sanitized: true
        accepts_tokenized_input: true
        allow_internet_egress: false
        local_only_processing: true

    - id: standard-cloud
      provider: openai
      model: gpt-5.4-mini
      secret_key_ref:
        env: OPENAI_API_KEY
      data_policy:
        zero_data_retention: false
        training_opt_out: true
        retention_days: 30
        in_memory_only: false
        sanitized: false
        accepts_tokenized_input: false
        allow_internet_egress: true
        local_only_processing: false

policies:
  chain:
    - data-routing-policy
    - audit-logger

policy:
  data-routing-policy:
    require_zero_data_retention: true
    require_no_training: true
    max_retention_days: 0
    require_in_memory_only: true
    sanitize_before_provider: true
    tokenize_sensitive_fields: true
    allow_internet_egress: false
    local_only_processing: true
    on_no_compliant_provider: block
    log_provider_selection: true

  audit-logger: {}

This config matters for two reasons. First, it makes the contract concrete: only the target that matches the vendor terms remains eligible. Second, it prevents the common failure mode where an outage or cost optimization silently routes the request into a weaker provider lane.
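To see why only one target survives, it can help to walk the two `data_policy` blocks against the policy by hand. The Python sketch below does that with the values copied from the config. The pairing of policy keys to target keys (for example `sanitize_before_provider` to `sanitized`) is an assumption inferred from the names, and the comparison logic is a guess at the semantics, not the gateway's actual implementation.

```python
# Values copied from providers.targets[].data_policy in the config above.
TARGETS = {
    "contracted-zdr": {
        "zero_data_retention": True, "training_opt_out": True,
        "retention_days": 0, "in_memory_only": True, "sanitized": True,
        "accepts_tokenized_input": True, "allow_internet_egress": False,
        "local_only_processing": True,
    },
    "standard-cloud": {
        "zero_data_retention": False, "training_opt_out": True,
        "retention_days": 30, "in_memory_only": False, "sanitized": False,
        "accepts_tokenized_input": False, "allow_internet_egress": True,
        "local_only_processing": False,
    },
}

# Values copied from policy.data-routing-policy in the config above.
POLICY = {
    "require_zero_data_retention": True, "require_no_training": True,
    "max_retention_days": 0, "require_in_memory_only": True,
    "sanitize_before_provider": True, "tokenize_sensitive_fields": True,
    "allow_internet_egress": False, "local_only_processing": True,
}

# ASSUMPTION: each policy key on the left requires the target key on the right.
MUST_BE_TRUE = {
    "require_zero_data_retention": "zero_data_retention",
    "require_no_training": "training_opt_out",
    "require_in_memory_only": "in_memory_only",
    "sanitize_before_provider": "sanitized",
    "tokenize_sensitive_fields": "accepts_tokenized_input",
    "local_only_processing": "local_only_processing",
}


def unmet_terms(data_policy, policy):
    """Return the target fields that fail the policy (empty list = eligible)."""
    unmet = [
        target_key
        for policy_key, target_key in MUST_BE_TRUE.items()
        if policy.get(policy_key) and not data_policy.get(target_key, False)
    ]
    # Missing metadata is treated as non-compliant (unbounded retention).
    if data_policy.get("retention_days", float("inf")) > policy["max_retention_days"]:
        unmet.append("retention_days")
    if data_policy.get("allow_internet_egress", True) and not policy["allow_internet_egress"]:
        unmet.append("allow_internet_egress")
    return unmet


for target_id, data_policy in TARGETS.items():
    print(target_id, unmet_terms(data_policy, POLICY))
```

Under these assumptions, `contracted-zdr` has no unmet terms, while `standard-cloud` fails every term except the training opt-out. That kind of per-field listing is often more useful in a vendor conversation than a single pass/fail flag.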

During rollout, start in warn mode if your provider inventory is incomplete, then switch on_no_compliant_provider to block once you trust the metadata. That transition is usually where engineering and procurement finally align, because both teams can now see whether the portfolio actually contains enough compliant capacity.
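The difference between the two rollout modes can be modeled as follows. This is a sketch of one plausible reading, assuming that warn logs the gap but still lets the request proceed and that block refuses it. The function name and its parameters are hypothetical and are not part of any Keeptrusts API.

```python
import logging

logger = logging.getLogger("routing_sketch")

VALID_MODES = {"warn", "block"}


def resolve_no_compliant_provider(mode, request_id, best_effort_target):
    """Decide what happens when no target satisfies the routing policy.

    ASSUMPTION: "warn" records the gap and lets traffic through;
    "block" stops the request. Returns the target to use, or None.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "block":
        logger.error("request %s blocked: no compliant provider", request_id)
        return None
    logger.warning(
        "request %s: no compliant provider, routing to %s anyway",
        request_id,
        best_effort_target,
    )
    return best_effort_target
```

Running in the permissive mode first shows how many requests would have been blocked, which is a direct measure of how much compliant capacity the portfolio is missing.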

To support negotiations and later audits, pair the routing policy with kt events and kt export-jobs. The contract discussion becomes much more useful when you can show how often targets were excluded, which requests were blocked, and whether the remaining route still met the agreed constraints.
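Once decision records have been exported, a short script can turn them into the numbers a contract review needs. The sketch below assumes a JSON Lines file with `decision` and `excluded_targets` fields. That record shape is hypothetical and chosen only for illustration, so adapt the field names to whatever your export actually contains. The two sample records are made up.

```python
import json
from collections import Counter


def summarize(lines):
    """Count blocked requests and per-target exclusions from JSON Lines text."""
    excluded = Counter()
    blocked = 0
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the export
        record = json.loads(line)
        total += 1
        excluded.update(record.get("excluded_targets", []))
        if record.get("decision") == "block":
            blocked += 1
    return {"total": total, "blocked": blocked, "excluded": dict(excluded)}


# Made-up sample records in the assumed (hypothetical) shape.
SAMPLE = [
    '{"request_id": "r1", "decision": "route", "selected": "contracted-zdr",'
    ' "excluded_targets": ["standard-cloud"]}',
    '{"request_id": "r2", "decision": "block", "selected": null,'
    ' "excluded_targets": ["standard-cloud"]}',
]

summary = summarize(SAMPLE)
print(summary)
```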

Results and impact

The immediate impact is that vendor promises become enforceable at request time. Teams stop relying on tribal knowledge about which provider is "safe" and instead let the gateway make that decision from declared metadata.

The second impact is cleaner procurement language. Legal and vendor-management teams can ask for exactly the fields the platform enforces: zero retention, no training, retention days, tokenized-input support, and local-only handling where needed.

The third impact is better evidence. If a regulator, customer, or internal reviewer asks how provider restrictions are enforced, the answer is not a PDF in a contract repository. It is the live routing policy plus the exported decision record.

Key takeaways

  • Vendor agreements should use terms that map directly to Data Routing Policy and providers.targets[].data_policy.
  • A fallback provider that lacks the same guarantees weakens the agreement even if the primary provider is compliant.
  • Routing policy is the runtime companion to the contract, not an optional extra.
  • Evidence from kt events and kt export-jobs makes provider-governance claims easier to defend.
  • Zero Retention Endpoints and Data Residency Guide are useful references when defining the provider requirement set.

Next steps