
Financial Services AI Cost Control: 60% Spend Reduction Case Study

Most financial-services AI overspend does not come from one bad contract. It comes from default behavior. Teams route every workflow to the most capable model, reuse the same prompt stack for low-risk and high-risk tasks, and let several business units spend from one invisible pool. The result is predictable: support summaries cost as much as exception reviews, analysts stop thinking about route design, and finance sees the bill only after the usage pattern is already entrenched.

Keeptrusts can materially change that pattern because cost control is not only a finance exercise. It is a routing and governance exercise. The same platform features that help with compliance also help reduce unnecessary spend: separating low-risk and high-risk routes, ordering providers intentionally, constraining which routes can reach more expensive models, and assigning budget ownership through Spend & Wallets. In a composite but realistic financial-services rollout, that combination is enough to cut model spend by roughly 60% without forcing teams back to manual work.

Use this page when

  • You are operating multiple financial-services assistants for support, fraud review, compliance drafting, or internal operations and model spend is rising too quickly.
  • You want a realistic case study that connects provider routing choices with budget ownership and policy design.
  • You want the rollout to align with Finance, Spend & Wallets, and Reduce AI Spend.

Primary audience

  • Primary: Technical Leaders
  • Secondary: Technical Engineers, AI Agents

The problem

Financial-services teams usually start by standardizing on one strong model because it reduces decision friction. That works for the first assistant. It fails at scale. A chatbot answering generic servicing questions does not need the same model profile as a regulated-report draft or a supervised review lane. If every request goes to the premium path, the cost curve becomes detached from business value almost immediately.

The second mistake is flat budgeting. When fraud operations, support, risk, and compliance all consume one shared spend pool, nobody sees the tradeoff between route design and spend. Teams optimize locally for convenience and the organization loses the ability to ask which assistant is worth its cost.

The third mistake is assuming cost control and governance compete with each other. In practice they reinforce each other. A route with Data Routing Policy, attributable users, and auditable events is also easier to tune for cost because you can see what it is doing, who is using it, and which provider path it chose. Strong governance is what makes spend optimization safe.

The solution

The composite pattern that delivered the 60% reduction is straightforward.

First, split routes by workflow value. High-volume, low-risk tasks such as service summaries and operations notes go to a lower-cost target first. Higher-risk tasks such as exception analysis, supervised review drafts, or complex regulatory text can still reach a stronger model, but only when the route needs it.

Second, keep provider eligibility explicit with data-routing-policy. That prevents low-cost optimization from quietly weakening retention or training requirements. Cheaper is not useful if it crosses the handling boundary.
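The first two steps can be pictured as a filter followed by an ordered walk. The sketch below is only an illustration of that idea in Python, not Keeptrusts' implementation: the `Target` class and the `is_compliant` and `route` helpers are hypothetical names, and it assumes the fallback is triggered by a failed call to the earlier target, which the policy file itself does not spell out.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Target:
    """Hypothetical stand-in for one provider target and its declared data policy."""
    id: str
    zero_data_retention: bool
    training_opt_out: bool
    retention_days: int


def is_compliant(target: Target, max_retention_days: int = 0) -> bool:
    # Eligibility is decided on handling terms only; price plays no part here.
    return (
        target.zero_data_retention
        and target.training_opt_out
        and target.retention_days <= max_retention_days
    )


def route(targets: Sequence[Target], call: Callable[[Target], str]) -> Tuple[str, str]:
    """Try compliant targets in their listed order and return (target id, response)."""
    eligible = [t for t in targets if is_compliant(t)]
    if not eligible:
        # Mirrors the intent of on_no_compliant_provider: block.
        raise PermissionError("no compliant provider; request blocked")
    last_error = None
    for target in eligible:
        try:
            return target.id, call(target)
        except RuntimeError as exc:  # assumption: a failed call triggers fallback
            last_error = exc
    raise RuntimeError("every eligible target failed") from last_error


# The cheaper target is listed first, so it is the default.
LOW = Target("low-cost-zdr", True, True, 0)
HIGH = Target("high-capability-zdr", True, True, 0)
```

Because the compliance filter runs before the ordered walk, reordering targets for cost can never admit a provider that fails the handling requirements.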

Third, give every team or lane a budget owner through Spend & Wallets. Once fraud, support, and compliance each own their own spend visibility, prompt discipline and route tuning improve quickly.
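As a rough mental model of per-lane ownership, the Python sketch below keeps one budget and one running total per cost center. It is not the Spend & Wallets API: the `WalletLedger` class, its methods, and the budget figures are hypothetical and exist only to show why attribution changes behavior.

```python
from collections import defaultdict
from decimal import Decimal


class WalletLedger:
    """Hypothetical per-cost-center ledger; not the Spend & Wallets API."""

    def __init__(self, budgets):
        self.budgets = dict(budgets)       # cost center -> budget
        self.spent = defaultdict(Decimal)  # cost center -> spend so far

    def charge(self, cost_center, amount):
        # Spend with no owner is rejected rather than pooled.
        if cost_center not in self.budgets:
            raise KeyError(f"no wallet for cost center {cost_center!r}")
        self.spent[cost_center] += amount
        return self.remaining(cost_center)

    def remaining(self, cost_center):
        return self.budgets[cost_center] - self.spent[cost_center]


# Illustrative budgets only.
ledger = WalletLedger({"fraud": Decimal("500"), "support": Decimal("200")})
ledger.charge("support", Decimal("12.50"))
```

The design point is that every charge needs a cost center, which is the same information the `X-Cost-Center` header requirement carries on each request.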

Finally, use event evidence to find waste. If a route is hitting the stronger model for work that could have been handled by the cheaper path, the audit and event trail makes that visible.
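One way to turn that trail into a tuning signal is to compute, per route, the share of spend that landed on the stronger target. The sketch below assumes a simplified event shape with `route`, `target`, and `cost_usd` fields; those field names and the `premium_share_by_route` helper are hypothetical and do not describe the actual export schema.

```python
from collections import defaultdict


def premium_share_by_route(events, premium_target="high-capability-zdr"):
    """Return {route: fraction of that route's spend that hit the premium target}."""
    total = defaultdict(float)
    premium = defaultdict(float)
    for event in events:
        total[event["route"]] += event["cost_usd"]
        if event["target"] == premium_target:
            premium[event["route"]] += event["cost_usd"]
    return {r: premium[r] / total[r] for r in total if total[r] > 0}


# Illustrative events only; field names are assumptions.
sample_events = [
    {"route": "support-summary", "target": "high-capability-zdr", "cost_usd": 3.0},
    {"route": "support-summary", "target": "low-cost-zdr", "cost_usd": 1.0},
    {"route": "exception-review", "target": "high-capability-zdr", "cost_usd": 2.0},
]
shares = premium_share_by_route(sample_events)
```

A low-risk route with a high premium share, like `support-summary` in this sample, is the first candidate for route tuning.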

Implementation

This example shows a simple ordered-routing pattern for a financial-services program that wants a cheaper default path and a more capable fallback path under the same compliance boundary.

pack:
  name: financial-services-cost-optimized
  version: 1.0.0
  enabled: true

providers:
  targets:
    - id: low-cost-zdr
      provider: openai
      model: gpt-5.4-mini-mini
      secret_key_ref:
        env: OPENAI_API_KEY
      data_policy:
        zero_data_retention: true
        training_opt_out: true
        retention_days: 0
    - id: high-capability-zdr
      provider: openai
      model: gpt-5.4-mini
      secret_key_ref:
        env: OPENAI_API_KEY
      data_policy:
        zero_data_retention: true
        training_opt_out: true
        retention_days: 0

  routing:
    strategy: ordered

policies:
  chain:
    - rbac
    - pii-detector
    - data-routing-policy
    - audit-logger

  policy:
    rbac:
      deny_if_missing:
        - X-User-ID
        - X-User-Role
        - X-Cost-Center
      require_auth: true

    pii-detector:
      action: redact
      detect_patterns:
        - 'ACCOUNT-[0-9]{8,12}'
        - 'CASE-[0-9]{8}'
      redaction:
        marker_format: label
        include_metadata: true

    data-routing-policy:
      require_zero_data_retention: true
      require_no_training: true
      max_retention_days: 0
      on_no_compliant_provider: block

    audit-logger: {}

The key point is that cost optimization happens through route design, not hidden application logic. The cheaper model is the default because the route says so, not because a developer remembered to use the right endpoint.
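To see what the `rbac` and `pii-detector` entries ask for, it can help to run the same checks by hand. The Python sketch below reuses the required header names and the two detection patterns from the policy file; the `missing_headers` and `redact` helpers and the `[LABEL]` marker style are assumptions for illustration, and the gateway's actual redaction output may look different.

```python
import re

# Header names and regexes are copied from the policy file above.
REQUIRED_HEADERS = ("X-User-ID", "X-User-Role", "X-Cost-Center")
PATTERNS = {
    "ACCOUNT": re.compile(r"ACCOUNT-[0-9]{8,12}"),
    "CASE": re.compile(r"CASE-[0-9]{8}"),
}


def missing_headers(headers):
    """Return the required headers that are absent (names compared case-insensitively)."""
    present = {name.lower() for name in headers}
    return [h for h in REQUIRED_HEADERS if h.lower() not in present]


def redact(text):
    """Replace each pattern match with a bracketed label (marker style is an assumption)."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


redacted = redact("Refund for ACCOUNT-1234567890 under CASE-20240001")
```

Neither pattern is anchored, so `CASE-[0-9]{8}` also matches the first eight digits of a longer number; testing the expressions against real identifier formats before rollout is worthwhile.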

kt policy lint --file ./financial-services-cost-optimized.yaml
kt gateway run --policy-config ./financial-services-cost-optimized.yaml --port 41002
kt export create --format json --filter "policy=data-routing-policy,audit-logger"

Together, these commands lint the policy file, run the gateway with it on port 41002, and export the data-routing-policy and audit-logger events that show how the route behaved.

Results and impact

In the composite rollout, three changes drove the 60% reduction over the first operating period.

  • High-volume support and operations work moved to the lower-cost target by default.
  • Teams stopped sharing one invisible budget because Spend & Wallets made ownership explicit.
  • Route and event evidence made it obvious which workflows were overusing the expensive path.

The important part is that controls did not weaken to achieve the savings. Both providers were still required to meet the same declared handling conditions through Data Routing Policy. The savings came from choosing the right model lane more often, not from loosening the compliance boundary.
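The arithmetic behind a reduction of this size is simple to check. The Python sketch below uses purely hypothetical inputs, not figures from the rollout: it assumes 75% of spend-weighted traffic moves to a target that costs 20% as much per request, and the `spend_reduction` helper is an illustrative name.

```python
def spend_reduction(share_moved, price_ratio):
    """Fractional saving when `share_moved` of traffic shifts to a cheaper target.

    share_moved: fraction of (previously all-premium) spend that is rerouted.
    price_ratio: cheaper target's cost relative to the premium target (0 to 1).
    """
    if not (0 <= share_moved <= 1 and 0 <= price_ratio <= 1):
        raise ValueError("both inputs must be between 0 and 1")
    new_cost = (1 - share_moved) + share_moved * price_ratio
    return 1 - new_cost


# Hypothetical inputs: 0.25 + 0.75 * 0.2 = 0.40 of the original spend remains.
reduction = spend_reduction(share_moved=0.75, price_ratio=0.2)
print(f"{reduction:.0%}")
```

The same function shows the sensitivity: with these assumed prices, moving only half of the traffic would save 40%, so how much work qualifies for the cheaper lane matters as much as the price gap.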

Key takeaways

  • Financial-services AI cost control is mainly a routing problem, not just a procurement problem.
  • Use ordered provider routing for cheaper default paths only when the handling boundary stays constant.
  • Pair route design with Spend & Wallets so every business lane owns its consumption.
  • Use Data Routing Policy so spend optimization does not weaken data handling requirements.
  • Extend the program with Reduce AI Spend and Rate Limiting & Cost Control.

Next steps