Model Pricing Configuration: Accurate Cost Per Token Per Provider

Model pricing looks like bookkeeping until it fails. Then every downstream cost control starts drifting at the same time. Reserves are based on weak estimates, settlements are harder to explain, routing decisions compare the wrong rates, and finance loses confidence in monthly reports. In Keeptrusts, pricing data is not optional metadata. It is the cost catalog the gateway uses to estimate, reserve, settle, and report AI spend correctly by provider and model.

Use this page when

You are turning on wallets or spend tracking and need reliable per-token pricing.
You run the same model family across multiple providers and want the gateway to compare them accurately.
You need chargeback and monthly reporting data that finance will trust.

Primary audience

Primary: Technical Engineers
Secondary: FinOps owners, Technical Leaders

The problem

Cost governance becomes unreliable when pricing is incomplete, stale, or too generic.

The most common failure is missing provider specificity. Teams know they use something called gpt-5.4-mini-mini, but pricing differs by provider contract, deployment model, or negotiated rate. If the gateway treats everything as one generic model price, the estimate stops reflecting the real funding decision.

The second failure is fallback estimation. Keeptrusts documentation makes clear that when the model-pricing catalog lacks an exact match, the platform can fall back to a model family rate or a configurable default. That is useful for continuity, but it also lowers confidence. A low-confidence estimate is better than no estimate, yet it should not become the long-term operating model for budget enforcement.

The third failure is distorted optimization. Cost-aware routing only works if the gateway understands the real cost of each target. If premium and economy lanes are priced incorrectly, the routing policy is no longer optimizing spend. It is optimizing bad data.

The fourth failure is reporting drift. Cache avoided-cost numbers, reserve-and-settle reconciliation, team chargeback, and executive summaries all inherit the pricing catalog. If that catalog is wrong, every report looks precise while telling the wrong story.

The solution

Keeptrusts uses a model-pricing catalog as the source of truth for request-time cost estimation and post-response settlement. That means pricing records should be maintained with the same discipline as identity mappings or gateway policies.

Seed pricing for each model and provider combination you actually run in production. Verify it after load. Update it whenever the provider changes rates or you add a new model family. The point is not bureaucratic completeness. The point is operational trust.

Once pricing is accurate, three things improve immediately. Reserve and settle becomes far more predictable because the pre-dispatch estimate is closer to reality. Routing comparisons become meaningful because the gateway can distinguish cheap, moderate, and premium targets correctly. And spend reports stop collapsing multiple contracts into one blended guess.

Pricing accuracy also matters for cache economics. Keeptrusts documentation notes that avoided-cost calculations on cache hits use the same model-pricing table that live reserve-and-settle flows use. If pricing is correct, savings dashboards tell the truth. If pricing is stale, your ROI story is stale too.

Implementation

The fastest way to improve pricing accuracy is to seed the models you actually use and verify the catalog immediately.

export KEEPTRUSTS_API_URL="http://localhost:41002"
export KEEPTRUSTS_API_TOKEN="admin-token"

curl -s -X POST "$KEEPTRUSTS_API_URL/v1/model-pricing" \
  -H "Authorization: Bearer $KEEPTRUSTS_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4-mini-mini",
    "provider": "openai",
    "input_cost_per_1k_tokens": 0.00015,
    "output_cost_per_1k_tokens": 0.0006
  }' | jq .

curl -s "$KEEPTRUSTS_API_URL/v1/model-pricing" \
  -H "Authorization: Bearer $KEEPTRUSTS_API_TOKEN" | jq '.[] | {model, provider, input_cost_per_1k_tokens, output_cost_per_1k_tokens}'

That sequence does two things. It creates an explicit pricing record and then confirms what the gateway will use during spend estimation and reporting. In practice, teams should repeat this for every production model they route to, especially when those targets serve different departments or have different pricing tiers.

The rollout discipline is simple. Treat pricing updates as part of adding a new provider target. If you create a new target without matching pricing records, you have introduced a hidden quality issue into every downstream spend control.

Results and impact

Accurate pricing does not save money by itself. It enables the controls that do.

When the model-pricing catalog is correct, wallet reserves are closer to final cost, so teams are less likely to hit confusing discrepancies between estimate and settlement. Finance can trust that a team charged for gpt-5.4-mini-mini traffic on one provider is not being blended into a premium lane on another contract.

Routing also improves. If a platform owner is using cost-aware filters or comparing targets inside a model group, the gateway now evaluates real prices instead of rough assumptions. That prevents the subtle but expensive mistake of routing toward what looks cheap in config but is actually mispriced in the catalog.

At leadership level, this creates a cleaner story. Monthly spend reports, chargeback allocations, and cache savings numbers all align around one pricing source of truth. That makes budget meetings shorter because the argument shifts from data quality to actual optimization decisions.

Key takeaways

Model pricing is a runtime dependency for reserve, settle, routing, and reporting.
Missing or stale pricing pushes the platform into fallback estimates and lower-confidence cost control.
Accurate provider-specific records are necessary for clean chargeback and reliable cache savings numbers.
Pricing updates should be part of every new provider or model rollout.
Good spend governance depends on trustworthy cost data before it depends on good dashboards.

Model Pricing Configuration: Accurate Cost Per Token Per Provider

Use this page when​

Primary audience​

The problem​

The solution​

Implementation​

Results and impact​

Key takeaways​

Next steps​