Wallets and Credits

Keeptrusts includes a first-class credit wallet system that lets your organization pre-allocate and track AI spend at every scope — organization, team, and individual user. Wallets work alongside budgets and model pricing to give you end-to-end financial control over every LLM request that flows through the gateway.

Use this page when

  • You need to understand how Keeptrusts wallets control AI spend at the organization, team, and user level.
  • You are setting up credit allocation, self-service top-up, or configuring wallet fail modes.
  • You want to understand the reserve/settle flow, cost tickets, or balance cascade logic.

Primary audience

  • Primary: Technical Engineers
  • Secondary: AI Agents, Technical Leaders

Overview

A wallet is a balance ledger attached to a scope (organization, team, or user). When a request passes through the gateway and generates spend, the cost is settled against the appropriate wallet in priority order: user → team → organization. Credits are denominated in USD by default; exchange rates are used to convert when the wallet currency differs.

The gateway makes a synchronous reserve call to the control-plane API before forwarding each request to the upstream provider. On provider response, the reservation is settled to the actual cost and any surplus is released back to the wallet. This two-phase reserve/settle design prevents double-spend while keeping balances authoritative at all times.
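The two phases can be illustrated with a small in-memory ledger. This is a minimal sketch, not the Keeptrusts implementation: the `Wallet` class, its field names, and the integer reservation ids are hypothetical, and the source does not say what happens when the actual cost exceeds the reserved amount (here the difference is simply debited).

```python
from dataclasses import dataclass, field
from decimal import Decimal
import itertools


class InsufficientBalance(Exception):
    """Raised when a wallet cannot cover a reservation."""


@dataclass
class Wallet:
    balance: Decimal                     # available (unreserved) credits
    reserved: Decimal = Decimal("0")     # held for in-flight requests
    _holds: dict = field(default_factory=dict)
    _ids: itertools.count = field(default_factory=itertools.count)

    def reserve(self, estimated_cost: Decimal) -> int:
        """Phase 1: hold the estimated cost before calling the provider."""
        if estimated_cost > self.balance:
            raise InsufficientBalance(f"need {estimated_cost}, have {self.balance}")
        self.balance -= estimated_cost
        self.reserved += estimated_cost
        reservation_id = next(self._ids)
        self._holds[reservation_id] = estimated_cost
        return reservation_id

    def settle(self, reservation_id: int, actual_cost: Decimal) -> None:
        """Phase 2: charge the actual cost and release any surplus."""
        held = self._holds.pop(reservation_id)
        self.reserved -= held
        # Surplus (held - actual_cost) returns to the available balance.
        # Assumption: an overrun is debited from the available balance.
        self.balance += held - actual_cost


wallet = Wallet(balance=Decimal("10.00"))
rid = wallet.reserve(Decimal("1.20"))   # hold before forwarding upstream
wallet.settle(rid, Decimal("1.00"))     # provider responded; 0.20 is released
```

Because the hold is removed from the available balance at reserve time, two concurrent requests cannot both spend the same credits.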


Wallet scopes

| Scope | Owner | Purpose |
| --- | --- | --- |
| Organization | Org admins | Top-level credit pool. All unallocated traffic falls through to this wallet. |
| Team | Org admins (allocation), team leads (view) | Sub-pool carved out of the organization wallet for a specific team. |
| User | Org admins (allocation), the user (view) | Individual credit allocation for a named user. |

Every organization starts with a single organization wallet (auto-created with zero balance on first reserve attempt). Team and user wallets are also auto-created with zero balance when a request first attributes to that scope. Balance is only available after an admin credits or allocates funds.


Balance cascade

When the gateway receives an LLM request, it resolves the effective scope from the request context (X-User-Id header → X-Team-Id header → organization fallback) and walks the cascade in this order:

  1. User wallet — if a user wallet exists and has sufficient balance, the cost is reserved there.
  2. Team wallet — if the user wallet is absent or insufficient, the team wallet is checked next.
  3. Organization wallet — if neither a user nor a team wallet can cover the cost, the organization wallet is used.
  4. Cost ticket — if all wallets are insufficient, a cost ticket is created (see below).

The cascade means that unused team headroom automatically protects individual users, and unused org headroom automatically protects all teams.
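The cascade order can be sketched as a simple lookup. The function name, the `balances` mapping, and the `"cost_ticket"` return value are illustrative assumptions; a missing wallet is modeled as an absent key or `None`.

```python
from decimal import Decimal
from typing import Optional

# Most specific scope first, as described in the list above.
CASCADE = ("user", "team", "organization")


def pick_wallet(balances: dict[str, Optional[Decimal]], cost: Decimal) -> str:
    """Return the first scope whose wallet can cover the cost."""
    for scope in CASCADE:
        balance = balances.get(scope)
        if balance is not None and balance >= cost:
            return scope
    # No wallet has enough headroom: the request ends in a cost ticket.
    return "cost_ticket"


# The user wallet is too low, so the team wallet absorbs the cost.
scope = pick_wallet(
    {"user": Decimal("0.10"), "team": Decimal("5.00"), "organization": Decimal("100.00")},
    Decimal("1.00"),
)
```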

Attribution

Balance is always consumed at the most specific scope that has headroom. The spend log records which wallet scope was ultimately debited, so you can audit exactly how credits were consumed.


Allocating credits

Organization admins can allocate credits from the organization wallet to a team using POST /v1/wallets/allocate. The allocated amount is transferred out of the org wallet and into the team wallet immediately. Credits can be returned to the org wallet at any time using POST /v1/wallets/reclaim (only unspent balance can be reclaimed).

Allocation flow

Org admin → POST /v1/wallets/allocate
{ "team_id": "...", "amount": 500.00, "currency": "USD" }
→ Org wallet balance decreases by 500.00
→ Team wallet balance increases by 500.00
→ Transaction recorded on both wallets
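
As a rough illustration, the allocate call above could be built with the Python standard library. The endpoint path and body fields come from the flow above; the base URL, the bearer-token `Authorization` header, and the function name are assumptions, so check your deployment's authentication scheme before using anything like this.

```python
import json
import urllib.request


def build_allocate_request(base_url: str, team_id: str, amount: float,
                           token: str, currency: str = "USD") -> urllib.request.Request:
    """Build (but do not send) a POST /v1/wallets/allocate request."""
    body = json.dumps({"team_id": team_id, "amount": amount, "currency": currency})
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/wallets/allocate",
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; not confirmed by this page.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Hypothetical host and team id, for illustration only.
req = build_allocate_request("https://keeptrusts.example", "team-123", 500.00, "TOKEN")
# urllib.request.urlopen(req) would send it; omitted here.
```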

Admins can review spend visibility from Usage and use the current wallet and payment flows in the console for balances, transactions, and top-ups.


Cost tickets

A cost ticket is created when a request arrives but no wallet in the cascade has enough balance to cover the estimated cost. The gateway returns HTTP 402 with the ticket payload. The caller can then top up their balance and retry:

Cost ticket lifecycle

1. Gateway calls POST /v1/gateway/wallets/reserve; no wallet in the cascade can cover the estimated cost
2. API returns 402 + cost_ticket { id, estimated_cost, expires_at (24h TTL), ... }
3. Gateway returns 402 to the caller with the cost_ticket body
4. User or operator tops up the wallet (admin credit or self-service)
5. Caller resends the original request with header: X-Cost-Ticket: <ticket_id>
6. Gateway calls POST /v1/gateway/wallets/redeem-ticket
→ Reserves using the ticket's frozen cost (no re-estimation)
→ Forwards request to upstream provider
7. Upstream responds → gateway settles the reservation
8. Ticket is marked redeemed

Ticket integrity

Each ticket stores a SHA-256 hash of the original request body. Redeeming a ticket with a different body (e.g., a longer prompt) is rejected. This prevents a ticket from being used to authorize a more expensive request than was originally estimated.
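The integrity check can be sketched with `hashlib`. This assumes the hash is taken over the raw request bytes; the page does not say whether the body is canonicalized first, and the function names are hypothetical.

```python
import hashlib
import hmac


def body_hash(body: bytes) -> str:
    """SHA-256 hex digest of the raw request body."""
    return hashlib.sha256(body).hexdigest()


def can_redeem(ticket_hash: str, retry_body: bytes) -> bool:
    """A ticket is only valid for the exact body it was issued for."""
    # compare_digest avoids leaking how many characters matched.
    return hmac.compare_digest(ticket_hash, body_hash(retry_body))


original = b'{"model": "example-model", "prompt": "short prompt"}'
ticket_hash = body_hash(original)  # stored when the ticket is created

same_body_ok = can_redeem(ticket_hash, original)
longer_body_ok = can_redeem(ticket_hash, original + b" plus extra text")
```

Under this assumption, even a whitespace change in the retried body produces a different digest, so the caller must resend the original bytes unchanged.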

Preventing cost tickets

Set a balance alert threshold on your organization wallet (see the Wallet API alert_threshold_pct field). You'll receive a notification before balance runs out so you can top up proactively.


Reserve buffer and currency conversion

The reserve call adds a configurable buffer percentage (default 20%) to the estimated cost before checking balances. This accounts for uncertainty in pre-request token estimation (e.g., tool calls that expand the actual completion). The buffer is stored on the organization wallet settings and applied automatically during reserve.

All model pricing records are stored in USD. When an org uses a different currency (EUR, GBP, JPY, AUD, CAD, or CNY), the reserve converts the USD cost to the org currency using admin-managed exchange rates. Each reservation stores the exact exchange rate used, so settlement reuses the same conversion basis.
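A worked example of the arithmetic, as a sketch only: the function name, the four-decimal rounding, and the 0.90 exchange rate are illustrative assumptions, not values from Keeptrusts.

```python
from decimal import Decimal, ROUND_HALF_UP


def reserve_amount(estimated_usd: Decimal,
                   buffer_pct: Decimal = Decimal("20"),
                   usd_to_wallet_rate: Decimal = Decimal("1")) -> Decimal:
    """Amount to hold: estimate plus buffer, converted to the wallet currency."""
    buffered = estimated_usd * (Decimal("1") + buffer_pct / Decimal("100"))
    converted = buffered * usd_to_wallet_rate
    # Assumption: amounts are rounded to four decimal places.
    return converted.quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)


# 0.50 USD estimate, 20% buffer -> 0.60 USD; at a made-up rate of 0.90 -> 0.54.
held = reserve_amount(Decimal("0.50"), usd_to_wallet_rate=Decimal("0.90"))
```

Storing the rate with the reservation means the settled amount is converted on the same basis, even if the admin-managed rate changes while the request is in flight.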


Fail mode when the wallet API is unreachable

If the wallet control-plane API is unreachable, the gateway behaves according to the org's wallet_fail_mode:

  • closed (default) — the request is rejected immediately with HTTP 503.
  • open — the request is forwarded without a reserve. Spend accrues unguarded until the API recovers.

This setting lives with the organization wallet configuration and is applied gateway-side on every reserve attempt.
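The decision can be sketched as follows. The `FailMode` enum, the `WalletApiUnreachable` exception, and the `gate_request` function are hypothetical names used to illustrate the two modes.

```python
from enum import Enum
from typing import Callable, Optional


class FailMode(str, Enum):
    CLOSED = "closed"
    OPEN = "open"


class WalletApiUnreachable(Exception):
    """Stand-in for a connection failure to the wallet API."""


def gate_request(reserve: Callable[[], None],
                 fail_mode: FailMode) -> tuple[bool, Optional[int]]:
    """Return (forward_upstream, error_status) for one request."""
    try:
        reserve()
    except WalletApiUnreachable:
        if fail_mode is FailMode.OPEN:
            return True, None    # forwarded without a reservation
        return False, 503        # closed: reject immediately
    return True, None            # reservation succeeded


def unreachable() -> None:
    raise WalletApiUnreachable()


closed_result = gate_request(unreachable, FailMode.CLOSED)
open_result = gate_request(unreachable, FailMode.OPEN)
```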


Viewing balance in the chat sidebar

When using an AI assistant that routes through a Keeptrusts gateway, the current wallet balance for your scope is displayed in the chat sidebar under Credits. The sidebar shows:

  • Available balance — credits remaining in your user wallet (or team/org fallback if no user wallet is configured).
  • Reserved — amount currently held as reservations for in-flight requests.
  • Currency — the denomination of the wallet.

The balance widget refreshes after each completed request. If your balance drops below the configured alert threshold, the widget shows a warning banner with a link to the top-up or request-more-credits flow.


Self-service wallet top-up

Keeptrusts supports provider-agnostic wallet funding in two customer-facing places:

  • Usage and the current wallet/payment flows in the console
  • the Top Up action shown in chat when a user hits an insufficient-balance response and self-service funding is available

Console flow

Organization billing admins use the console wallet top-up flow to:

  1. review whether the org has an active payment provider
  2. choose one of the approved preset amounts or enter a custom amount within the configured min/max range
  3. choose the target wallet scope (organization, team, or user)
  4. launch the secure provider popup
  5. return to Keeptrusts while the payment session moves through created → captured → completed

The same page also shows recent payment history with filters for status and date range.

Chat flow

In chat, self-service funding currently targets the current organization wallet:

  1. the user opens Top Up from the insufficient-balance card or the settings sidebar
  2. Keeptrusts creates a checkout session for the selected amount using the active provider
  3. the user completes checkout in the popup
  4. chat refreshes the wallet balance after capture succeeds
  5. the user retries the original prompt with the same cost ticket

If top-up is unavailable, chat shows "Contact your org admin to add credits" instead of the self-service action.

Organization payment settings

The console wallet top-up flow exposes the org-controlled payment rules that affect checkout:

| Setting | What it controls |
| --- | --- |
| Allowed preset amounts | Suggested quick-pick amounts shown in the UI |
| Minimum amount | The smallest allowed top-up |
| Maximum amount | The largest allowed top-up |
| Target scopes | Which wallet scopes can receive self-service top-up funds: organization, team, and/or user |
| Recent payment history | Audit view of created, capturing, captured, completed, canceled, failed, refunded, and expired sessions |

User-wallet funding can still be unavailable even when user is listed in the target scopes. That happens when self-service user top-up is disabled at the wallet-settings layer or when the active provider is not ready for the org currency.
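As an illustration of how these rules constrain a checkout, the sketch below validates a requested top-up against the settings. The `TopUpSettings` type and its field names are hypothetical, and it does not model the wallet-settings or provider-readiness checks mentioned above.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class TopUpSettings:
    preset_amounts: tuple[Decimal, ...]   # suggestions only, not a whitelist
    min_amount: Decimal
    max_amount: Decimal
    target_scopes: frozenset[str]


def validate_top_up(settings: TopUpSettings, amount: Decimal, scope: str) -> list[str]:
    """Return a list of problems; an empty list means the request is allowed."""
    errors = []
    if scope not in settings.target_scopes:
        errors.append(f"scope '{scope}' cannot receive self-service top-ups")
    if amount < settings.min_amount:
        errors.append(f"amount is below the minimum of {settings.min_amount}")
    if amount > settings.max_amount:
        errors.append(f"amount is above the maximum of {settings.max_amount}")
    return errors


settings = TopUpSettings(
    preset_amounts=(Decimal("25"), Decimal("50"), Decimal("100")),
    min_amount=Decimal("10"),
    max_amount=Decimal("1000"),
    target_scopes=frozenset({"organization", "team"}),
)
problems = validate_top_up(settings, Decimal("5"), "user")
```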

When self-service top-up is unavailable

Manual wallet credits still work even when no provider checkout is configured or the org wallet currency is unsupported by the active provider.


Model pricing

Wallet costs are computed using model pricing records managed by your platform admin. Pricing records store per-million-token rates for input, output, and cached input tokens for each model. If a pricing record exists for the requested model, the gateway uses it to compute the exact cost for the reservation and settlement. If no pricing record is found, the cost is computed from the declarative config pricing block (if declared) or recorded as zero.
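The cost calculation can be sketched like this. The `ModelPricing` type, the rates, and the assumption that cached input tokens are counted separately from regular input tokens are all illustrative.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional

PER_MILLION = Decimal(1_000_000)


@dataclass(frozen=True)
class ModelPricing:
    input_per_million: Decimal          # USD per 1M input tokens
    output_per_million: Decimal         # USD per 1M output tokens
    cached_input_per_million: Decimal   # USD per 1M cached input tokens


def request_cost(record: Optional[ModelPricing],
                 config_pricing: Optional[ModelPricing],
                 input_tokens: int, output_tokens: int,
                 cached_input_tokens: int = 0) -> Decimal:
    """USD cost: pricing record first, then config pricing, else zero."""
    pricing = record or config_pricing
    if pricing is None:
        return Decimal("0")
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million
             + cached_input_tokens * pricing.cached_input_per_million)
    return total / PER_MILLION


# Made-up rates: 3.00 / 15.00 / 0.30 USD per million tokens.
rates = ModelPricing(Decimal("3.00"), Decimal("15.00"), Decimal("0.30"))
cost = request_cost(rates, None, input_tokens=1000, output_tokens=500,
                    cached_input_tokens=2000)
# 0.0030 + 0.0075 + 0.0006 = 0.0111 USD
```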

Platform admins can seed and update model pricing via POST /v1/admin/model-pricing. The repository ships a seed helper at scripts/seed-model-pricing.sh for development and testing.


Agent usage constraints

For agent workloads, wallet consumption can be bounded at the agent level via usage constraints. A usage constraint attaches a credit ceiling, token limit, or request count limit to a specific agent identity so that runaway agent loops cannot drain a team or org wallet.

Usage constraints are managed via the agent usage constraint endpoints.
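A minimal sketch of how such a constraint might be evaluated before an agent request is admitted. The types, field names, and return values are hypothetical; the actual constraint schema is defined by the agent usage constraint endpoints.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class UsageConstraint:
    credit_ceiling: Optional[Decimal] = None   # max credits the agent may spend
    token_limit: Optional[int] = None          # max total tokens
    request_limit: Optional[int] = None        # max number of requests


@dataclass
class AgentUsage:
    credits: Decimal = Decimal("0")
    tokens: int = 0
    requests: int = 0


def violated_limit(constraint: UsageConstraint, usage: AgentUsage,
                   next_cost: Decimal, next_tokens: int) -> Optional[str]:
    """Name the first limit the next request would exceed, or None."""
    c = constraint
    if c.request_limit is not None and usage.requests + 1 > c.request_limit:
        return "request_limit"
    if c.token_limit is not None and usage.tokens + next_tokens > c.token_limit:
        return "token_limit"
    if c.credit_ceiling is not None and usage.credits + next_cost > c.credit_ceiling:
        return "credit_ceiling"
    return None


limit = UsageConstraint(credit_ceiling=Decimal("5.00"), request_limit=100)
so_far = AgentUsage(credits=Decimal("4.90"), tokens=12_000, requests=40)
blocked_by = violated_limit(limit, so_far, next_cost=Decimal("0.25"), next_tokens=800)
```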


For AI systems

  • Canonical terms: Keeptrusts, wallets, credits, reserve/settle, cost ticket, balance cascade, wallet scope, allocation, reclaim, wallet_fail_mode, alert_threshold_pct, usage constraints.
  • API endpoints: POST /v1/wallets/allocate, POST /v1/wallets/reclaim, POST /v1/gateway/wallets/reserve, POST /v1/gateway/wallets/redeem-ticket, GET /v1/wallets/balance, POST /v1/admin/model-pricing.
  • Config names: wallet_fail_mode (closed/open), alert_threshold_pct, X-User-Id header, X-Team-Id header, X-Cost-Ticket header.
  • Console surfaces: Usage, chat sidebar Credits widget, Top Up action.
  • Best next pages: Cost and Spend, Billing and Plans, Usage, Payments API, Members, Teams, and Roles.

For engineers

  • The gateway makes a synchronous reserve call before every LLM request — if the wallet API is unreachable and wallet_fail_mode is closed (default), requests are rejected with HTTP 503.
  • Cost tickets return HTTP 402 and include an expires_at (24h TTL) and a SHA-256 hash of the request body — redeeming with a different body is rejected.
  • Set alert_threshold_pct on the org wallet to receive notifications before balance runs out.
  • Use scripts/seed-model-pricing.sh to populate model pricing in development; production pricing is managed via POST /v1/admin/model-pricing.
  • The reserve buffer (default 20%) accounts for token estimation uncertainty; the buffer is stored on org wallet settings.

For leaders

  • Wallets provide granular financial control over AI spend — allocate budgets per team and per user without shared-pool overruns.
  • The cascade model (user → team → org) means unused team headroom protects individuals, and unused org headroom protects all teams automatically.
  • Cost tickets create a hard stop when budgets are exhausted, preventing unexpected overspend; self-service top-up can unblock teams without admin intervention.
  • Set wallet_fail_mode: closed for strict cost control or open if uptime is prioritized over budget enforcement during API outages.
  • Usage constraints at the agent level prevent runaway agent loops from draining team wallets.

Next steps