Capacity Sizing & Resource Planning

Proper capacity planning ensures the Keeptrusts platform performs reliably under your expected load. This guide provides resource requirements per component, throughput benchmarks, scaling thresholds, and cloud instance recommendations.

Use this page when

  • You are planning CPU, memory, and storage requirements for a Keeptrusts deployment.
  • You need throughput benchmarks for the gateway under different policy chain complexities.
  • You are selecting cloud instance types or on-premises hardware for production.
  • You want scaling thresholds to determine when to add gateway instances or upgrade database resources.

Primary audience

  • Primary: Technical Engineers
  • Secondary: AI Agents, Technical Leaders

Component Resource Profiles

Baseline Requirements

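| Component          | Min CPU   | Min Memory | Min Disk | Network  |
| ------------------ | --------- | ---------- | -------- | -------- |
| CLI Gateway        | 0.5 vCPU  | 256 MB     | 100 MB   | Moderate |
| API Server         | 1 vCPU    | 512 MB     | 500 MB   | Moderate |
| Console            | 0.5 vCPU  | 256 MB     | 200 MB   | Low      |
| Chat Workbench     | 0.5 vCPU  | 256 MB     | 200 MB   | Low      |
| PostgreSQL         | 2 vCPU    | 2 GB       | 20 GB+   | Low      |
| Worker (export)    | 0.5 vCPU  | 256 MB     | 1 GB+    | Low      |
| Worker (lifecycle) | 0.25 vCPU | 128 MB     | 100 MB   | Low      |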
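
Recommended Requirements

| Component       | CPU    | Memory | Disk       | Notes                            |
| --------------- | ------ | ------ | ---------- | -------------------------------- |
| CLI Gateway     | 2 vCPU | 512 MB | 1 GB       | Per instance; scale horizontally |
| API Server      | 4 vCPU | 2 GB   | 10 GB      | Single instance or HA pair       |
| Console         | 1 vCPU | 512 MB | 1 GB       | SSR workload, low steady-state   |
| Chat Workbench  | 1 vCPU | 512 MB | 1 GB       | Similar to console               |
| PostgreSQL      | 4 vCPU | 8 GB   | 100 GB SSD | IOPS-dependent on event volume   |
| Worker (export) | 2 vCPU | 1 GB   | 10 GB      | Spiky during exports             |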

Throughput Benchmarks

Gateway Throughput

The gateway's throughput depends on policy chain complexity and upstream provider latency:

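| Scenario                          | Requests/sec (per instance) | Avg Latency Overhead |
| --------------------------------- | --------------------------- | -------------------- |
| Passthrough (no policies)         | ~5,000                      | < 1 ms               |
| Simple content filter (1 policy)  | ~3,000                      | 2–5 ms               |
| Full policy chain (5 policies)    | ~1,000                      | 10–20 ms             |
| Policy chain + knowledge base     | ~500                        | 20–50 ms             |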

Total gateway throughput scales linearly with instances behind a load balancer.

API Event Ingest Throughput

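| Configuration              | Events/sec | Notes                           |
| -------------------------- | ---------- | ------------------------------- |
| Single API, local Postgres | ~2,000     | Limited by DB write IOPS        |
| Single API, SSD Postgres   | ~5,000     | NVMe recommended                |
| HA API (2 instances)       | ~8,000     | Shared Postgres connection pool |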
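
The ingest figures above are per second, while event volumes elsewhere in this guide are quoted per day. The following is a minimal sketch for comparing the two. The capacity numbers are copied from the table; the function names and the `peak_factor` (an assumed ratio of peak to average traffic, set to 10 here purely for illustration) are hypothetical and should be replaced with values measured in your environment.

```python
SECONDS_PER_DAY = 86_400

# Events/sec figures copied from the ingest table above.
INGEST_CAPACITY_EPS = {
    "single_api_local_postgres": 2_000,
    "single_api_ssd_postgres": 5_000,
    "ha_api_2_instances": 8_000,
}


def peak_events_per_sec(daily_events: int, peak_factor: float = 10.0) -> float:
    """Average events/sec scaled by an assumed peak-to-average factor."""
    return daily_events / SECONDS_PER_DAY * peak_factor


def ingest_utilization(daily_events: int, config: str, peak_factor: float = 10.0) -> float:
    """Fraction of the benchmarked ingest capacity used at the assumed peak."""
    return peak_events_per_sec(daily_events, peak_factor) / INGEST_CAPACITY_EPS[config]


if __name__ == "__main__":
    for daily in (10_000, 100_000, 1_000_000):
        peak = peak_events_per_sec(daily)
        used = ingest_utilization(daily, "single_api_local_postgres")
        print(f"{daily:>9,} events/day: ~{peak:,.1f} events/sec at peak, {used:.1%} of capacity")
```

Under these assumptions, sustained daily volume sits well below the benchmarked ingest rates, so short bursts rather than daily totals are what to compare against the table.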

Database Sizing

Event storage grows linearly with traffic:

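| Daily Events | Monthly Storage | Annual Storage | Recommended Disk |
| ------------ | --------------- | -------------- | ---------------- |
| 10,000       | ~500 MB         | ~6 GB          | 50 GB            |
| 100,000      | ~5 GB           | ~60 GB         | 200 GB           |
| 1,000,000    | ~50 GB          | ~600 GB        | 1 TB+            |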

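Factor in your retention policy: KEEPTRUSTS_EVENT_RETENTION_HOURS automatically prunes events older than the configured window, so steady-state storage depends on the retention period rather than on total history.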
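
Because event storage grows linearly with traffic, the steady-state footprint can be estimated as daily events × retention days × bytes per event. The sketch below is illustrative only: the function names are hypothetical, and the per-event size is left as a parameter because it varies by deployment. The sizing table works out to roughly 1.7 KB per event (500 MB / 300,000 events per month), so both that figure and a 1 KB figure are shown.

```python
def event_storage_gb(daily_events: int, retention_days: int, bytes_per_event: float) -> float:
    """Steady-state event storage in decimal GB once retention pruning is active."""
    return daily_events * retention_days * bytes_per_event / 1e9


def retention_hours(retention_days: int) -> int:
    """Retention window in hours, the unit KEEPTRUSTS_EVENT_RETENTION_HOURS expects."""
    return retention_days * 24


if __name__ == "__main__":
    # Per-event size implied by the sizing table: 500 MB over 30 days of 10,000 events.
    table_bytes_per_event = 500e6 / (10_000 * 30)
    for bytes_per_event in (1_000.0, table_bytes_per_event):
        size = event_storage_gb(100_000, 90, bytes_per_event)
        print(f"100,000 events/day, 90 days, {bytes_per_event:,.0f} B/event: ~{size:.1f} GB")
    print("KEEPTRUSTS_EVENT_RETENTION_HOURS =", retention_hours(90))
```

Add headroom for indexes, write-ahead logs, and backups on top of this estimate; the Recommended Disk column in the table already includes a generous margin.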

Scaling Thresholds

When to Scale the Gateway

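| Metric                 | Threshold               | Action                                          |
| ---------------------- | ----------------------- | ----------------------------------------------- |
| CPU utilization        | > 70% sustained (5 min) | Add gateway instance                            |
| Request queue depth    | > 100 pending           | Add gateway instance                            |
| p99 latency (overhead) | > 100 ms                | Add gateway instance or check policy complexity |
| Active connections     | > 80% of configured max | Add gateway instance                            |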
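
The gateway thresholds above can be encoded as a simple check that runs against whatever metrics source you already have. This is a sketch, not a Keeptrusts API: `GatewayMetrics` and `gateway_scaling_actions` are hypothetical names, and the inputs are assumed to have been collected already (for example, from Prometheus).

```python
from dataclasses import dataclass


@dataclass
class GatewayMetrics:
    cpu_percent_5m: float     # CPU utilization sustained over 5 minutes
    queue_depth: int          # pending requests
    p99_overhead_ms: float    # p99 latency overhead added by the gateway
    active_connections: int
    max_connections: int      # configured connection limit


def gateway_scaling_actions(m: GatewayMetrics) -> list[str]:
    """Return the actions from the threshold table that the metrics trigger."""
    actions = []
    if m.cpu_percent_5m > 70:
        actions.append("CPU > 70% sustained: add gateway instance")
    if m.queue_depth > 100:
        actions.append("Queue depth > 100: add gateway instance")
    if m.p99_overhead_ms > 100:
        actions.append("p99 overhead > 100 ms: add gateway instance or check policy complexity")
    if m.active_connections > 0.8 * m.max_connections:
        actions.append("Connections > 80% of max: add gateway instance")
    return actions


if __name__ == "__main__":
    sample = GatewayMetrics(cpu_percent_5m=75.0, queue_depth=10,
                            p99_overhead_ms=20.0, active_connections=50,
                            max_connections=100)
    for action in gateway_scaling_actions(sample):
        print(action)
```

The same pattern extends to the API and PostgreSQL tables by adding one comparison per row.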

When to Scale the API

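| Metric               | Threshold               | Action                                 |
| -------------------- | ----------------------- | -------------------------------------- |
| CPU utilization      | > 80% sustained (5 min) | Scale vertically or add HA instance    |
| DB connection pool   | > 80% utilization       | Increase pool size or add read replica |
| Event ingest latency | > 500 ms p99            | Scale database IOPS                    |
| Memory usage         | > 85%                   | Scale vertically                       |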

When to Scale PostgreSQL

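| Metric           | Threshold                | Action                                    |
| ---------------- | ------------------------ | ----------------------------------------- |
| Disk usage       | > 70%                    | Expand disk or adjust retention           |
| Connection count | > 80% of max_connections | Increase limit or add PgBouncer           |
| Replication lag  | > 30 seconds             | Scale replica resources                   |
| Query p99        | > 200 ms                 | Add indexes, optimize queries, scale IOPS |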

Cloud Instance Sizing

AWS

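| Component  | Instance Type       | vCPU | Memory | Storage    | Monthly Cost (est.) |
| ---------- | ------------------- | ---- | ------ | ---------- | ------------------- |
| Gateway    | c6i.large           | 2    | 4 GB   | 20 GB gp3  | ~$65                |
| API Server | m6i.xlarge          | 4    | 16 GB  | 50 GB gp3  | ~$150               |
| PostgreSQL | db.r6g.xlarge (RDS) | 4    | 32 GB  | 200 GB gp3 | ~$350               |
| Console    | t3.medium           | 2    | 4 GB   | 20 GB gp3  | ~$35                |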

Azure

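| Component  | VM Size                     | vCPU | Memory | Storage   | Monthly Cost (est.) |
| ---------- | --------------------------- | ---- | ------ | --------- | ------------------- |
| Gateway    | Standard_D2s_v5             | 2    | 8 GB   | 30 GB P10 | ~$75                |
| API Server | Standard_D4s_v5             | 4    | 16 GB  | 64 GB P15 | ~$155               |
| PostgreSQL | GP_Standard_D4ds_v5 (Flex)  | 4    | 16 GB  | 256 GB    | ~$300               |
| Console    | Standard_B2ms               | 2    | 8 GB   | 30 GB     | ~$60                |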

GCP

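| Component  | Machine Type                  | vCPU | Memory | Storage           | Monthly Cost (est.) |
| ---------- | ----------------------------- | ---- | ------ | ----------------- | ------------------- |
| Gateway    | e2-standard-2                 | 2    | 8 GB   | 20 GB pd-ssd      | ~$55                |
| API Server | e2-standard-4                 | 4    | 16 GB  | 50 GB pd-ssd      | ~$110               |
| PostgreSQL | db-custom-4-16384 (Cloud SQL) | 4    | 16 GB  | 200 GB SSD        | ~$320               |
| Console    | e2-medium                     | 2    | 4 GB   | 20 GB pd-standard | ~$35                |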

Sizing by Organization Scale

Small (< 50 users, < 10K events/day)

Single VM or small Docker host:
4 vCPU, 8 GB RAM, 100 GB SSD

┌─────────────────────────────────┐
│   All components on one host    │
│   Gateway + API + Console + DB  │
└─────────────────────────────────┘

# docker-compose resource limits
services:
  keeptrusts-api:
    deploy:
      resources:
        limits: { cpus: '1.0', memory: 512M }
  keeptrusts-gateway:
    deploy:
      resources:
        limits: { cpus: '0.5', memory: 256M }
  postgres:
    deploy:
      resources:
        limits: { cpus: '2.0', memory: 4G }

Medium (50–500 users, 10K–100K events/day)

Separate hosts for compute and data:

┌────────────────┐    ┌────────────────┐
│ Compute Host   │    │ Database Host  │
│ 8 vCPU, 16 GB  │    │ 4 vCPU, 16 GB  │
│ Gateway ×2     │    │ PostgreSQL     │
│ API            │    │ 200 GB SSD     │
│ Console        │    └────────────────┘
└────────────────┘

Large (500+ users, 100K+ events/day)

Dedicated hosts per component:

┌──────────┐   ┌──────────┐   ┌──────────┐
│ Gateway  │   │ Gateway  │   │ Gateway  │
│    ×3    │   │    ×3    │   │    ×3    │
└─────┬────┘   └─────┬────┘   └─────┬────┘
      └──────────────┼──────────────┘
                ┌────▼────┐
                │ API ×2  │  (HA pair)
                └────┬────┘
                ┌────▼────┐
                │Postgres │  (primary + replica)
                │   HA    │
                └─────────┘

Capacity Planning Formula

Estimate gateway instances needed:

Instances = ceil( Peak_RPS / (RPS_per_instance × 0.7) )

Here, 0.7 is the target utilization (70%), which leaves 30% headroom for spikes.

Example: 2,000 peak RPS with full policy chain (1,000 RPS/instance):

Instances = ceil( 2000 / (1000 × 0.7) ) = ceil(2.86) = 3
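
The formula above translates directly into a small helper. This is a sketch under stated assumptions: the per-instance figures are copied from the Gateway Throughput table, the function and dictionary names are hypothetical, and your own load-test results should replace the benchmark numbers where they differ.

```python
import math

# Requests/sec per instance, copied from the Gateway Throughput table.
GATEWAY_RPS_PER_INSTANCE = {
    "passthrough": 5_000,
    "simple_filter": 3_000,
    "full_chain": 1_000,
    "chain_plus_kb": 500,
}


def gateway_instances(peak_rps: float, rps_per_instance: float,
                      target_utilization: float = 0.7) -> int:
    """Instances = ceil(peak_rps / (rps_per_instance * target_utilization)), minimum 1."""
    if peak_rps < 0 or rps_per_instance <= 0 or not 0 < target_utilization <= 1:
        raise ValueError("peak_rps must be >= 0, rps_per_instance > 0, utilization in (0, 1]")
    return max(1, math.ceil(peak_rps / (rps_per_instance * target_utilization)))


if __name__ == "__main__":
    for scenario, rps in GATEWAY_RPS_PER_INSTANCE.items():
        print(f"{scenario}: {gateway_instances(2_000, rps)} instance(s) for 2,000 peak RPS")
```

For the worked example, `gateway_instances(2000, 1000)` returns 3, matching the calculation above. Production deployments typically run at least two instances for availability regardless of what the formula returns.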

Performance Testing

Load Test the Gateway

# Using hey (HTTP load generator); -D reads the request body from a file
hey -n 10000 -c 100 -m POST \
  -H "Authorization: Bearer kt_gk_test" \
  -H "Content-Type: application/json" \
  -D request.json \
  http://gateway.internal:41002/v1/chat/completions

# Using wrk; post.lua must set wrk.method = "POST", wrk.body, and the same two headers
wrk -t4 -c100 -d60s -s post.lua http://gateway.internal:41002/v1/chat/completions

Benchmark Database Write Performance

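# pgbench: test write throughput
# One-time setup: create pgbench's test tables (this adds pgbench_* tables to the target database)
pgbench -i -h db.internal -U keeptrusts keeptrusts

# 20 clients, 4 worker threads, 60 seconds, progress report every 5 seconds.
# The database name is passed positionally because -d enables debug output
# on older pgbench releases.
pgbench -h db.internal -U keeptrusts \
  -c 20 -j 4 -T 60 -P 5 keeptrusts

# Results to watch:
# - TPS (transactions per second)
# - Average latency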
Monitoring for Capacity Decisions

Use the metrics from the Monitoring Infrastructure guide to inform scaling decisions:

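# Gateway CPU utilization (%); headroom is 100 minus this value
100 - (avg by(instance)(rate(node_cpu_seconds_total{mode="idle", job="keeptrusts-gateway"}[5m])) * 100)

# Predicted database size 24 hours from now (GiB), extrapolated from the 7-day trend
predict_linear(pg_database_size_bytes{datname="keeptrusts"}[7d], 86400) / 1073741824

# Database disk growth rate (GiB/day)
deriv(pg_database_size_bytes{datname="keeptrusts"}[7d]) * 86400 / 1073741824

# Event ingest rate (events/sec, averaged over 1 hour)
rate(keeptrusts_api_events_ingested_total[1h])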
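
To turn a disk-growth trend into a scaling deadline, the same idea as the `predict_linear` query (a least-squares line extrapolated forward) can be applied offline to exported samples. The sketch below is an illustration rather than part of Keeptrusts: the function names are hypothetical, samples are assumed to be `(unix_seconds, size_bytes)` pairs, and the default 70% limit is taken from the PostgreSQL disk-usage threshold earlier in this guide.

```python
from typing import Optional

SECONDS_PER_DAY = 86_400


def linear_fit(samples: list[tuple[float, float]]) -> tuple[float, float]:
    """Least-squares slope and intercept for (t_seconds, value) samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one timestamp")
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / var_t
    return slope, mean_v - slope * mean_t


def days_until_threshold(samples: list[tuple[float, float]], disk_bytes: float,
                         threshold: float = 0.70) -> Optional[float]:
    """Days until the fitted trend reaches threshold * disk_bytes.

    Returns 0.0 if already at or over the limit, None if usage is flat or shrinking.
    """
    slope, intercept = linear_fit(samples)
    latest_t = max(t for t, _ in samples)
    current = slope * latest_t + intercept  # fitted size at the latest sample
    limit = disk_bytes * threshold
    if current >= limit:
        return 0.0
    if slope <= 0:
        return None
    return (limit - current) / slope / SECONDS_PER_DAY


if __name__ == "__main__":
    # Illustrative data: 10 GB growing by 1 GB per day, sampled daily for a week.
    week = [(day * SECONDS_PER_DAY, (10 + day) * 1e9) for day in range(7)]
    print(days_until_threshold(week, disk_bytes=100e9))
```

A result measured in weeks rather than months is a signal to expand the disk or shorten retention before the 70% threshold is reached.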

Next steps

For AI systems

  • Canonical terms: Keeptrusts capacity planning, gateway throughput, resource sizing, scaling thresholds, PostgreSQL IOPS, horizontal scaling, load testing.
  • Key config/commands: Gateway benchmarks (5,000 req/s passthrough, 1,000 req/s with 5-policy chain); baseline resources (gateway: 0.5 vCPU/256 MB min, 2 vCPU/512 MB recommended); PostgreSQL (2 vCPU/2 GB min, 4 vCPU/8 GB/100 GB SSD recommended); PromQL queries for capacity monitoring.
  • Best next pages: Monitoring Infrastructure, Load Balancing, Docker Deployment.

For engineers

  • Prerequisites: Understand your expected request volume, policy chain complexity, and event retention requirements.
  • Gateway scales horizontally — each instance handles ~1,000 req/s with a full 5-policy chain; add instances behind a load balancer for more throughput.
  • Validate with: run load tests using the hey or wrk commands in Performance Testing; monitor predict_linear(pg_database_size_bytes[7d], 86400) for disk growth forecasting; track gateway CPU utilization and headroom via Prometheus.
  • PostgreSQL IOPS is the primary bottleneck for high event volumes — use SSD storage and monitor write latency.

For leaders

  • Gateway is the cheapest component to scale (0.5 vCPU minimum per instance), so horizontal scaling is cost-effective.
  • PostgreSQL is the primary infrastructure cost driver for high-volume deployments (IOPS, storage growth).
  • Plan storage growth based on event volume: ~1 KB per event × daily event count × retention days.
  • Over-provisioning the gateway by 2× baseline provides headroom for traffic spikes without latency degradation.