Capacity Sizing & Resource Planning

Proper capacity planning ensures the Keeptrusts platform performs reliably under your expected load. This guide provides resource requirements per component, throughput benchmarks, scaling thresholds, and cloud instance recommendations.

Use this page when

You are planning CPU, memory, and storage requirements for a Keeptrusts deployment.
You need throughput benchmarks for the gateway under different policy chain complexities.
You are selecting cloud instance types or on-premises hardware for production.
You want scaling thresholds to determine when to add gateway instances or upgrade database resources.

Primary audience

Primary: Technical Engineers
Secondary: AI Agents, Technical Leaders

Component Resource Profiles

Baseline Requirements

Component	Min CPU	Min Memory	Min Disk	Network
CLI Gateway	0.5 vCPU	256 MB	100 MB	Moderate
API Server	1 vCPU	512 MB	500 MB	Moderate
Console	0.5 vCPU	256 MB	200 MB	Low
Chat Workbench	0.5 vCPU	256 MB	200 MB	Low
PostgreSQL	2 vCPU	2 GB	20 GB+	Low
Worker (export)	0.5 vCPU	256 MB	1 GB+	Low
Worker (lifecycle)	0.25 vCPU	128 MB	100 MB	Low

Recommended Production Resources

Component	CPU	Memory	Disk	Notes
CLI Gateway	2 vCPU	512 MB	1 GB	Per instance; scale horizontally
API Server	4 vCPU	2 GB	10 GB	Single instance or HA pair
Console	1 vCPU	512 MB	1 GB	SSR workload, low steady-state
Chat Workbench	1 vCPU	512 MB	1 GB	Similar to console
PostgreSQL	4 vCPU	8 GB	100 GB SSD	IOPS-dependent on event volume
Worker (export)	2 vCPU	1 GB	10 GB	Spiky during exports

Throughput Benchmarks

Gateway Throughput

The gateway's throughput depends on policy chain complexity and upstream provider latency:

Scenario	Requests/sec (per instance)	Avg Latency Overhead
Passthrough (no policies)	~5,000	< 1 ms
Simple content filter (1 policy)	~3,000	2–5 ms
Full policy chain (5 policies)	~1,000	10–20 ms
Policy chain + knowledge base	~500	20–50 ms

Total gateway throughput scales linearly with instances behind a load balancer.

API Event Ingest Throughput

Configuration	Events/sec	Notes
Single API, local Postgres	~2,000	Limited by DB write IOPS
Single API, SSD Postgres	~5,000	NVMe recommended
HA API (2 instances)	~8,000	Shared Postgres connection pool

Database Sizing

Event storage grows linearly with traffic:

Daily Events	Monthly Storage	Annual Storage	Recommended Disk
10,000	~500 MB	~6 GB	50 GB
100,000	~5 GB	~60 GB	200 GB
1,000,000	~50 GB	~600 GB	1 TB+

Factor in retention policy — KEEPTRUSTS_EVENT_RETENTION_HOURS automatically prunes old events.

Scaling Thresholds

When to Scale the Gateway

Metric	Threshold	Action
CPU utilization	> 70% sustained (5 min)	Add gateway instance
Request queue depth	> 100 pending	Add gateway instance
p99 latency (overhead)	> 100 ms	Add gateway instance or check policy complexity
Active connections	> 80% of configured max	Add gateway instance

When to Scale the API

Metric	Threshold	Action
CPU utilization	> 80% sustained (5 min)	Scale vertically or add HA instance
DB connection pool	> 80% utilization	Increase pool size or add read replica
Event ingest latency	> 500 ms p99	Scale database IOPS
Memory usage	> 85%	Scale vertically

When to Scale PostgreSQL

Metric	Threshold	Action
Disk usage	> 70%	Expand disk or adjust retention
Connection count	> 80% of max_connections	Increase limit or add PgBouncer
Replication lag	> 30 seconds	Scale replica resources
Query p99	> 200 ms	Add indexes, optimize queries, scale IOPS

Cloud Instance Sizing

AWS

Component	Instance Type	vCPU	Memory	Storage	Monthly Cost (est.)
Gateway	c6i.large	2	4 GB	20 GB gp3	~$65
API Server	m6i.xlarge	4	16 GB	50 GB gp3	~$150
PostgreSQL	db.r6g.xlarge (RDS)	4	32 GB	200 GB gp3	~$350
Console	t3.medium	2	4 GB	20 GB gp3	~$35

Azure

Component	VM Size	vCPU	Memory	Storage	Monthly Cost (est.)
Gateway	Standard_D2s_v5	2	8 GB	30 GB P10	~$75
API Server	Standard_D4s_v5	4	16 GB	64 GB P15	~$155
PostgreSQL	GP_Standard_D4ds_v5 (Flex)	4	16 GB	256 GB	~$300
Console	Standard_B2ms	2	8 GB	30 GB	~$60

GCP

Component	Machine Type	vCPU	Memory	Storage	Monthly Cost (est.)
Gateway	e2-standard-2	2	8 GB	20 GB pd-ssd	~$55
API Server	e2-standard-4	4	16 GB	50 GB pd-ssd	~$110
PostgreSQL	db-custom-4-16384 (Cloud SQL)	4	16 GB	200 GB SSD	~$320
Console	e2-medium	2	4 GB	20 GB pd-standard	~$35

Sizing by Organization Scale

Small (< 50 users, < 10K events/day)

Single VM or small Docker host:
  4 vCPU, 8 GB RAM, 100 GB SSD

  ┌─────────────────────────────────┐
  │  All components on one host     │
  │  Gateway + API + Console + DB   │
  └─────────────────────────────────┘

# docker-compose resource limits
services:
  keeptrusts-api:
    deploy:
      resources:
        limits: { cpus: '1.0', memory: 512M }
  keeptrusts-gateway:
    deploy:
      resources:
        limits: { cpus: '0.5', memory: 256M }
  postgres:
    deploy:
      resources:
        limits: { cpus: '2.0', memory: 4G }

Medium (50–500 users, 10K–100K events/day)

Separate hosts for compute and data:

  ┌────────────────┐  ┌────────────────┐
  │ Compute Host   │  │ Database Host  │
  │ 8 vCPU, 16 GB  │  │ 4 vCPU, 16 GB  │
  │ Gateway ×2     │  │ PostgreSQL     │
  │ API            │  │ 200 GB SSD     │
  │ Console        │  └────────────────┘
  └────────────────┘

Large (500+ users, 100K+ events/day)

Dedicated hosts per component:

  ┌──────────┐  ┌──────────┐  ┌──────────┐
  │ Gateway  │  │ Gateway  │  │ Gateway  │
  │ ×3       │  │ ×3       │  │ ×3       │
  └─────┬────┘  └─────┬────┘  └─────┬────┘
        └──────────────┼──────────────┘
                  ┌────▼────┐
                  │ API ×2  │  (HA pair)
                  └────┬────┘
                  ┌────▼────┐
                  │Postgres │  (primary + replica)
                  │ HA      │
                  └─────────┘

Capacity Planning Formula

Estimate gateway instances needed:

Instances = ceil( Peak_RPS / (RPS_per_instance × 0.7) )

Where 0.7 is the target utilization (70%) to leave headroom for spikes.

Example: 2,000 peak RPS with full policy chain (1,000 RPS/instance):

Instances = ceil( 2000 / (1000 × 0.7) ) = ceil(2.86) = 3

Performance Testing

Load Test the Gateway

# Using hey (HTTP load generator)
hey -n 10000 -c 100 -m POST \
  -H "Authorization: Bearer kt_gk_test" \
  -H "Content-Type: application/json" \
  -D request.json \
  http://gateway.internal:41002/v1/chat/completions

# Using wrk
wrk -t4 -c100 -d60s -s post.lua http://gateway.internal:41002/v1/chat/completions

Benchmark Database Write Performance

# pgbench — test write throughput
pgbench -h db.internal -U keeptrusts -d keeptrusts \
  -c 20 -j 4 -T 60 -P 5

# Results to watch:
# - TPS (transactions per second)
# - Average latency

Monitoring for Capacity Decisions

Use the metrics from the Monitoring Infrastructure guide to inform scaling decisions:

# Gateway CPU headroom
100 - (avg by(instance)(rate(node_cpu_seconds_total{mode="idle", job="keeptrusts-gateway"}[5m])) * 100)

# Database disk growth rate (GB/day)
predict_linear(pg_database_size_bytes{datname="keeptrusts"}[7d], 86400) / 1073741824

# Event ingest rate trend
rate(keeptrusts_api_events_ingested_total[1h])

Next steps

Monitoring Infrastructure — implement monitoring to validate capacity decisions
Load Balancing — distribute traffic across scaled gateway instances
Docker Deployment — resource configuration in Docker Compose

For AI systems

Canonical terms: Keeptrusts capacity planning, gateway throughput, resource sizing, scaling thresholds, PostgreSQL IOPS, horizontal scaling, load testing.
Key config/commands: Gateway benchmarks (5,000 req/s passthrough, 1,000 req/s with 5-policy chain); baseline resources (gateway: 0.5 vCPU/256 MB min, 2 vCPU/512 MB recommended); PostgreSQL (2 vCPU/2 GB min, 4 vCPU/8 GB/100 GB SSD recommended); PromQL queries for capacity monitoring.
Best next pages: Monitoring Infrastructure, Load Balancing, Docker Deployment.

For engineers

Prerequisites: Understand your expected request volume, policy chain complexity, and event retention requirements.
Gateway scales horizontally — each instance handles ~1,000 req/s with a full 5-policy chain; add instances behind a load balancer for more throughput.
Validate with: run load tests using the provided benchmark script; monitor predict_linear(pg_database_size_bytes[7d], 86400) for disk growth forecasting; track gateway CPU headroom via Prometheus.
PostgreSQL IOPS is the primary bottleneck for high event volumes — use SSD storage and monitor write latency.

For leaders

Gateway is the cheapest component to scale (0.5 vCPU per instance) — horizontal scaling is cost-effective.
PostgreSQL is the primary infrastructure cost driver for high-volume deployments (IOPS, storage growth).
Plan storage growth based on event volume: ~1 KB per event × daily event count × retention days.
Over-provisioning the gateway by 2× baseline provides headroom for traffic spikes without latency degradation.

Use this page when​

Primary audience​

Component Resource Profiles​

Baseline Requirements​

Recommended Production Resources​

Throughput Benchmarks​

Gateway Throughput​

API Event Ingest Throughput​

Database Sizing​

Scaling Thresholds​

When to Scale the Gateway​

When to Scale the API​

When to Scale PostgreSQL​

Cloud Instance Sizing​

AWS​

Azure​

GCP​

Sizing by Organization Scale​

Small (< 50 users, < 10K events/day)​

Medium (50–500 users, 10K–100K events/day)​

Large (500+ users, 100K+ events/day)​

Capacity Planning Formula​

Performance Testing​

Load Test the Gateway​

Benchmark Database Write Performance​

Monitoring for Capacity Decisions​

Next steps​

For AI systems​

For engineers​

For leaders​