Zero-Downtime Upgrade Procedures

Upgrading AI governance infrastructure requires coordination across stateless gateways, a stateful API, and frontend consoles. This guide covers procedures for upgrading each component without service interruption.

Use this page when

  • You are upgrading gateway, API, or console components without service interruption
  • You need to understand version compatibility, upgrade order, and rollback procedures
  • You want to implement rolling updates, blue-green console deployment, or canary gateway releases

Primary audience

  • Primary: Technical Engineers
  • Secondary: AI Agents, Technical Leaders

Version Compatibility Matrix

Keeptrusts components follow semantic versioning. Each API minor version is compatible with gateways and consoles on the same or the previous minor version:

API Version   Gateway Versions   Console Versions   Notes
1.5.x         1.4.x – 1.5.x      1.4.x – 1.5.x      Current
1.4.x         1.3.x – 1.4.x      1.3.x – 1.4.x      Supported
1.3.x         1.2.x – 1.3.x      1.2.x – 1.3.x      End of life

Upgrade order: API first, then gateways, then console. The API is backward-compatible with the previous minor version of gateways and consoles.
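The matrix rule can be checked in a script before an upgrade starts. The following is a minimal sketch, assuming plain MAJOR.MINOR.PATCH version strings; the function names are hypothetical and not part of any Keeptrusts tooling.

```shell
# Hypothetical helpers for this sketch; not shipped with Keeptrusts.

# Print the major number of a MAJOR.MINOR.PATCH version string.
major_of() {
  echo "${1%%.*}"
}

# Print the minor number of a MAJOR.MINOR.PATCH version string.
minor_of() {
  local rest="${1#*.}"   # drop "MAJOR."
  echo "${rest%%.*}"     # keep the text before the next "."
}

# Succeed when a gateway or console version may run against an API version:
# same major, and component minor equal to the API minor or one below it.
is_compatible() {
  local api="$1" component="$2" diff
  [ "$(major_of "$api")" = "$(major_of "$component")" ] || return 1
  diff=$(( $(minor_of "$api") - $(minor_of "$component") ))
  [ "$diff" -eq 0 ] || [ "$diff" -eq 1 ]
}
```

For example, `is_compatible 1.5.0 1.4.2` succeeds, while `is_compatible 1.4.0 1.5.0` fails, which is why the API is upgraded before the gateways and console.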

Rolling Gateway Updates

Gateways are stateless — rolling updates are straightforward. The key constraint is maintaining policy enforcement continuity.

Kubernetes Rolling Update

apiVersion: apps/v1
kind: Deployment
metadata:
  name: keeptrusts-gateway
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:
    spec:
      containers:
        - name: gateway
          image: keeptrusts/gateway:1.5.0
          readinessProbe:
            httpGet:
              path: /readyz
              port: 41002
            initialDelaySeconds: 10
            periodSeconds: 5
          lifecycle:
            preStop:
              exec:
                command: ["sh", "-c", "sleep 10"]

The preStop hook delays container shutdown by 10 seconds, which gives in-flight requests time to complete before the pod is terminated. The readiness probe keeps traffic away from pods that are still loading policy configuration.
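To see what the strategy settings mean for capacity during a rollout, the arithmetic can be sketched as below. `rollout_bounds` is a hypothetical helper written for this page, and it assumes maxUnavailable and maxSurge are given as absolute pod counts (Kubernetes also accepts percentages, which this sketch does not handle).

```shell
# Hypothetical helper: print the capacity bounds during a rolling update.
# Arguments: replicas, maxUnavailable, maxSurge (absolute counts only).
rollout_bounds() {
  local replicas="$1" max_unavailable="$2" max_surge="$3"
  # Fewest pods guaranteed available, and most pods that may exist at once.
  echo "min_available=$(( replicas - max_unavailable )) max_total=$(( replicas + max_surge ))"
}
```

With the manifest above, `rollout_bounds 3 1 1` prints `min_available=2 max_total=4`: at least two gateways keep enforcing policy at every point in the rollout, and the cluster needs room for one extra pod.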

Update Procedure

  1. Update the image tag in your deployment manifest
  2. Apply the update:
    kubectl set image deployment/keeptrusts-gateway \
    gateway=keeptrusts/gateway:1.5.0 \
    -n keeptrusts
  3. Monitor the rollout:
    kubectl rollout status deployment/keeptrusts-gateway -n keeptrusts
  4. Verify policy enforcement:
    # Send a test request that should be blocked (-i prints the response status line)
    curl -i -X POST https://gateway.example.com/v1/chat/completions \
    -H "Authorization: Bearer $TEST_KEY" \
    -H "Content-Type: application/json" \
    -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "test blocked content"}]}'
    # Expect 409 if policy is active
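If the verification step runs in a pipeline rather than by hand, the status code can be turned into an explicit verdict. This is a sketch only: `enforcement_verdict` is a hypothetical name, and treating every status other than 409 and 2xx as inconclusive is an assumption of this example, not documented gateway behavior.

```shell
# Hypothetical helper: interpret the HTTP status of the blocked-content test.
enforcement_verdict() {
  case "$1" in
    409) echo "enforced" ;;          # the status this guide expects for a blocked request
    2??) echo "not enforced" ;;      # the request was allowed through
    *)   echo "inconclusive" ;;      # auth failure, outage, or other unexpected status
  esac
}

# Example wiring (not executed here):
#   status=$(curl -s -o /dev/null -w '%{http_code}' -X POST ... )
#   [ "$(enforcement_verdict "$status")" = "enforced" ] || kubectl rollout undo ...
```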

Rollback

kubectl rollout undo deployment/keeptrusts-gateway -n keeptrusts

API Migration Auto-Apply

The API server automatically applies pending database migrations at startup. This enables zero-downtime upgrades when migrations are backward-compatible.

Safe Migration Patterns

Migrations that are safe for zero-downtime deployment:

  • Add a column with a default — existing code ignores the new column
  • Add a new table — no impact on existing queries
  • Add an index concurrently — use CREATE INDEX CONCURRENTLY
  • Add a new enum value — existing code handles unknown values

Migrations that require coordination:

  • Drop a column — deploy code that stops reading the column first
  • Rename a column — use a two-phase approach (add new, migrate, drop old)
  • Change a column type — add new column, backfill, switch reads, drop old
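The two lists above can be approximated by a rough lint over migration statements. The sketch below is a heuristic, not a parser: `classify_migration` is a hypothetical helper, it assumes PostgreSQL-style SQL (as implied by CREATE INDEX CONCURRENTLY), and anything it does not recognize is flagged for manual review rather than assumed safe.

```shell
# Hypothetical lint helper: classify one SQL statement as
# "safe", "coordinate", or "review" using the patterns listed above.
classify_migration() {
  local sql
  # Uppercase the statement so the matching is case-insensitive.
  sql=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
  case "$sql" in
    *"DROP COLUMN"*|*"RENAME COLUMN"*|*"ALTER COLUMN"*" TYPE "*)
      echo "coordinate" ;;   # needs a phased rollout with code changes
    "CREATE TABLE"*|"CREATE INDEX CONCURRENTLY"*|*"ADD COLUMN"*"DEFAULT"*|*"ADD VALUE"*)
      echo "safe" ;;         # additive change
    *)
      echo "review" ;;       # unknown pattern: a human should look at it
  esac
}
```

For instance, `classify_migration "ALTER TABLE users DROP COLUMN legacy_id"` prints `coordinate`, and a plain `CREATE INDEX` without CONCURRENTLY prints `review`.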

Upgrade Procedure

  1. Deploy the new API version alongside the existing one:
    kubectl set image deployment/keeptrusts-api \
    api=keeptrusts/api:1.5.0 \
    -n keeptrusts
  2. The first pod to start applies migrations — subsequent pods wait
  3. Monitor migration status:
    kubectl logs -l app=keeptrusts-api -n keeptrusts | grep "migration"
  4. Verify the API health:
    curl https://api.example.com/readyz
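Because migrations run at startup, the health check in step 4 may fail for a short time before it succeeds. One way to script the wait is a bounded retry loop, sketched below. `wait_until_ready` and the `RETRY_DELAY` variable are hypothetical names introduced for this example.

```shell
# Hypothetical helper: retry a probe command until it succeeds
# or the attempt budget is used up.
# Usage: wait_until_ready <attempts> <command> [args...]
# RETRY_DELAY sets the seconds between attempts (default 5).
wait_until_ready() {
  local attempts="$1" i=1
  shift
  while [ "$i" -le "$attempts" ]; do
    # Succeed as soon as the probe command exits 0.
    "$@" >/dev/null 2>&1 && return 0
    # Pause between attempts, but not after the last one.
    [ "$i" -lt "$attempts" ] && sleep "${RETRY_DELAY:-5}"
    i=$(( i + 1 ))
  done
  return 1
}

# Example (not executed here): up to 30 probes of the API readiness endpoint.
#   wait_until_ready 30 curl -fsS https://api.example.com/readyz
```

The `-f` flag makes curl exit non-zero on HTTP error responses, so the loop only stops once the endpoint answers successfully.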

Migration Rollback

Migrations are forward-only. If a migration causes issues:

  1. Deploy the previous API version — it will work with the new schema if migrations are backward-compatible
  2. Create a corrective migration to undo the problematic change
  3. Never modify a shipped migration file

Console Blue-Green Deployment

The console is a stateless Next.js application. Blue-green deployment provides instant rollback capability.

Procedure

  1. Build the new console version:

    docker build -t keeptrusts/console:1.5.0 -f console/Dockerfile .
  2. Deploy to the green environment:

    kubectl set image deployment/keeptrusts-console-green \
    console=keeptrusts/console:1.5.0 \
    -n keeptrusts
  3. Verify the green environment:

    # Internal health check
    curl https://console-green.internal.example.com/api/health

    # Smoke test critical pages
    curl -s -o /dev/null -w "%{http_code}" \
    https://console-green.internal.example.com/dashboard
  4. Switch traffic from blue to green:

    kubectl patch service keeptrusts-console \
    -p '{"spec":{"selector":{"version":"green"}}}' \
    -n keeptrusts
  5. Rollback if needed:

    kubectl patch service keeptrusts-console \
    -p '{"spec":{"selector":{"version":"blue"}}}' \
    -n keeptrusts
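Since the switch and the rollback differ only in the selector value, a small wrapper can reduce the chance of a typo in the JSON patch. This is a sketch under the assumption that the Service selects pods by a `version` label with the values `blue` and `green`, as in the commands above; the function names are hypothetical.

```shell
# Hypothetical helper: print the Service patch for a color, rejecting anything else.
selector_patch() {
  case "$1" in
    blue|green)
      printf '{"spec":{"selector":{"version":"%s"}}}\n' "$1" ;;
    *)
      echo "unknown color: $1" >&2
      return 1 ;;
  esac
}

# Hypothetical helper: print the color to roll back to.
other_color() {
  case "$1" in
    blue)  echo green ;;
    green) echo blue ;;
    *)     return 1 ;;
  esac
}

# Example (not executed here):
#   kubectl patch service keeptrusts-console -p "$(selector_patch green)" -n keeptrusts
```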

Environment Variables

Console builds bake NEXT_PUBLIC_* variables at build time. Ensure the green build uses the correct values:

docker build \
--build-arg NEXT_PUBLIC_API_URL=https://api.example.com \
--build-arg NEXT_PUBLIC_GATEWAY_URL=https://gateway.example.com \
-t keeptrusts/console:1.5.0 \
-f console/Dockerfile .

Worker Binary Updates

Worker binaries (worker_export, worker_lifecycle, worker_config) process background jobs. Update them after the API:

  1. Stop the current worker — it will finish its current job
  2. Deploy the new version — it picks up where the old one left off
  3. Verify job processing:
    curl https://api.example.com/v1/admin/workers/status \
    -H "Authorization: Bearer $ADMIN_TOKEN"

Workers are safe to restart at any time. A job that is interrupted mid-run is retried on the next poll cycle.

Pre-Upgrade Checklist

  • Read the release notes for breaking changes
  • Check the target versions against the version compatibility matrix
  • Back up the database (or verify continuous backup)
  • Test the upgrade in a staging environment
  • Notify teams of the maintenance window (if applicable)
  • Verify monitoring dashboards are accessible

Post-Upgrade Verification

  • All health endpoints return 200
  • Gateway policy enforcement is active (test a blocked request)
  • Console login and navigation work
  • Event ingestion is flowing (check event count)
  • Export jobs are processing
  • No error spikes in monitoring dashboards

Coordinated Multi-Component Upgrade

For major version upgrades affecting all components:

1. API (with migrations)
├── Wait for all pods healthy
├── Verify /readyz returns 200
└── Check migration log

2. Gateways (rolling update)
├── Wait for rollout complete
├── Verify /readyz on all pods
└── Test policy enforcement

3. Console (blue-green switch)
├── Verify green environment
├── Switch traffic
└── Smoke test critical flows

4. Workers (restart)
└── Verify job processing resumes
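The sequence above can be driven by a small runner that stops at the first failing stage, so that gateways are never upgraded against an API that did not come up healthy. The sketch below uses placeholder stage functions; all names are hypothetical, and each placeholder body would be replaced with the commands from the corresponding section of this guide.

```shell
# Hypothetical runner: call the named stage functions in order
# and stop at the first one that fails.
run_in_order() {
  local step
  for step in "$@"; do
    echo "==> $step"
    if ! "$step"; then
      echo "FAILED at $step; later steps were not started" >&2
      return 1
    fi
  done
}

# Placeholder stages (no-ops here): fill in with the commands from this guide.
upgrade_api()      { :; }
upgrade_gateways() { :; }
switch_console()   { :; }
restart_workers()  { :; }

# Example (not executed here):
#   run_in_order upgrade_api upgrade_gateways switch_console restart_workers
```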

Next steps

For AI systems

  • Canonical terms: rolling update, blue-green deployment, version compatibility matrix, upgrade order, maxUnavailable, maxSurge, preStop hook, readiness probe, database migration
  • Upgrade order: API first (with migrations), then gateways, then console
  • Version compatibility: adjacent minor versions always compatible (e.g., gateway 1.4.x works with API 1.5.x)
  • Health endpoints: /readyz (readiness), gateway port 41002
  • Related pages: Multi-Region, Disaster Recovery, Monitoring & Alerting

For engineers

  • Always upgrade API first — migrations auto-apply at startup and are backward-compatible with previous-minor gateways
  • Use kubectl set image deployment/keeptrusts-gateway gateway=keeptrusts/gateway:<version> for rolling gateway updates
  • Add preStop: sleep 10 to ensure in-flight requests complete before pod termination
  • Monitor rollout with kubectl rollout status deployment/keeptrusts-gateway -n keeptrusts
  • For console, deploy the new version to a green environment, verify with smoke tests, then switch traffic
  • Rollback: kubectl rollout undo deployment/keeptrusts-gateway -n keeptrusts
  • Validate: send a test request that should be blocked and confirm 409 response (policy enforcement active)

For leaders

  • Zero-downtime upgrades ensure AI traffic is never interrupted during platform maintenance
  • Version compatibility matrix (N-1 support) means you don't have to upgrade all components simultaneously
  • API-first upgrade order ensures database migrations are in place before gateways or consoles expect new schemas
  • Blue-green console deployment provides instant rollback if the new version has issues
  • Canary gateway releases (route 10% traffic to new version) reduce blast radius for gateway changes