Platform Engineer Guide: Multi-Tenant AI Infrastructure

As a platform engineer, you build the infrastructure that lets multiple teams (tenants) consume AI services safely and independently. Keeptrusts provides the building blocks for multi-tenant AI platforms: per-tenant gateways, isolated configurations, resource allocation, and self-service onboarding workflows.

Use this page when

You are building a multi-tenant AI platform with per-tenant gateway isolation
You need to manage a fleet of gateways with team-scoped configurations and budgets
You are designing self-service onboarding workflows for internal teams
You want to implement tenant isolation at network, configuration, auth, cost, and data layers
You are automating gateway provisioning and configuration validation pipelines

Primary audience

Primary: Technical Engineers (Platform Engineers, Internal Platform Teams)
Secondary: DevOps Engineers, Cloud Architects, Engineering Managers

Multi-Tenant Architecture

Isolation Model

Keeptrusts supports tenant isolation at multiple layers:

Layer	Isolation mechanism	Managed via
Network	Separate gateway instances per tenant	Kubernetes namespaces, Docker networks
Configuration	Per-tenant policy configs	Console Configurations, Git-backed sync
Authentication	Tenant-scoped API keys and gateway keys	Console Settings > Access Keys / Gateway Keys
Cost	Per-tenant budget caps	Cost Center, wallet allocations
Data	Tenant-scoped event streams	Events API filtered by gateway

Gateway Fleet Topology

                    ┌─────────────────┐
                    │  Load Balancer   │
                    └────────┬────────┘
            ┌────────────────┼────────────────┐
            │                │                │
    ┌───────▼──────┐ ┌──────▼───────┐ ┌──────▼───────┐
    │  Gateway A   │ │  Gateway B   │ │  Gateway C   │
    │  Team: Eng   │ │  Team: Data  │ │  Team: R&D   │
    │  Policy: std │ │  Policy: reg │ │  Policy: exp │
    └──────┬───────┘ └──────┬───────┘ └──────┬───────┘
           │                │                │
           └────────────────┼────────────────┘
                    ┌───────▼───────┐
                    │ Control Plane │
                    │     API       │
                    └───────────────┘

Per-Tenant Configuration

Configuration Structure

Each tenant gets its own policy configuration:

policies:
- name: pii-protection
  type: pii_detection
  action: block
  enabled: true
- name: cost-cap
  type: cost_limit
  monthly_limit: 2000
  action: block
  enabled: true
- name: model-allowlist
  type: model_filter
  allowed_models:
  - gpt-4o
  - claude-sonnet-4-20250514
  enabled: true
providers:
  targets:
  - id: openai
    provider: 
    secret_key_ref:
      env: TEAM_ENG_OPENAI_KEY
  - id: anthropic
    provider: 
    secret_key_ref:
      env: TEAM_ENG_ANTHROPIC_KEY

Configuration Validation Pipeline

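Before deploying any tenant configuration, lint every tenant file and fail the pipeline on the first error:

# Validate all tenant configs
for config in configs/teams/*.yaml; do
  echo "Validating $config..."
  # Without the explicit exit, only the last file's result would decide
  # the loop's exit status and earlier failures would be masked in CI.
  kt policy lint --file "$config" || exit 1
done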
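
When a pipeline should report every broken tenant file rather than stop at the first one, the loop can collect failures instead. The following is a minimal sketch, not a Keeptrusts-provided script: `lint_tenant_configs` and the `LINT_CMD` override are hypothetical names introduced here, and the default command is the `kt policy lint --file` invocation shown above.

```shell
#!/bin/sh
# LINT_CMD is a hypothetical override hook (useful for testing the loop);
# it defaults to the lint command documented in this guide.
LINT_CMD="${LINT_CMD:-kt policy lint --file}"

# Lint every *.yaml file in a directory and report all failures at the end.
# The parenthesised body runs in a subshell, so variables stay local.
lint_tenant_configs() (
  dir="$1"
  failed=0
  for config in "$dir"/*.yaml; do
    # An unmatched glob stays literal; skip it instead of linting a non-file.
    [ -e "$config" ] || continue
    echo "Validating $config..."
    # LINT_CMD is intentionally unquoted so it splits into command and flags.
    if ! $LINT_CMD "$config"; then
      echo "FAILED: $config" >&2
      failed=$((failed + 1))
    fi
  done
  echo "$failed config(s) failed validation"
  [ "$failed" -eq 0 ]
)
```

Calling `lint_tenant_configs configs/teams` returns a non-zero status if any file fails, so it can be used directly as a CI step.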

Git-Backed Configuration Management

Link tenant configurations to a Git repository through the Console:

Navigate to Settings > Git Repositories
Add the repository containing tenant configs
Map branches to environments (e.g., main → production, staging → staging)
Changes merged to the mapped branch automatically sync to the corresponding gateways

Self-Service Tenant Onboarding

Onboarding Workflow

Build a self-service onboarding pipeline:

Step 1: Provision tenant resources

# Create a namespace for the tenant (Kubernetes)
kubectl create namespace "tenant-${TEAM_NAME}"

# Store the tenant-specific config in a ConfigMap for the gateway.
# --from-file keeps the YAML indentation of the source file intact.
kubectl create configmap gateway-config \
  --namespace "tenant-${TEAM_NAME}" \
  --from-file=policy-config.yaml="configs/teams/${TEAM_NAME}.yaml" \
  --dry-run=client -o yaml | kubectl apply -f -
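
Because the team name becomes part of a namespace name, a self-service pipeline benefits from rejecting bad input before calling `kubectl`. Kubernetes namespace names must be RFC 1123 labels: lowercase letters, digits, and hyphens, at most 63 characters. The sketch below checks a team name against that rule for the `tenant-` prefix used in this guide; `valid_team_name` is a hypothetical helper, not part of any Keeptrusts tooling.

```shell
#!/bin/sh
# Succeed only if "tenant-<name>" would be a valid Kubernetes namespace name.
valid_team_name() (
  LC_ALL=C            # make the a-z range match ASCII lowercase only
  name="$1"
  # The "tenant-" prefix uses 7 of the 63 characters a label may have.
  [ "${#name}" -ge 1 ] && [ "${#name}" -le 56 ] || exit 1
  case "$name" in
    *[!a-z0-9-]*) exit 1 ;;  # only lowercase letters, digits, and hyphens
    -*) exit 1 ;;            # a leading hyphen could be read as an option
    *-) exit 1 ;;            # a label must end with a letter or digit
  esac
  exit 0
)
```

A pipeline might guard Step 1 with `valid_team_name "$TEAM_NAME" || exit 1`.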

Step 2: Generate tenant credentials

Use the API to create team-scoped tokens:

# Create a gateway key for the tenant
curl -X POST \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  https://api.keeptrusts.com/v1/tokens \
  -d '{
    "name": "team-'"${TEAM_NAME}"'-gateway",
    "token_type": "gateway",
    "team_id": "'"${TEAM_ID}"'"
  }'
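
Splicing shell variables into a JSON string is fragile when a value contains quotes or spaces. One option is to build the request body in a small function that rejects such values. This is a sketch under stated assumptions: `gateway_token_payload` and `create_gateway_token` are hypothetical helper names, the endpoint and fields are the ones shown above, and restricting values to letters, digits, `_`, and `-` is a simplification chosen here so that no JSON escaping is needed.

```shell
#!/bin/sh
# Print the JSON body for a tenant gateway key, or fail on unsafe input.
gateway_token_payload() (
  LC_ALL=C
  team_name="$1"
  team_id="$2"
  for value in "$team_name" "$team_id"; do
    case "$value" in
      ''|*[!A-Za-z0-9_-]*) echo "unsafe value: $value" >&2; exit 1 ;;
    esac
  done
  printf '{"name": "team-%s-gateway", "token_type": "gateway", "team_id": "%s"}\n' \
    "$team_name" "$team_id"
)

# Send the request; --fail makes curl return non-zero on HTTP errors.
create_gateway_token() (
  payload=$(gateway_token_payload "$1" "$2") || exit 1
  curl --fail --silent --show-error -X POST \
    -H "Authorization: Bearer $ADMIN_TOKEN" \
    -H "Content-Type: application/json" \
    https://api.keeptrusts.com/v1/tokens \
    -d "$payload"
)
```

With this in place, `create_gateway_token "$TEAM_NAME" "$TEAM_ID"` replaces the inline `curl` call, and the pipeline stops if either value is malformed or the API returns an error status.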

Step 3: Verify tenant gateway

kt doctor
kt events list --since 5m --limit 1

Self-Service Portal Integration

Expose onboarding through your internal developer platform:

Team lead requests AI access through your portal
Portal triggers the provisioning pipeline
Gateway deploys with the appropriate policy template
Credentials are delivered securely to the team
The team is visible in the Console Dashboard within minutes
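
The provisioning pipeline that the portal triggers can be sketched as a single script with a dry-run mode, so that a request can be previewed before anything is created. The function names `run` and `onboard_tenant` and the `DRY_RUN` variable are assumptions made for this example; the commands themselves are the ones used in the onboarding steps of this guide, and credential creation is left out for brevity.

```shell
#!/bin/sh
# Print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Lint, provision, and verify one tenant; stop at the first failing step.
onboard_tenant() (
  team="$1"
  config="configs/teams/${team}.yaml"
  run kt policy lint --file "$config" || exit 1
  run kubectl create namespace "tenant-${team}" || exit 1
  run kubectl create configmap gateway-config \
    --namespace "tenant-${team}" \
    --from-file=policy-config.yaml="$config" || exit 1
  run kt doctor
)
```

Running `DRY_RUN=1 onboard_tenant eng` prints the four commands prefixed with `+` without touching the cluster, which is convenient for portal-side approval previews.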

Resource Allocation

Cost Management per Tenant

Set wallet allocations and budget caps per team:

# Allocate credits to a team wallet
curl -X POST \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  https://api.keeptrusts.com/v1/wallets/allocate \
  -d '{"team_id": "'"${TEAM_ID}"'", "amount": 5000}'

# Check team wallet balance
curl -H "Authorization: Bearer $ADMIN_TOKEN" \
  "https://api.keeptrusts.com/v1/wallets/balance?team_id=${TEAM_ID}"
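
Once the allocation and the remaining balance are known, a scheduled job can flag teams that are close to exhausting their wallet. The arithmetic is sketched below with two hypothetical helpers, `budget_used_pct` and `over_budget_threshold`. It assumes whole-number credit amounts and that the caller has already extracted the remaining balance from the API response, whose exact shape is not shown in this guide.

```shell
#!/bin/sh
# Print the percentage of the allocation already spent (integer, rounded down).
budget_used_pct() (
  allocated="$1"
  balance="$2"
  [ "$allocated" -gt 0 ] || { echo "allocation must be positive" >&2; exit 1; }
  echo $(( (allocated - balance) * 100 / allocated ))
)

# Succeed when spend has reached the threshold percentage (default 80).
over_budget_threshold() (
  pct=$(budget_used_pct "$1" "$2") || exit 1
  [ "$pct" -ge "${3:-80}" ]
)
```

For example, with an allocation of 5000 and a remaining balance of 1250, `budget_used_pct 5000 1250` prints `75`, so `over_budget_threshold 5000 1250 80` does not trigger.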

Rate Limiting

Apply per-tenant rate limits in the policy configuration:

policies:
  - name: tenant-rate-limit
    type: rate_limit
    max_requests_per_minute: 120
    action: block
    enabled: true
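
If tenants differ mainly in their limits, the limit policies can be generated from two numbers rather than hand-edited per team. The sketch below prints a policy fragment that uses only the field names shown in this guide (`cost_limit` with `monthly_limit`, `rate_limit` with `max_requests_per_minute`). `render_tenant_limits` is a hypothetical helper, and the output is assumed to be merged into the tenant file and linted before use.

```shell
#!/bin/sh
# Print a policies fragment with a monthly cost cap and a per-minute rate limit.
render_tenant_limits() (
  monthly_limit="$1"
  rpm="$2"
  # Accept whole numbers only, so the generated YAML stays well-formed.
  for n in "$monthly_limit" "$rpm"; do
    case "$n" in
      ''|*[!0-9]*) echo "not a whole number: $n" >&2; exit 1 ;;
    esac
  done
  cat <<EOF
policies:
  - name: cost-cap
    type: cost_limit
    monthly_limit: ${monthly_limit}
    action: block
    enabled: true
  - name: tenant-rate-limit
    type: rate_limit
    max_requests_per_minute: ${rpm}
    action: block
    enabled: true
EOF
)
```

`render_tenant_limits 2000 120` reproduces the cost cap and rate limit values used in the examples on this page.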

Quota Monitoring

Track tenant resource consumption through the Events API:

# Get usage metrics per gateway (tenant)
curl -H "Authorization: Bearer $API_TOKEN" \
  "https://api.keeptrusts.com/v1/events?since=30d&group_by=gateway"

The Console Cost Center provides a visual breakdown of spend by team.

Fleet Management

Rolling Updates

Update gateway configurations across the fleet without downtime:

# Validate the new configuration
kt policy lint --file configs/teams/updated-config.yaml

# With Git-backed sync, merge the config change and all gateways update automatically
# For manual deployment, update the ConfigMap and restart:
kubectl rollout restart deployment/keeptrusts-gateway -n "tenant-${TEAM_NAME}"
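
`kubectl rollout restart` returns as soon as the restart is requested, so a pipeline that needs to know whether the update succeeded should also wait for the rollout to finish. A minimal sketch, assuming the deployment name and namespace convention used above; `restart_tenant_gateway` is a hypothetical wrapper name.

```shell
#!/bin/sh
# Restart one tenant's gateway and wait until the new pods are ready.
restart_tenant_gateway() (
  ns="tenant-$1"
  timeout="${2:-120s}"
  kubectl rollout restart deployment/keeptrusts-gateway -n "$ns" || exit 1
  # Blocks until the rollout completes; fails if the timeout is exceeded.
  kubectl rollout status deployment/keeptrusts-gateway -n "$ns" --timeout="$timeout"
)
```

Calling `restart_tenant_gateway eng 60s` exits non-zero if the gateway for team `eng` has not become ready within 60 seconds, which lets the pipeline halt before moving on to the next tenant.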

Fleet Health Monitoring

Monitor all gateways from a single pane:

# List all gateways reporting to the control plane
curl -H "Authorization: Bearer $API_TOKEN" \
  "https://api.keeptrusts.com/v1/gateways"

# Check events across all gateways
kt events list --since 1h --limit 20

In the Console, the Dashboard shows aggregate metrics across all gateways, and you can drill into individual gateway views.

Configuration Drift Detection

Detect when gateway configurations diverge from the source of truth:

# Lint the Git source of truth before comparing it with what is deployed
kt policy lint --file "configs/teams/${TEAM_NAME}.yaml"

With Git-backed sync enabled, drift is automatically corrected on the next sync cycle.
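
For manually deployed gateways, one way to check for drift is to diff the file in Git against the copy stored in the tenant's ConfigMap. This sketch assumes the `gateway-config` ConfigMap and `policy-config.yaml` key from the onboarding step; `deployed_config` and `check_drift` are hypothetical helper names. It compares what is stored in the cluster, which may differ from what a running gateway has actually loaded.

```shell
#!/bin/sh
# Print the policy file currently stored in the tenant's ConfigMap.
# The backslash escapes the dot inside the key name for jsonpath.
deployed_config() {
  kubectl get configmap gateway-config -n "tenant-$1" \
    -o 'jsonpath={.data.policy-config\.yaml}'
}

# Succeed when the stored config matches the Git copy; print a diff otherwise.
check_drift() (
  team="$1"
  source_file="${2:-configs/teams/${team}.yaml}"
  tmp=$(mktemp) || exit 1
  trap 'rm -f "$tmp"' EXIT
  deployed_config "$team" > "$tmp" || exit 1
  diff -u "$source_file" "$tmp"
)
```

`check_drift eng` exits with status 0 when nothing has drifted and prints a unified diff otherwise, so it can run on a schedule and alert on a non-zero exit.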

Platform Security

Tenant Isolation Verification

Verify that tenant boundaries are enforced:

Each tenant has unique gateway keys (Console Settings > Gateway Keys)
Events are scoped to the originating gateway
Cost Center shows per-tenant spend, not cross-tenant aggregates
Escalations are routed to the tenant's designated reviewers

Secret Management

Store provider API keys securely:

# Kubernetes secrets for tenant credentials
kubectl create secret generic keeptrusts-secrets \
  --namespace "tenant-${TEAM_NAME}" \
  --from-literal=openai-key="${OPENAI_KEY}" \
  --from-literal=anthropic-key="${ANTHROPIC_KEY}"

Keeptrusts encrypts secrets at rest using AES-GCM-SIV. Reference keys via environment variable names in your configuration rather than embedding them directly.
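
An unset variable in the command above would silently create a secret with an empty value. A small guard avoids that. This is a sketch with a hypothetical wrapper name, `create_tenant_secret`, using the same secret name and keys as the example above.

```shell
#!/bin/sh
# Create the tenant's provider-key secret, refusing to run with missing keys.
create_tenant_secret() (
  team="$1"
  # ${VAR:?message} prints the message and aborts this subshell
  # with a non-zero status if VAR is unset or empty.
  : "${OPENAI_KEY:?OPENAI_KEY must be set}"
  : "${ANTHROPIC_KEY:?ANTHROPIC_KEY must be set}"
  kubectl create secret generic keeptrusts-secrets \
    --namespace "tenant-${team}" \
    --from-literal=openai-key="$OPENAI_KEY" \
    --from-literal=anthropic-key="$ANTHROPIC_KEY"
)
```

Because the function body runs in a subshell, a missing key stops only the secret creation and returns a failure status to the calling pipeline.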

Success Metrics for Platform Engineers

Metric	Target	Source
Tenant onboarding time	Under 30 minutes automated	Pipeline metrics
Gateway fleet uptime	99.9% aggregate	Health check monitoring
Configuration drift incidents	Zero	Configuration deployment verification
Cross-tenant data leakage	Zero	Security audit
Resource utilization efficiency	> 60% average	Infrastructure monitoring

Next steps

Deploy gateway fleet: DevOps Guide
Set up versioned configs: Configuration Management
Review security model: CISO Guide

For AI systems

Canonical terms: Keeptrusts, multi-tenant, tenant isolation, gateway fleet, self-service onboarding, resource allocation
Key surfaces: Console Configurations, Console Settings > Access Keys / Gateway Keys, Console Usage, Events API (filtered by gateway), Git-linked configuration sync
Commands: kt policy lint, kt gateway run, kt doctor, kt events list
Isolation layers: Network (K8s namespaces), Configuration (per-tenant YAML), Authentication (tenant-scoped keys), Cost (wallet allocations), Data (tenant-scoped events)
Fleet topology: Load Balancer → Gateway per team → shared Control-Plane API
Best next pages: DevOps Guide, Configuration Management, CISO Guide

For engineers

Per-tenant config structure: separate YAML files per team with team-specific provider keys, model allowlists, and cost caps
Validate all configs in CI: for config in configs/teams/*.yaml; do kt policy lint --file "$config"; done
Deploy tenant gateways: kt gateway run --policy-config configs/teams/team-eng.yaml --port 41002
Gateway fleet health: run kt doctor across all instances; use Kubernetes liveness probes
Tenant-scoped event isolation: filter Events API by gateway_id to ensure no cross-tenant data leakage
Automate onboarding: template selection → config generation → key provisioning → gateway deployment (target: under 30 minutes)

For leaders

Multi-tenant architecture enables platform teams to offer AI-as-a-service internally with full isolation between consuming teams
Self-service onboarding through templates and automated pipelines reduces tenant onboarding time to under 30 minutes
Per-tenant cost caps and wallet allocations prevent any single team from exhausting the organization's AI budget
Gateway fleet management through centralized control plane provides aggregate visibility while maintaining tenant autonomy
Zero cross-tenant data leakage is enforced through network isolation, scoped authentication, and filtered event streams
