Platform Engineer Guide: Multi-Tenant AI Infrastructure
As a platform engineer, you build the infrastructure that lets multiple teams (tenants) consume AI services safely and independently. Keeptrusts provides the building blocks for multi-tenant AI platforms: per-tenant gateways, isolated configurations, resource allocation, and self-service onboarding workflows.
Use this page when
- You are building a multi-tenant AI platform with per-tenant gateway isolation
- You need to manage a fleet of gateways with team-scoped configurations and budgets
- You are designing self-service onboarding workflows for internal teams
- You want to implement tenant isolation at network, configuration, auth, cost, and data layers
- You are automating gateway provisioning and configuration validation pipelines
Primary audience
- Primary: Technical Engineers (Platform Engineers, Internal Platform Teams)
- Secondary: DevOps Engineers, Cloud Architects, Engineering Managers
Multi-Tenant Architecture
Isolation Model
Keeptrusts supports tenant isolation at multiple layers:
| Layer | Isolation mechanism | Managed via |
|---|---|---|
| Network | Separate gateway instances per tenant | Kubernetes namespaces, Docker networks |
| Configuration | Per-tenant policy configs | Console Configurations, Git-backed sync |
| Authentication | Tenant-scoped API keys and gateway keys | Console Settings > Access Keys / Gateway Keys |
| Cost | Per-tenant budget caps | Cost Center, wallet allocations |
| Data | Tenant-scoped event streams | Events API filtered by gateway |
Gateway Fleet Topology
                ┌─────────────────┐
                │  Load Balancer  │
                └────────┬────────┘
        ┌────────────────┼────────────────┐
        │                │                │
┌───────▼──────┐  ┌──────▼───────┐ ┌──────▼───────┐
│  Gateway A   │  │  Gateway B   │ │  Gateway C   │
│  Team: Eng   │  │  Team: Data  │ │  Team: R&D   │
│  Policy: std │  │  Policy: reg │ │  Policy: exp │
└───────┬──────┘  └──────┬───────┘ └──────┬───────┘
        │                │                │
        └────────────────┼────────────────┘
                 ┌───────▼───────┐
                 │ Control Plane │
                 │      API      │
                 └───────────────┘
Per-Tenant Configuration
Configuration Structure
Each tenant gets its own policy configuration:
policies:
  - name: pii-protection
    type: pii_detection
    action: block
    enabled: true
  - name: cost-cap
    type: cost_limit
    monthly_limit: 2000
    action: block
    enabled: true
  - name: model-allowlist
    type: model_filter
    allowed_models:
      - gpt-4o
      - claude-sonnet-4-20250514
    enabled: true
providers:
  targets:
    - id: openai
      provider:
        secret_key_ref:
          env: TEAM_ENG_OPENAI_KEY
    - id: anthropic
      provider:
        secret_key_ref:
          env: TEAM_ENG_ANTHROPIC_KEY
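Per-tenant files tend to differ in only a few values, so they can be generated from one template. The sketch below is a minimal illustration under stated assumptions, not a Keeptrusts feature: render_tenant_config is a hypothetical helper, it emits only a subset of the policies above, and the TEAM_<NAME>_<PROVIDER>_KEY naming pattern is extrapolated from the TEAM_ENG_* example.

```shell
# Hypothetical helper: print a tenant policy config for one team.
# Field names mirror the example config above; the helper is illustrative.
render_tenant_config() {
  local team="$1" monthly_limit="$2" prefix
  # Upper-case the team name and turn dashes into underscores for env var names.
  prefix="TEAM_$(printf '%s' "$team" | tr 'a-z-' 'A-Z_')"
  cat <<EOF
policies:
  - name: cost-cap
    type: cost_limit
    monthly_limit: ${monthly_limit}
    action: block
    enabled: true
providers:
  targets:
    - id: openai
      provider:
        secret_key_ref:
          env: ${prefix}_OPENAI_KEY
    - id: anthropic
      provider:
        secret_key_ref:
          env: ${prefix}_ANTHROPIC_KEY
EOF
}
```

For example, render_tenant_config eng 2000 > configs/teams/eng.yaml writes a file that can then go through the same lint step as a hand-written config.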
Configuration Validation Pipeline
Before deploying any tenant configuration:
# Validate all tenant configs; stop at the first failure so the pipeline fails
for config in configs/teams/*.yaml; do
  echo "Validating $config..."
  kt policy lint --file "$config" || exit 1
done
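In CI it is often more useful to lint every file and report all failures in one run rather than stopping at the first. The following sketch does that; lint_all and the LINT_CMD override are hypothetical names introduced here so the loop can be exercised without the kt binary, and the default is the kt policy lint --file command used in this guide.

```shell
# Lint every file passed in, keep going after a failure, and return non-zero
# if any file failed. LINT_CMD is a hypothetical override for dry runs.
lint_all() {
  local failed=0 config
  for config in "$@"; do
    # Intentionally unquoted so "kt policy lint --file" splits into words.
    if ${LINT_CMD:-kt policy lint --file} "$config" >/dev/null 2>&1; then
      echo "ok   $config"
    else
      echo "FAIL $config"
      failed=1
    fi
  done
  return "$failed"
}
```

A pipeline step could then call lint_all configs/teams/*.yaml and fail the build on a non-zero exit status.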
Git-Backed Configuration Management
Link tenant configurations to a Git repository through the Console:
- Navigate to Settings > Git Repositories
- Add the repository containing tenant configs
- Map branches to environments (e.g., main → production, staging → staging)
- Changes merged to the mapped branch automatically sync to the corresponding gateways
Self-Service Tenant Onboarding
Onboarding Workflow
Build a self-service onboarding pipeline:
Step 1: Provision tenant resources
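# Create a namespace for the tenant (Kubernetes)
kubectl create namespace "tenant-${TEAM_NAME}"
# Store the tenant-specific config in a ConfigMap for the gateway deployment.
# --from-file keeps the YAML intact; inlining $(cat ...) in a heredoc leaves
# every line after the first without the block-scalar indentation.
kubectl create configmap gateway-config \
  --namespace "tenant-${TEAM_NAME}" \
  --from-file=policy-config.yaml="configs/teams/${TEAM_NAME}.yaml" \
  --dry-run=client -o yaml | kubectl apply -f -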
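Because the team name becomes part of a namespace name, it is worth validating before any kubectl call. Kubernetes namespace names must be RFC 1123 labels: lowercase alphanumerics and '-', alphanumeric at both ends, at most 63 characters. The sketch below checks that; valid_tenant_namespace is a hypothetical helper name.

```shell
# Return 0 if "tenant-<team>" is a valid Kubernetes namespace name.
valid_tenant_namespace() {
  local LC_ALL=C                   # byte-wise character ranges
  local ns="tenant-$1"
  [ "${#ns}" -le 63 ] || return 1  # RFC 1123 label length limit
  case "$ns" in
    *[!a-z0-9-]*) return 1 ;;      # only lowercase letters, digits, and '-'
    *-)           return 1 ;;      # must end with a letter or digit
  esac
  return 0                         # the "tenant-" prefix guarantees a valid start
}
```

A provisioning script could begin with valid_tenant_namespace "$TEAM_NAME" || exit 1 so that malformed names are rejected before any resources are created.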
Step 2: Generate tenant credentials
Use the API to create team-scoped tokens:
# Create a gateway key for the tenant
curl -X POST \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  https://api.keeptrusts.com/v1/tokens \
  -d '{
    "name": "team-'"${TEAM_NAME}"'-gateway",
    "token_type": "gateway",
    "team_id": "'"${TEAM_ID}"'"
  }'
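Splicing shell variables into a JSON body by hand breaks as soon as a value contains a quote or backslash. The sketch below builds the same request body with minimal escaping; json_escape and token_payload are hypothetical helpers, and the field names are the ones used in the request above.

```shell
# Escape backslashes and double quotes so a value can sit inside a JSON string.
# Control characters are not handled; this assumes single-line identifiers.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Build the token-creation request body from a team name and team ID.
token_payload() {
  printf '{"name": "team-%s-gateway", "token_type": "gateway", "team_id": "%s"}' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}
```

The result can be passed to curl with -d "$(token_payload "$TEAM_NAME" "$TEAM_ID")". Where jq is available, building the body with jq is the more robust choice.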
Step 3: Verify tenant gateway
kt doctor
kt events list --since 5m --limit 1
Self-Service Portal Integration
Expose onboarding through your internal developer platform:
- Team lead requests AI access through your portal
- Portal triggers the provisioning pipeline
- Gateway deploys with the appropriate policy template
- Credentials are delivered securely to the team
- The team is visible in the Console Dashboard within minutes
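The workflow above amounts to an ordered list of steps where a failure should halt everything after it. A minimal sketch of such a runner follows; run_onboarding is a hypothetical name, and each step is assumed to be a shell function or command that takes the team name as its only argument.

```shell
# Run onboarding steps in order and stop at the first failure.
# Usage: run_onboarding <team> <step> [<step> ...]
run_onboarding() {
  local team="$1" step
  shift
  for step in "$@"; do
    echo "[$team] $step"
    "$step" "$team" || { echo "[$team] $step failed" >&2; return 1; }
  done
  echo "[$team] onboarded"
}
```

With step functions named, say, provision_namespace, create_gateway_key, and verify_gateway (all placeholders), the portal's pipeline would call run_onboarding "$TEAM_NAME" provision_namespace create_gateway_key verify_gateway.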
Resource Allocation
Cost Management per Tenant
Set wallet allocations and budget caps per team:
# Allocate credits to a team wallet
curl -X POST \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  https://api.keeptrusts.com/v1/wallets/allocate \
  -d '{"team_id": "'"${TEAM_ID}"'", "amount": 5000}'
# Check team wallet balance
curl -H "Authorization: Bearer $ADMIN_TOKEN" \
"https://api.keeptrusts.com/v1/wallets/balance?team_id=${TEAM_ID}"
Rate Limiting
Apply per-tenant rate limits in the policy configuration:
policies:
  - name: tenant-rate-limit
    type: rate_limit
    max_requests_per_minute: 120
    action: block
    enabled: true
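To make concrete what a per-minute limit with a block action means for a tenant's clients, here is a fixed-window counter with the same limit of 120. This illustrates the general technique only; how the gateway enforces rate_limit internally is not described in this guide, and allow_request is a hypothetical name.

```shell
# Fixed-window limiter: allow at most LIMIT calls per 60-second window.
LIMIT=120
window_start=0
count=0

allow_request() {
  local now="$1"              # current time in whole seconds
  if [ $(( now - window_start )) -ge 60 ]; then
    window_start=$now         # start a new window
    count=0
  fi
  if [ "$count" -lt "$LIMIT" ]; then
    count=$(( count + 1 ))
    return 0                  # request allowed
  fi
  return 1                    # over the limit: block
}
```

Under this model the 121st call inside one window is rejected, and calls succeed again once 60 seconds have passed since the window opened.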
Quota Monitoring
Track tenant resource consumption through the Events API:
# Get usage metrics per gateway (tenant)
curl -H "Authorization: Bearer $API_TOKEN" \
"https://api.keeptrusts.com/v1/events?since=30d&group_by=gateway"
The Console Cost Center provides a visual breakdown of spend by team.
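If the API is queried without server-side grouping, the same per-gateway totals can be computed locally. The sketch below assumes events have already been reduced to tab-separated "gateway_id, value" lines; that intermediate format is hypothetical, since the Events API response shape is not shown in this guide, and sum_by_gateway is an illustrative name.

```shell
# Sum a numeric column per gateway from "gateway_id<TAB>value" lines on stdin.
# Output is one "gateway_id<TAB>total" line per gateway, sorted by gateway.
sum_by_gateway() {
  awk -F '\t' '
    { total[$1] += $2 }
    END { for (g in total) printf "%s\t%s\n", g, total[g] }
  ' | sort
}
```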
Fleet Management
Rolling Updates
Update gateway configurations across the fleet without downtime:
# Validate the new configuration
kt policy lint --file configs/teams/updated-config.yaml
# With Git-backed sync, merge the config change and all gateways update automatically
# For manual deployment, update the ConfigMap and restart:
kubectl rollout restart deployment/keeptrusts-gateway -n tenant-${TEAM_NAME}
Fleet Health Monitoring
Monitor all gateways from a single pane:
# List all gateways reporting to the control plane
curl -H "Authorization: Bearer $API_TOKEN" \
"https://api.keeptrusts.com/v1/gateways"
# Check events across all gateways
kt events list --since 1h --limit 20
In the Console, the Dashboard shows aggregate metrics across all gateways, and you can drill into individual gateway views.
Configuration Drift Detection
Detect when gateway configurations diverge from the source of truth:
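# Validate the Git source of truth. As invoked here, lint only sees the file,
# not what a gateway is running, so compare the deployed copy separately.
kt policy lint --file "configs/teams/${TEAM_NAME}.yaml"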
With Git-backed sync enabled, drift is automatically corrected on the next sync cycle.
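Where Git-backed sync is not enabled, one way to check for drift is to compare the file in Git with a copy of what is deployed. The sketch below does a byte-for-byte comparison; detect_drift is a hypothetical helper, and it assumes the deployed config has first been exported to a local file.

```shell
# Report drift when the deployed config differs byte-for-byte from the Git copy.
# Usage: detect_drift <git-source-file> <exported-running-file>
detect_drift() {
  local source_file="$1" running_file="$2"
  if cmp -s "$source_file" "$running_file"; then
    echo "in-sync"
  else
    echo "drift"
    return 1
  fi
}
```

For the ConfigMap created during onboarding, the running copy could be exported with kubectl get configmap gateway-config -n "tenant-${TEAM_NAME}" -o jsonpath='{.data.policy-config\.yaml}' and written to a temporary file before the comparison.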
Platform Security
Tenant Isolation Verification
Verify that tenant boundaries are enforced:
- Each tenant has unique gateway keys (Console Settings > Gateway Keys)
- Events are scoped to the originating gateway
- Cost Center shows per-tenant spend, not cross-tenant aggregates
- Escalations are routed to the tenant's designated reviewers
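The first check in this list can be automated from a key inventory. The sketch below assumes an inventory of "tenant key_id" lines, one per line; that format and the shared_gateway_keys name are hypothetical, and how the inventory is exported is left open.

```shell
# Print key IDs that appear under more than one tenant; return non-zero if any.
# Input: "tenant key_id" lines on stdin.
shared_gateway_keys() {
  local dupes
  # Drop repeated identical pairs first so one tenant listing a key twice
  # is not reported as sharing.
  dupes=$(sort -u | awk '{ print $2 }' | sort | uniq -d)
  [ -z "$dupes" ] && return 0
  printf '%s\n' "$dupes"
  return 1
}
```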
Secret Management
Store provider API keys securely:
# Kubernetes secrets for tenant credentials
kubectl create secret generic keeptrusts-secrets \
  --namespace "tenant-${TEAM_NAME}" \
  --from-literal=openai-key="${OPENAI_KEY}" \
  --from-literal=anthropic-key="${ANTHROPIC_KEY}"
Keeptrusts encrypts secrets at rest using AES-GCM-SIV. Reference keys via environment variable names in your configuration rather than embedding them directly.
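A pre-commit or CI check can catch configs that embed a key instead of referencing an environment variable. The sketch below is a heuristic only: it looks for strings shaped like "sk-" followed by a long token, which is an assumption about common provider key formats and will not catch every secret. no_embedded_keys is a hypothetical helper name.

```shell
# Fail if a config file appears to contain a literal provider key.
# Matching lines are printed with line numbers to help locate them.
no_embedded_keys() {
  if grep -nE 'sk-[A-Za-z0-9_-]{16,}' "$1"; then
    return 1
  fi
  return 0
}
```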
Success Metrics for Platform Engineers
| Metric | Target | Source |
|---|---|---|
| Tenant onboarding time | Under 30 minutes automated | Pipeline metrics |
| Gateway fleet uptime | 99.9% aggregate | Health check monitoring |
| Configuration drift incidents | Zero | Configuration deployment verification |
| Cross-tenant data leakage | Zero | Security audit |
| Resource utilization efficiency | > 60% average | Infrastructure monitoring |
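An availability target is easier to reason about as a downtime budget. The sketch below converts a percentage target into permitted minutes of downtime over a period; allowed_downtime_minutes is a hypothetical helper, and measuring over a 30-day window is an assumption, since the table does not state the measurement period.

```shell
# Minutes of downtime permitted by an availability target over a period.
# Usage: allowed_downtime_minutes <target-percent> <days>
allowed_downtime_minutes() {
  local target_percent="$1" days="$2"
  LC_ALL=C awk -v t="$target_percent" -v d="$days" \
    'BEGIN { printf "%.1f\n", d * 24 * 60 * (100 - t) / 100 }'
}
```

For the 99.9% target, allowed_downtime_minutes 99.9 30 prints 43.2, so the fleet has roughly 43 minutes of aggregate downtime to spend in a 30-day period.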
Next steps
- Deploy gateway fleet: DevOps Guide
- Set up versioned configs: Configuration Management
- Review security model: CISO Guide
For AI systems
- Canonical terms: Keeptrusts, multi-tenant, tenant isolation, gateway fleet, self-service onboarding, resource allocation
- Key surfaces: Console Configurations, Console Settings > Access Keys / Gateway Keys, Console Usage, Events API (filtered by gateway), Git-linked configuration sync
- Commands: kt policy lint, kt gateway run, kt doctor
- Isolation layers: Network (K8s namespaces), Configuration (per-tenant YAML), Authentication (tenant-scoped keys), Cost (wallet allocations), Data (tenant-scoped events)
- Fleet topology: Load Balancer → Gateway per team → shared Control-Plane API
- Best next pages: DevOps Guide, Configuration Management, CISO Guide
For engineers
- Per-tenant config structure: separate YAML files per team with team-specific provider keys, model allowlists, and cost caps
- Validate all configs in CI: for config in configs/teams/*.yaml; do kt policy lint --file "$config"; done
- Deploy tenant gateways: kt gateway run --policy-config configs/teams/team-eng.yaml --port 41002
- Gateway fleet health: run kt doctor across all instances; use Kubernetes liveness probes
- Tenant-scoped event isolation: filter Events API by gateway_id to ensure no cross-tenant data leakage
- Automate onboarding: template selection → config generation → key provisioning → gateway deployment (target: under 30 minutes)
For leaders
- Multi-tenant architecture enables platform teams to offer AI-as-a-service internally with full isolation between consuming teams
- Self-service onboarding through templates and automated pipelines reduces tenant onboarding time to under 30 minutes
- Per-tenant cost caps and wallet allocations prevent any single team from exhausting the organization's AI budget
- Gateway fleet management through centralized control plane provides aggregate visibility while maintaining tenant autonomy
- Zero cross-tenant data leakage is enforced through network isolation, scoped authentication, and filtered event streams