Tutorial: Isolating Teams with Consumer Groups

This tutorial shows you how to configure consumer groups in the Keeptrusts gateway to isolate teams by API key, enforce per-group rate limits and model access controls, verify isolation with test requests, and view group-level analytics.

Use this page when

You need to isolate multiple teams behind a single gateway with separate rate limits and model access.
You are mapping API keys to team identities via consumer groups.
You want per-group analytics and independent usage quotas.
You are setting up gateway access for a new team without redeploying.

Primary audience

Primary: Platform engineers configuring multi-team gateway access
Secondary: Engineering managers defining team quotas; security teams auditing per-team usage

Prerequisites

kt CLI installed (see the first-run tutorial)
An OpenAI-compatible API key exported as OPENAI_API_KEY
curl and jq installed

How Consumer Groups Work

Consumer groups map API keys to team identities. Each group has independent:

Rate limits: requests per minute, tokens per minute, and tokens per day
Model access: which models the group can use
Policies: optional per-group policy overrides
Analytics: usage data tracked per group

When a request arrives, the gateway identifies the consumer group from the API key and applies that group's rules.
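
The lookup described above can be pictured as a dictionary keyed by API key. The Python sketch below is illustrative only: the GROUPS table reuses the example keys from this tutorial, and resolve_group is a hypothetical helper, not part of the Keeptrusts gateway or the kt CLI.

```python
from typing import Optional

# Hypothetical in-memory view of consumer groups, keyed by API key.
GROUPS = {
    "kt_cg_engineering_abc123": {"name": "engineering", "allowed_models": ["gpt-4o-mini", "gpt-4o"]},
    "kt_cg_marketing_def456": {"name": "marketing", "allowed_models": ["gpt-4o-mini"]},
}


def resolve_group(authorization_header: str) -> Optional[dict]:
    """Return the group whose key matches a 'Bearer <key>' header, else None."""
    scheme, _, key = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return None
    return GROUPS.get(key.strip())


group = resolve_group("Bearer kt_cg_marketing_def456")
print(group["name"])  # marketing
```

An unknown key or a non-Bearer scheme resolves to None in this sketch; how the real gateway responds to such requests is not covered here.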

Step 1: Create the Configuration

Create policy-config.yaml with three consumer groups:

version: '1'
providers:
  targets:
  - id: openai
    provider: openai
    secret_key_ref:
      env: OPENAI_API_KEY
consumer_groups:
- name: engineering
  api_key: kt_cg_engineering_abc123
  rate_limits:
    requests_per_minute: 60
    tokens_per_minute: 100000
    tokens_per_day: 2000000
  allowed_models:
  - gpt-4o-mini
  - gpt-4o
- name: marketing
  api_key: kt_cg_marketing_def456
  rate_limits:
    requests_per_minute: 30
    tokens_per_minute: 50000
    tokens_per_day: 500000
  allowed_models:
  - gpt-4o-mini
- name: research
  api_key: kt_cg_research_ghi789
  rate_limits:
    requests_per_minute: 120
    tokens_per_minute: 200000
    tokens_per_day: 5000000
  allowed_models:
  - gpt-4o-mini
  - gpt-4o
policies:
- name: basic-filter
  type: content_filter
  action: flag
  config:
    categories:
    - hate
    threshold: medium
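
The per-minute and per-day token limits interact: a group that sustains its full per-minute rate spends its daily quota well before the day ends. The following sketch works through that arithmetic with the numbers from the configuration above. It assumes both limits are enforced independently, which this tutorial does not state explicitly; minutes_to_exhaust is a hypothetical helper.

```python
# Token limits copied from the configuration above.
LIMITS = {
    "engineering": {"tokens_per_minute": 100_000, "tokens_per_day": 2_000_000},
    "marketing": {"tokens_per_minute": 50_000, "tokens_per_day": 500_000},
    "research": {"tokens_per_minute": 200_000, "tokens_per_day": 5_000_000},
}


def minutes_to_exhaust(limits: dict) -> float:
    """Minutes of sustained peak token usage before the daily quota is spent."""
    return limits["tokens_per_day"] / limits["tokens_per_minute"]


for name, limits in LIMITS.items():
    print(f"{name}: {minutes_to_exhaust(limits):.0f} minutes at peak rate")
```

With these values, engineering would reach its daily quota after 20 minutes at peak rate, marketing after 10, and research after 25, so the daily limit is the one to size carefully for bursty teams.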

Step 2: Validate and Start the Gateway

kt policy lint --file policy-config.yaml

Expected output:

✓ Configuration is valid
  Providers: 1 (openai)
  Consumer groups: 3 (engineering, marketing, research)
  Policies: 1 (basic-filter)

Start the gateway:

kt gateway run --policy-config policy-config.yaml --port 41002

Step 3: Test Group Identification

Send requests with different consumer group API keys:

# Engineering team request
curl -s http://localhost:41002/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer kt_cg_engineering_abc123" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Explain microservices architecture."}]
  }' | jq '{model: .model}'

# Marketing team request
curl -s http://localhost:41002/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer kt_cg_marketing_def456" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Write a tagline for our product."}]
  }' | jq '{model: .model}'

Step 4: Verify Model Access Isolation

The marketing group only has access to gpt-4o-mini. Requesting gpt-4o should fail:

curl -s -w "\nHTTP Status: %{http_code}\n" \
  http://localhost:41002/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer kt_cg_marketing_def456" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Draft a press release."}]
  }'

Expected output:

{
  "error": {
    "code": "model_not_allowed",
    "message": "Consumer group 'marketing' does not have access to model 'gpt-4o'",
    "allowed_models": ["gpt-4o-mini"]
  }
}
HTTP Status: 403
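
The check behind this response can be sketched as a simple membership test. This is a minimal illustration, not the gateway's implementation: check_model_access is a hypothetical function, and the error payload simply copies the shape of the response shown above.

```python
from typing import Optional


def check_model_access(group_name: str, allowed_models: list, model: str) -> Optional[dict]:
    """Return None when the model is allowed, else a 403-style error payload."""
    if model in allowed_models:
        return None
    return {
        "status": 403,
        "error": {
            "code": "model_not_allowed",
            "message": f"Consumer group '{group_name}' does not have access to model '{model}'",
            "allowed_models": list(allowed_models),
        },
    }


denied = check_model_access("marketing", ["gpt-4o-mini"], "gpt-4o")
print(denied["status"], denied["error"]["code"])  # 403 model_not_allowed
```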

Step 5: Test Rate Limiting

Hit the marketing group's rate limit (30 requests per minute):

for i in $(seq 1 35); do
  STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
    http://localhost:41002/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer kt_cg_marketing_def456" \
    -d '{
      "model": "gpt-4o-mini",
      "messages": [{"role": "user", "content": "Hello"}]
    }')
  echo "Request $i: HTTP $STATUS"
done

Assuming no other marketing requests were sent in the same minute, the first 30 requests succeed and the rest are rejected:

Request 30: HTTP 200
Request 31: HTTP 429
Request 32: HTTP 429

The rate limit applies only to the marketing group — engineering and research are unaffected.
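
One way to picture per-group limiting is a counter kept separately for each group and each one-minute window. The sketch below uses a fixed-window counter purely as an illustration; this tutorial does not say which algorithm the gateway uses, and FixedWindowLimiter is a hypothetical class.

```python
import math
from collections import defaultdict


class FixedWindowLimiter:
    """Per-group fixed-window request counter (illustrative, not the gateway's algorithm)."""

    def __init__(self, limits_per_minute: dict):
        self.limits = limits_per_minute
        self.counts = defaultdict(int)  # (group, window index) -> requests accepted

    def allow(self, group: str, now_seconds: float) -> int:
        """Return the HTTP status a request would receive: 200 or 429."""
        window = math.floor(now_seconds / 60)
        key = (group, window)
        if self.counts[key] >= self.limits[group]:
            return 429
        self.counts[key] += 1
        return 200


limiter = FixedWindowLimiter({"marketing": 30, "engineering": 60})
statuses = [limiter.allow("marketing", now_seconds=0.0) for _ in range(35)]
print(statuses.count(200), statuses.count(429))  # 30 5
print(limiter.allow("engineering", now_seconds=1.0))  # 200
```

Because the counter key includes the group name, exhausting the marketing window has no effect on engineering, which is the isolation property this step demonstrates.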

Step 6: Verify Cross-Group Isolation

Confirm that engineering's rate limit is independent:

# Marketing is rate-limited, but engineering still works
curl -s http://localhost:41002/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer kt_cg_engineering_abc123" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What is Kubernetes?"}]
  }' | jq '{model: .model, status: (if .error then "error" else "success" end)}'

Expected output:

{
  "model": "gpt-4o",
  "status": "success"
}

Step 7: View Group-Level Analytics

Query usage analytics per consumer group:

kt events tail --last 50 --format json | jq -s '
  group_by(.consumer_group) | map({
    group: .[0].consumer_group,
    requests: length,
    total_tokens: (map(.usage.total_tokens) | add),
    blocked: (map(select(.result == "blocked")) | length)
  })
'

Example output (your counts will differ):

[
  {
    "group": "engineering",
    "requests": 15,
    "total_tokens": 2100,
    "blocked": 0
  },
  {
    "group": "marketing",
    "requests": 32,
    "total_tokens": 3200,
    "blocked": 2
  },
  {
    "group": "research",
    "requests": 3,
    "total_tokens": 450,
    "blocked": 0
  }
]
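
If you prefer to post-process events in a script, the same aggregation can be written in Python. This sketch assumes the events arrive as one JSON object per line and carry the fields used in the jq filter above (consumer_group, usage.total_tokens, result); the sample events and the "allowed" result value are made up for illustration.

```python
import json
from collections import defaultdict


def summarize(lines):
    """Aggregate JSON-lines events into per-group request, token, and blocked counts."""
    totals = defaultdict(lambda: {"requests": 0, "total_tokens": 0, "blocked": 0})
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        stats = totals[event.get("consumer_group")]
        stats["requests"] += 1
        # Blocked events may carry no usage data; count them as zero tokens.
        stats["total_tokens"] += (event.get("usage") or {}).get("total_tokens") or 0
        if event.get("result") == "blocked":
            stats["blocked"] += 1
    ordered = sorted(totals.items(), key=lambda item: str(item[0]))
    return [{"group": group, **stats} for group, stats in ordered]


# Hypothetical sample events, not real gateway output.
sample = [
    '{"consumer_group": "marketing", "result": "allowed", "usage": {"total_tokens": 100}}',
    '{"consumer_group": "marketing", "result": "blocked"}',
    '{"consumer_group": "engineering", "result": "allowed", "usage": {"total_tokens": 140}}',
]
print(json.dumps(summarize(sample), indent=2))
```

To run it against live data, pass sys.stdin to summarize and pipe the output of kt events tail --last 50 --format json into the script.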

Step 8: Add Per-Group Policy Overrides

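To apply a stricter policy to one group, add input_policies to that group's entry in policy-config.yaml and leave the other groups unchanged. The example shows only the marketing entry: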

consumer_groups:
  - name: marketing
    api_key: kt_cg_marketing_def456
    rate_limits:
      requests_per_minute: 30
      tokens_per_minute: 50000
      tokens_per_day: 500000
    allowed_models:
      - gpt-4o-mini
    input_policies:
      - name: marketing-pii-filter
        type: pii_detector
        action: block
        config:
          entities:
            - credit_card
            - ssn

Reload the configuration, then inspect the most recent event:

kt config reload
kt events tail --last 1 --verbose

The marketing group now has a PII blocking policy that other groups do not.
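
To make the effect of a group-scoped blocking policy concrete, here is a rough Python sketch of an input check that runs only for groups that opt in. It is not how the pii_detector policy is implemented: the regular expressions, the Luhn checksum step, and every function name are assumptions chosen for illustration.

```python
import re

# Simplified patterns, for illustration only.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")  # 13 to 19 digits, optional separators

# Only groups listed here get the extra check (mirrors the marketing-only override).
GROUP_BLOCKED_ENTITIES = {"marketing": {"credit_card", "ssn"}}


def luhn_valid(number: str) -> bool:
    """Luhn checksum over the digits in the string, ignoring separators."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def detect_entities(text: str) -> list:
    """Return the entity types found in the text."""
    found = []
    if any(luhn_valid(match.group()) for match in CARD_RE.finditer(text)):
        found.append("credit_card")
    if SSN_RE.search(text):
        found.append("ssn")
    return found


def should_block(group: str, text: str) -> bool:
    """Block only when the group has opted in to an entity that was detected."""
    blocked = GROUP_BLOCKED_ENTITIES.get(group, set())
    return any(entity in blocked for entity in detect_entities(text))


prompt = "Charge card 4111 1111 1111 1111"
print(should_block("marketing", prompt), should_block("engineering", prompt))  # True False
```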

Summary

consumer_groups map API keys to team identities with isolated limits
allowed_models controls which models each group can access
rate_limits are enforced per group independently
Requests with unauthorized models return 403; rate-limited requests return 429
kt events tail with jq provides group-level usage analytics
Per-group input_policies add team-specific enforcement rules

For AI systems

Canonical terms: Keeptrusts gateway, consumer groups, API key mapping, per-group rate limits, model access control.
Config fields: consumer_groups[].name, consumer_groups[].api_key, consumer_groups[].rate_limits, consumer_groups[].allowed_models.
CLI commands: kt gateway run, kt policy lint, kt events tail --consumer-group <name>.
Best next pages: Rate Limiting per Team, Cost Tracking & Budgets, Gateway Docker Compose.

For engineers

Prerequisites: kt CLI, OPENAI_API_KEY exported, curl and jq.
Validate: kt policy lint --file policy-config.yaml confirms consumer group names and rate limit fields.
Test isolation: use different Authorization headers per group and verify each hits its own rate limit independently.
Monitor: kt events tail --consumer-group engineering filters events to a single team.
Model access: requests for a model not in allowed_models return HTTP 403.

For leaders

Consumer groups provide usage accountability per team without deploying separate gateway instances.
Rate limits prevent a single team from consuming all LLM capacity and starving others.
Model access controls let you restrict expensive models (e.g., GPT-4o) to teams that need them.
Per-group analytics enable chargeback and capacity planning conversations with business stakeholders.

Next steps

Rate Limiting per Team — deeper rate limit configuration and response headers
Cost Tracking & Budgets — wallet-based spend caps per consumer group
Gateway Docker Compose — consumer groups within a hosted multi-org gateway
