IBM WatsonX
IBM WatsonX.ai provides foundation models (IBM's own Granite series plus third-party models such as Llama and Mistral) on IBM Cloud infrastructure with enterprise governance and EU-hosted data residency options. Keeptrusts sits in front of WatsonX as a gateway: it authenticates with an IBM Cloud API key and applies prompt-injection filtering, PII redaction, compliance auditing, and DLP controls before requests leave your environment. Client code uses the standard OpenAI format; Keeptrusts handles the WatsonX-specific API translation transparently.
Use this page when
- You need the exact command, config, API, or integration details for IBM WatsonX.
- You are wiring automation or AI retrieval and need canonical names, examples, and constraints.
- If you want a guided rollout instead of a reference page, use the linked workflow pages in Next steps.
Primary audience
- Primary: AI Agents, Technical Engineers
- Secondary: Technical Leaders
Prerequisites
- An IBM Cloud account with WatsonX.ai provisioned
- An IBM Cloud API key (IAM) for a service ID with Editor access to your WatsonX project
- A WatsonX project ID, visible in the WatsonX.ai console under Manage → General
- Keeptrusts CLI (kt) installed and on your PATH
- WATSONX_API_KEY and the project ID exported in your shell or injected via your secrets manager
Configuration
Minimal configuration
pack:
  name: watsonx-providers-1
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: watsonx-granite
      provider: watsonx:chat:ibm/granite-3-3-8b-instruct
      secret_key_ref:
        env: WATSONX_API_KEY
policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
Full named configuration with policy chain
pack:
  name: watsonx-enterprise
  version: 1.0.0
  enabled: true
policies:
  chain:
    - prompt-injection
    - pii-detector
    - dlp-filter
    - financial-compliance
    - audit-logger
  policy:
    prompt-injection:
      threshold: 0.8
      action: block
    pii-detector:
      action: redact
      entities:
        - EMAIL
        - PHONE
        - SSN
        - CREDIT_CARD
        - ADDRESS
    dlp-filter:
      patterns:
        - name: ibm-api-key
          regex: IAM[A-Za-z0-9_-]{32,}
          action: redact
        - name: ibm-crn
          regex: crn:v1:[a-z]+:[a-z]+:[a-z0-9-]+:[a-z0-9-]+:[a-zA-Z0-9/:-]+
          action: redact
    financial-compliance:
      regulations:
        - sox
      action: audit
    audit-logger:
      retention_days: 2555
providers:
  targets:
    - id: watsonx-granite-8b
      provider: watsonx:chat:ibm/granite-3-3-8b-instruct
      base_url: https://us-south.ml.cloud.ibm.com/ml/v1
      secret_key_ref:
        env: WATSONX_API_KEY
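To see what the two dlp-filter patterns in the configuration above would match, the following minimal sketch applies the same regular expressions with Python's re module. The redact helper and the [REDACTED:<name>] placeholder format are hypothetical and only illustrate the patterns; they are not the gateway's actual redaction output.

```python
import re

# Patterns copied from the dlp-filter policy in the configuration above.
DLP_PATTERNS = {
    "ibm-api-key": re.compile(r"IAM[A-Za-z0-9_-]{32,}"),
    "ibm-crn": re.compile(
        r"crn:v1:[a-z]+:[a-z]+:[a-z0-9-]+:[a-z0-9-]+:[a-zA-Z0-9/:-]+"
    ),
}


def redact(text: str) -> str:
    """Replace each pattern match with a placeholder (hypothetical format)."""
    for name, pattern in DLP_PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{name}]", text)
    return text


# Made-up sample values, shaped to match the patterns.
sample = (
    "Instance crn:v1:bluemix:public:pm-20:us-south:a/abc123:inst-1:: "
    "uses key IAM" + "x" * 32
)
print(redact(sample))
```

Testing patterns like this against representative strings before deploying a policy helps catch expressions that match too little or too much.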
Llama 3.3 70B on WatsonX (us-south)
pack:
  name: watsonx-providers-3
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: watsonx-llama-70b
      provider: watsonx:chat:meta-llama/llama-3-3-70b-instruct
      base_url: https://us-south.ml.cloud.ibm.com/ml/v1
      secret_key_ref:
        env: WATSONX_API_KEY
policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
EU data residency (eu-gb region)
pack:
  name: watsonx-providers-4
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: watsonx-eu-granite
      provider: watsonx:chat:ibm/granite-3-3-8b-instruct
      base_url: https://eu-gb.ml.cloud.ibm.com/ml/v1
      secret_key_ref:
        env: WATSONX_API_KEY
policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
Granite 2B for high-throughput classification
pack:
  name: watsonx-providers-5
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: watsonx-granite-2b
      provider: watsonx:chat:ibm/granite-3-3-2b-instruct
      secret_key_ref:
        env: WATSONX_API_KEY
policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
Start the gateway
export WATSONX_API_KEY="your-ibm-cloud-api-key"
export WATSONX_PROJECT_ID="your-watsonx-project-id"
kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml
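Because the gateway depends on both exported variables, a small preflight check can catch a missing value before launch. The sketch below is a hypothetical helper, not part of the kt CLI; it only inspects the two variable names used in the commands above.

```python
import os

# Variable names taken from the export commands above.
REQUIRED_VARS = ("WATSONX_API_KEY", "WATSONX_PROJECT_ID")


def missing_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Demonstration with explicit dictionaries instead of the real environment.
print(missing_vars({"WATSONX_API_KEY": "example-key"}))
print(missing_vars({"WATSONX_API_KEY": "k", "WATSONX_PROJECT_ID": "p"}))
```

Calling missing_vars() with no argument checks the real process environment, which is the form a launch script would use.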
Provider Fields
| Field | Required | Default | Description |
|---|---|---|---|
| provider | Yes | None | Provider identifier. Use "watsonx:chat:<model-id>", where the model ID follows WatsonX's <owner>/<model-name> format (e.g., ibm/granite-3-3-8b-instruct). |
| secret_key_ref | Yes | WATSONX_API_KEY | Name of the env var holding the IBM Cloud IAM API key. |
| accountIdentifierEnvar | Yes* | None | Name of the env var holding the WatsonX project ID. Use this for all non-hardcoded deployments. Alias: accountIdentifier_env. |
| accountIdentifier | Yes* | None | Literal WatsonX project ID. Prefer accountIdentifierEnvar so the ID is not hardcoded in the config. |
| base_url | No | https://us-south.ml.cloud.ibm.com/ml/v1 | WatsonX regional endpoint. Change to eu-gb.ml.cloud.ibm.com (London) or eu-de.ml.cloud.ibm.com (Frankfurt) for UK/EU data residency, or jp-tok.ml.cloud.ibm.com for APAC. |
| provider_type | No | watsonx | Forces the WatsonX runtime. Set explicitly when the provider string is ambiguous. |
| format | No | openai | Wire format for request/response translation. WatsonX exposes an OpenAI-compatible surface; keep as openai. |
| options.max_tokens | No | Model default | Maximum tokens in the completion. |
| options.temperature | No | 0.3 | Sampling temperature (0–2). |
| options.top_p | No | 1.0 | Nucleus sampling top-p. |
| options.stop | No | None | Array of stop sequences. |
*One of accountIdentifier or accountIdentifierEnvar is required.
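The provider value packs three pieces of information into one string. As an illustration of the "watsonx:chat:<owner>/<model-name>" layout described in the table, here is a minimal parsing sketch. The parse_provider function and the ProviderRef type are hypothetical names introduced for this example; how the gateway parses the string internally is not documented here.

```python
from typing import NamedTuple


class ProviderRef(NamedTuple):
    runtime: str      # e.g. "watsonx"
    mode: str         # e.g. "chat"
    owner: str        # e.g. "ibm"
    model_name: str   # e.g. "granite-3-3-8b-instruct"


def parse_provider(value: str) -> ProviderRef:
    """Split '<runtime>:<mode>:<owner>/<model-name>' into its parts."""
    parts = value.split(":", 2)
    if len(parts) != 3:
        raise ValueError(f"expected '<runtime>:<mode>:<model-id>', got {value!r}")
    runtime, mode, model_id = parts
    owner, sep, model_name = model_id.partition("/")
    if not (sep and owner and model_name):
        raise ValueError(f"model ID must look like '<owner>/<model-name>': {model_id!r}")
    return ProviderRef(runtime, mode, owner, model_name)


print(parse_provider("watsonx:chat:ibm/granite-3-3-8b-instruct"))
```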
Supported Models
| Model ID | Context Window | Notes |
|---|---|---|
| ibm/granite-3-3-8b-instruct | 128k | Best general-purpose IBM Granite model; strong instruction following and enterprise task performance |
| ibm/granite-3-3-2b-instruct | 128k | Smallest and fastest Granite; suited for classification, extraction, and high-throughput pipelines |
| meta-llama/llama-3-3-70b-instruct | 128k | Highest-quality model on WatsonX; recommended for complex reasoning and long-document tasks |
| mistralai/mistral-large | 128k | Mistral's flagship commercial model hosted on WatsonX; strong coding and multilingual performance |
| codellama/codellama-34b-instruct | 16k | Optimised for code generation and review; lower context than Granite alternatives |
| ibm/granite-20b-multilingual | 8k | 20+ language support; use for multilingual classification or extraction where Granite 3 is over-budget |
Unlike other providers, WatsonX requires a project ID on every API call in addition to the API key. The project ID identifies which WatsonX workspace the call is billed to and which data governance policies apply. You must supply it via accountIdentifier (literal) or accountIdentifierEnvar (env var reference) in every provider target.
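Since every target needs a project ID, a quick lint over the parsed provider targets can flag the ones that set neither field. This is a sketch under the assumption that the targets have already been loaded into plain dictionaries keyed by the field names from the Provider Fields table; the function name is hypothetical.

```python
# Field names taken from the Provider Fields table.
PROJECT_ID_KEYS = ("accountIdentifier", "accountIdentifierEnvar")


def targets_missing_project_id(targets):
    """Return the ids of targets that set neither project ID field."""
    return [
        target.get("id", "<unnamed>")
        for target in targets
        if not any(target.get(key) for key in PROJECT_ID_KEYS)
    ]


# Example targets as they might look after loading a policy config.
targets = [
    {"id": "watsonx-granite", "accountIdentifierEnvar": "WATSONX_PROJECT_ID"},
    {"id": "watsonx-llama-70b"},
]
print(targets_missing_project_id(targets))
```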
Client Examples
- Python
- Node.js
- cURL
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:41002/v1",  # port from the gateway's --listen flag
    api_key="unused",  # IBM Cloud auth is handled by the gateway
)

# Chat completion with Granite 3.3 8B
response = client.chat.completions.create(
    model="ibm/granite-3-3-8b-instruct",
    messages=[
        {
            "role": "system",
            "content": "You are a precise enterprise assistant. Follow company policies.",
        },
        {
            "role": "user",
            "content": "Draft a brief executive summary of Q4 financial performance.",
        },
    ],
    max_tokens=1024,
    temperature=0.3,
)
print(response.choices[0].message.content)

# High-throughput classification with Granite 2B
items = ["Great support, thank you!", "The invoice was late again."]  # sample inputs
for item in items:
    result = client.chat.completions.create(
        model="ibm/granite-3-3-2b-instruct",
        messages=[
            {"role": "user", "content": f"Classify as POSITIVE, NEUTRAL, or NEGATIVE:\n{item}"},
        ],
        max_tokens=10,
        temperature=0.0,
    )
    print(result.choices[0].message.content.strip())
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:41002/v1", // port from the gateway's --listen flag
  apiKey: "unused", // IBM Cloud auth is handled by the gateway
});

// Chat completion
const response = await client.chat.completions.create({
  model: "ibm/granite-3-3-8b-instruct",
  messages: [
    {
      role: "system",
      content: "You are a precise enterprise assistant. Follow company policies.",
    },
    {
      role: "user",
      content: "Draft a brief executive summary of Q4 financial performance.",
    },
  ],
  max_tokens: 1024,
  temperature: 0.3,
});
console.log(response.choices[0].message.content);
# Chat completion
curl http://localhost:41002/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "ibm/granite-3-3-8b-instruct",
"messages": [
{
"role": "system",
"content": "You are a precise enterprise assistant."
},
{
"role": "user",
"content": "Draft a brief executive summary of Q4 financial performance."
}
],
"max_tokens": 1024,
"temperature": 0.3
}'
Streaming
WatsonX.ai supports server-sent event (SSE) streaming. Keeptrusts forwards stream chunks after applying per-token policy checks. Set stream: true in your request:
- Python
- Node.js
- cURL
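# Reuses the client created in the Client Examples section
stream = client.chat.completions.create(
    model="meta-llama/llama-3-3-70b-instruct",
    messages=[{"role": "user", "content": "Explain the IBM Cloud Shared Responsibility Model for AI services."}],
    max_tokens=2048,
    stream=True,
)
for chunk in stream:
    # Some chunks carry no choices or no text (for example role or finish markers)
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)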
const stream = await client.chat.completions.stream({
  model: "meta-llama/llama-3-3-70b-instruct",
  messages: [
    {
      role: "user",
      content: "Explain the IBM Cloud Shared Responsibility Model for AI services.",
    },
  ],
  max_tokens: 2048,
});
for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta?.content ?? "";
  process.stdout.write(delta);
}
curl http://localhost:41002/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "meta-llama/llama-3-3-70b-instruct",
"messages": [
{"role": "user", "content": "Explain the IBM Cloud Shared Responsibility Model for AI services."}
],
"max_tokens": 2048,
"stream": true
}'
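When consuming the stream without an SDK (as in the cURL example), the client has to decode the SSE lines itself. The sketch below assumes the gateway emits OpenAI-style chunks, that is "data: <json>" lines terminated by "data: [DONE]"; that assumption follows from the page's statement that clients use the standard OpenAI format, and the iter_content helper is a hypothetical name.

```python
import json


def iter_content(lines):
    """Yield text deltas from OpenAI-style SSE lines until [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample lines in the assumed wire format.
sample_lines = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"delta": {"content": " world"}}]}',
    "data: [DONE]",
]
print("".join(iter_content(sample_lines)))
```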
Advanced Configuration
EU data residency
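For GDPR-sensitive or EU AI Act-regulated workloads, route to a European WatsonX region. The configuration below uses eu-gb (London); because London is in the United Kingdom, use eu-de (Frankfurt) when data must remain inside the EU. Data processed and stored in the selected region stays in that region's data centres: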
pack:
  name: watsonx-providers-6
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: watsonx-eu-compliant
      provider: watsonx:chat:ibm/granite-3-3-8b-instruct
      base_url: https://eu-gb.ml.cloud.ibm.com/ml/v1
      secret_key_ref:
        env: WATSONX_API_KEY
policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
Available regional base URLs:
| Region | Base URL |
|---|---|
| US South (Dallas) | https://us-south.ml.cloud.ibm.com/ml/v1 |
| United Kingdom (London) | https://eu-gb.ml.cloud.ibm.com/ml/v1 |
| EU (Frankfurt) | https://eu-de.ml.cloud.ibm.com/ml/v1 |
| Asia Pacific (Tokyo) | https://jp-tok.ml.cloud.ibm.com/ml/v1 |
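For automation that generates provider targets per region, the table above can be captured as a lookup. This is a minimal sketch; the dictionary and the base_url_for helper are hypothetical names, and the URLs are copied from the table.

```python
# Region codes and base URLs copied from the table above.
WATSONX_BASE_URLS = {
    "us-south": "https://us-south.ml.cloud.ibm.com/ml/v1",
    "eu-gb": "https://eu-gb.ml.cloud.ibm.com/ml/v1",
    "eu-de": "https://eu-de.ml.cloud.ibm.com/ml/v1",
    "jp-tok": "https://jp-tok.ml.cloud.ibm.com/ml/v1",
}


def base_url_for(region: str) -> str:
    """Return the base URL for a region code, failing loudly on typos."""
    try:
        return WATSONX_BASE_URLS[region]
    except KeyError:
        known = ", ".join(sorted(WATSONX_BASE_URLS))
        raise ValueError(f"unknown region {region!r}; expected one of: {known}") from None


print(base_url_for("eu-de"))
```

Raising on an unknown region, rather than falling back to a default, avoids silently routing traffic to the wrong geography.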
Financial compliance audit chain
WatsonX is commonly used in financial services. Pair it with the financial-compliance and extended-retention audit-logger policies to meet SOX and MiFID II record-keeping requirements:
policies:
  chain:
    - prompt-injection
    - pii-detector
    - financial-compliance
    - audit-logger
  policy:
    financial-compliance:
      regulations:
        - sox
        - mifid2
      action: audit
      alert_on_violation: true
    audit-logger:
      retention_days: 2555 # 7 years: SOX Section 802 requirement
      include_request_metadata: true
      include_policy_decisions: true
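The retention_days value of 2555 is simply 7 × 365. The short sketch below shows that arithmetic and how a day-based retention window translates into a calendar date; the helper name and the example date are illustrative only.

```python
from datetime import date, timedelta

RETENTION_YEARS = 7
RETENTION_DAYS = RETENTION_YEARS * 365  # the value used for retention_days


def retention_end(logged_on: date, retention_days: int = RETENTION_DAYS) -> date:
    """Return the last day a record logged on `logged_on` is retained."""
    return logged_on + timedelta(days=retention_days)


print(RETENTION_DAYS)
print(retention_end(date(2025, 1, 1)))
```

Note that a day count ignores leap days, so the computed end date can fall a day or two before the seventh calendar anniversary; whether that matters is a question for your compliance team.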
Multi-model routing for cost management
Use the fast, cheap Granite 2B model for simple classification tasks and route complex reasoning to Granite 8B or Llama 70B:
pack:
  name: watsonx-providers-8
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: watsonx-fast
      provider: watsonx:chat:ibm/granite-3-3-2b-instruct
      secret_key_ref:
        env: WATSONX_API_KEY
    - id: watsonx-quality
      provider: watsonx:chat:ibm/granite-3-3-8b-instruct
      secret_key_ref:
        env: WATSONX_API_KEY
policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
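With both targets configured, the calling application still has to decide which model to request. One simple approach, sketched here as an assumption rather than a built-in gateway feature, is to choose the model on the client side by task type. The task labels and the pick_model helper are hypothetical.

```python
# Model IDs from the two targets configured above.
FAST_MODEL = "ibm/granite-3-3-2b-instruct"
QUALITY_MODEL = "ibm/granite-3-3-8b-instruct"

# Hypothetical task labels treated as simple enough for the small model.
SIMPLE_TASKS = {"classification", "extraction", "routing"}


def pick_model(task_type: str) -> str:
    """Send simple tasks to the fast model and everything else to the larger one."""
    return FAST_MODEL if task_type.lower() in SIMPLE_TASKS else QUALITY_MODEL


for task in ("classification", "summarisation"):
    print(task, "->", pick_model(task))
```

The returned value can be passed as the model argument of a chat completion request sent through the gateway.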
Best Practices
- Always use accountIdentifierEnvar over accountIdentifier. Never hardcode the WatsonX project ID in a committed policy config. Use accountIdentifierEnvar: WATSONX_PROJECT_ID and inject the value through your secrets manager or CI/CD environment. Exposed project IDs combined with a leaked API key give an attacker full access to your WatsonX project and its billing.
- Use separate projects per environment. Create distinct WatsonX projects for development, staging, and production. This isolates billing, governance policies, and data access. Set a separate env var (e.g., WATSONX_PROJECT_ID_PROD) for each environment and reference it explicitly in the provider config.
- Choose a European region for GDPR data. Process prompts containing personal data about EU residents in the eu-de or eu-gb region; eu-gb is London, so prefer eu-de when data must stay inside the EU. Use base_url: "https://eu-gb.ml.cloud.ibm.com/ml/v1" (or the eu-de equivalent) and a dedicated EU project ID. Do not route EU personal data through the US South endpoint.
- Apply DLP filtering for IBM Cloud identifiers. Prompts and completions can contain IBM CRNs, API keys, and project UUIDs. Use the dlp-filter policy to redact crn:v1:... patterns and IAM key patterns before they appear in model output or audit logs.
- Use Granite 2B for high throughput and Granite 8B+ for reasoning. IBM Granite 3.3 2B is extremely fast for classification, extraction, and routing decisions. Reserve Granite 8B and Llama 70B for tasks requiring reasoning, summarisation, or long-context understanding. This can reduce per-request cost by 4–8× on appropriate workloads.
- Set 7-year audit retention for financial workloads. A 7-year retention period (2555 days) covers the SOX Section 802 and MiFID II record-keeping rules for financial records, including AI-assisted decisions. Configure audit-logger.retention_days: 2555 and enable include_policy_decisions: true so that every Keeptrusts allow/block decision is available for regulatory review.
For AI systems
- Canonical terms: Keeptrusts gateway, IBM WatsonX, watsonx.ai, IBM Cloud, IAM token, enterprise AI, provider target, policy-config.yaml, provider: "watsonx".
- Config field names: provider, model, base_url, secret_key_ref.env, watsonx_project_id, format, provider_type: "watsonx", data_policy, audit-logger.retention_days.
- Auth: IBM Cloud IAM API key → bearer token exchange.
- Key behavior: Keeptrusts translates between OpenAI format and WatsonX's native generation API, handling IAM token refresh.
- Best next pages: AWS Bedrock integration, Azure OpenAI integration, Policy configuration.
For engineers
- Prerequisites: IBM Cloud account with watsonx.ai provisioned, IAM API key, project ID, kt CLI installed.
- Required config: watsonx_project_id, IBM Cloud IAM API key in secret_key_ref.env.
- Start command: kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml.
- Validate: curl http://localhost:41002/v1/chat/completions -H 'Content-Type: application/json' -d '{"model":"ibm/granite-3-3-8b-instruct","messages":[{"role":"user","content":"hello"}]}'.
- For SOX/MiFID II compliance, set audit-logger.retention_days: 2555 (7 years) with include_policy_decisions: true.
- IBM IAM tokens expire; Keeptrusts handles automatic token refresh via the IAM API key.
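To illustrate what token refresh involves, here is a minimal cache sketch that renews a bearer token shortly before it expires. It is not Keeptrusts' implementation, which this page does not describe. The TokenCache class, the fetch callable, and the 60-second safety margin are all assumptions; in a real client, fetch would exchange the API key at IBM Cloud's IAM token endpoint and return the access token with its lifetime in seconds.

```python
import time


class TokenCache:
    """Cache a bearer token and refresh it shortly before expiry."""

    def __init__(self, fetch, skew_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch          # callable returning (token, expires_in_seconds)
        self._skew = skew_seconds    # refresh this long before the real expiry
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._skew:
            token, expires_in = self._fetch()
            self._token = token
            self._expires_at = now + expires_in
        return self._token


# Demonstration with a fake fetcher and a fake clock (no network calls).
calls = []


def fake_fetch():
    calls.append(1)
    return f"token-{len(calls)}", 3600.0


fake_now = [0.0]
cache = TokenCache(fake_fetch, clock=lambda: fake_now[0])
first = cache.get()     # first call fetches a token
fake_now[0] = 1000.0
second = cache.get()    # still valid, so the cached token is reused
fake_now[0] = 3550.0    # inside the 60-second safety margin
third = cache.get()     # triggers a refresh
print(first, second, third)
```

Injecting the clock and the fetch function keeps the refresh logic testable without real credentials.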
For leaders
- IBM WatsonX targets enterprise and regulated industries — built-in compliance features align with financial services and healthcare requirements.
- 7-year audit retention (retention_days: 2555) satisfies SOX Section 802 and MiFID II record-keeping requirements.
- IBM's data handling agreements and regional deployment options address EU data residency and GDPR requirements.
- Keeptrusts policy enforcement layers on top of WatsonX's built-in AI governance for defense-in-depth compliance.
Next steps
- AWS Bedrock integration — alternative enterprise cloud AI with data residency
- Azure OpenAI integration — Microsoft cloud alternative for enterprise AI
- Cloudera integration — on-premise enterprise inference
- Policy configuration — audit-logger retention and compliance policy reference
- Quickstart — install kt and run your first gateway