IBM WatsonX

IBM WatsonX.ai provides foundation models (IBM's own Granite series and third-party models such as Llama and Mistral) on IBM Cloud infrastructure with enterprise governance and EU-hosted data residency options. The Keeptrusts gateway sits in front of WatsonX, authenticates with an IBM Cloud API key, and applies prompt-injection filtering, PII redaction, compliance auditing, and DLP controls before requests leave your environment. Client code uses the standard OpenAI format; Keeptrusts handles the WatsonX-specific API translation transparently.

Use this page when

You need the exact command, config, API, or integration details for IBM WatsonX.
You are wiring automation or AI retrieval and need canonical names, examples, and constraints.
If you want a guided rollout instead of a reference page, use the linked workflow pages in Next steps.

Primary audience

Primary: AI Agents, Technical Engineers
Secondary: Technical Leaders

Prerequisites

An IBM Cloud account with WatsonX.ai provisioned
An IBM Cloud API key (IAM) for a service ID with Editor access to your WatsonX project
A WatsonX project ID visible in the WatsonX.ai console under Manage → General
Keeptrusts CLI (kt) installed and on your PATH
WATSONX_API_KEY and the project ID exported in your shell or injected via your secrets manager
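
The last prerequisite can be checked before the gateway starts. The sketch below is a minimal preflight check, assuming the two variable names used on this page (WATSONX_API_KEY and WATSONX_PROJECT_ID); the helper name missing_vars is hypothetical and not part of the kt CLI.

```python
import os

# Variable names as used on this page; adjust if your secrets manager
# injects the credentials under different names.
REQUIRED_VARS = ("WATSONX_API_KEY", "WATSONX_PROJECT_ID")


def missing_vars(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]
```

Calling missing_vars() with no arguments inspects the current process environment; an empty list means both values are present.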

Configuration

Minimal configuration

pack:
  name: watsonx-providers-1
  version: 1.0.0
  enabled: true
providers:
  targets:
  - id: watsonx-granite
    provider: watsonx:chat:ibm/granite-3-3-8b-instruct
    secret_key_ref:
      env: WATSONX_API_KEY
    accountIdentifierEnvar: WATSONX_PROJECT_ID
policies:
  chain:
  - audit-logger
policy:
  audit-logger:
    immutable: true
    retention_days: 365
    log_all_access: true

Full named configuration with policy chain

pack:
  name: watsonx-enterprise
  version: 1.0.0
  enabled: true
policies:
  chain:
  - prompt-injection
  - pii-detector
  - dlp-filter
  - financial-compliance
  - audit-logger
policy:
  prompt-injection:
    threshold: 0.8
    action: block
  pii-detector:
    action: redact
    entities:
    - EMAIL
    - PHONE
    - SSN
    - CREDIT_CARD
    - ADDRESS
  dlp-filter:
    patterns:
    - name: ibm-api-key
      regex: IAM[A-Za-z0-9_-]{32,}
      action: redact
    - name: ibm-crn
      regex: crn:v1:[a-z]+:[a-z]+:[a-z0-9-]+:[a-z0-9-]+:[a-zA-Z0-9/:-]+
      action: redact
  financial-compliance:
    regulations:
    - sox
    action: audit
  audit-logger:
    retention_days: 2555
providers:
  targets:
  - id: watsonx-granite-8b
    provider: watsonx:chat:ibm/granite-3-3-8b-instruct
    base_url: https://us-south.ml.cloud.ibm.com/ml/v1
    secret_key_ref:
      env: WATSONX_API_KEY
    accountIdentifierEnvar: WATSONX_PROJECT_ID
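
To see what the two dlp-filter patterns in this configuration match, they can be tried outside the gateway. The sketch below copies both regular expressions verbatim and applies them with Python's re module. The [REDACTED:<name>] placeholder and the sample strings are made up for illustration; the gateway's actual redaction marker may differ.

```python
import re

# Patterns copied verbatim from the dlp-filter block above.
DLP_PATTERNS = {
    "ibm-api-key": re.compile(r"IAM[A-Za-z0-9_-]{32,}"),
    "ibm-crn": re.compile(
        r"crn:v1:[a-z]+:[a-z]+:[a-z0-9-]+:[a-z0-9-]+:[a-zA-Z0-9/:-]+"
    ),
}


def redact(text):
    """Replace every pattern match with a placeholder (hypothetical format)."""
    for name, pattern in DLP_PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{name}]", text)
    return text


# Fabricated sample values, not real credentials or resource names.
sample = (
    "key=IAM" + "x" * 32 + " "
    "instance=crn:v1:bluemix:public:pm-20:us-south:a/abc123:1234-abcd::"
)
print(redact(sample))
```

Note that the ibm-api-key pattern only fires on strings that start with IAM followed by at least 32 permitted characters, so shorter strings such as IAMshort pass through unchanged.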

Llama 3.3 70B on WatsonX (us-south)

pack:
  name: watsonx-providers-3
  version: 1.0.0
  enabled: true
providers:
  targets:
  - id: watsonx-llama-70b
    provider: watsonx:chat:meta-llama/llama-3-3-70b-instruct
    base_url: https://us-south.ml.cloud.ibm.com/ml/v1
    secret_key_ref:
      env: WATSONX_API_KEY
    accountIdentifierEnvar: WATSONX_PROJECT_ID
policies:
  chain:
  - audit-logger
policy:
  audit-logger:
    immutable: true
    retention_days: 365
    log_all_access: true

EU data residency (eu-gb region)

pack:
  name: watsonx-providers-4
  version: 1.0.0
  enabled: true
providers:
  targets:
  - id: watsonx-eu-granite
    provider: watsonx:chat:ibm/granite-3-3-8b-instruct
    base_url: https://eu-gb.ml.cloud.ibm.com/ml/v1
    secret_key_ref:
      env: WATSONX_API_KEY
    accountIdentifierEnvar: WATSONX_PROJECT_ID
policies:
  chain:
  - audit-logger
policy:
  audit-logger:
    immutable: true
    retention_days: 365
    log_all_access: true

Granite 2B for high-throughput classification

pack:
  name: watsonx-providers-5
  version: 1.0.0
  enabled: true
providers:
  targets:
  - id: watsonx-granite-2b
    provider: watsonx:chat:ibm/granite-3-3-2b-instruct
    secret_key_ref:
      env: WATSONX_API_KEY
    accountIdentifierEnvar: WATSONX_PROJECT_ID
policies:
  chain:
  - audit-logger
policy:
  audit-logger:
    immutable: true
    retention_days: 365
    log_all_access: true

Start the gateway

export WATSONX_API_KEY="your-ibm-cloud-api-key"
export WATSONX_PROJECT_ID="your-watsonx-project-id"
kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml
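
Once the gateway is running, a single short request confirms that it answers. The sketch below builds an OpenAI-style chat request with the standard library only. The /v1/chat/completions path is taken from the cURL example on this page, and the function names are hypothetical. The gateway address is a parameter because the listen address above uses port 41002 while the client examples use localhost:8080; pass whichever address your deployment exposes.

```python
import json
import urllib.request


def build_chat_request(gateway_url, model, prompt):
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 16,
    }
    return urllib.request.Request(
        gateway_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def smoke_test(gateway_url, model="ibm/granite-3-3-8b-instruct"):
    """Send one short prompt through the gateway and return the reply text."""
    request = build_chat_request(gateway_url, model, "Reply with the word ok.")
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return payload["choices"][0]["message"]["content"]
```

For example, smoke_test("http://localhost:41002") would exercise the listen address from the command above.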

Provider Fields

Field	Required	Default	Description
`provider`	Yes	—	Provider identifier. Use `"watsonx:chat:<model-id>"` where the model ID follows WatsonX's `<owner>/<model-name>` format (e.g., `ibm/granite-3-3-8b-instruct`).
`secret_key_ref`	Yes	`WATSONX_API_KEY`	Name of the env var holding the IBM Cloud IAM API key.
`accountIdentifierEnvar`	Yes*	—	Name of the env var holding the WatsonX project ID. Use this for all non-hardcoded deployments. Alias: `accountIdentifier_env`.
`accountIdentifier`	Yes*	—	Literal WatsonX project ID. Prefer `accountIdentifierEnvar` so that the ID is not hardcoded in the policy config.
`base_url`	No	`https://us-south.ml.cloud.ibm.com/ml/v1`	WatsonX regional endpoint. Change to `eu-gb.ml.cloud.ibm.com` for EU data residency or `jp-tok.ml.cloud.ibm.com` for APAC.
`provider_type`	No	`watsonx`	Forces the WatsonX runtime. Set explicitly when the provider string is ambiguous.
`format`	No	`openai`	Wire format for request/response translation. WatsonX exposes an OpenAI-compatible surface; keep as `openai`.
`options.max_tokens`	No	Model default	Maximum tokens in the completion.
`options.temperature`	No	`0.3`	Sampling temperature (0–2).
`options.top_p`	No	`1.0`	Nucleus sampling top-p.
`options.stop`	No	—	Array of stop sequences.

*One of accountIdentifier or accountIdentifierEnvar is required.
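
The provider string format described in the table can be checked mechanically. The sketch below splits a value such as watsonx:chat:ibm/granite-3-3-8b-instruct into its three parts and rejects model IDs that lack the owner/model-name shape. The names ProviderRef, parse_provider, runtime, and mode are hypothetical labels chosen for this example, not Keeptrusts terminology.

```python
from typing import NamedTuple


class ProviderRef(NamedTuple):
    runtime: str   # e.g. "watsonx"
    mode: str      # e.g. "chat"
    model_id: str  # e.g. "ibm/granite-3-3-8b-instruct"


def parse_provider(value):
    """Split '<runtime>:<mode>:<owner>/<model-name>' into its three parts."""
    parts = value.split(":", 2)
    if len(parts) != 3:
        raise ValueError(f"expected '<runtime>:<mode>:<model-id>', got {value!r}")
    runtime, mode, model_id = parts
    owner, sep, name = model_id.partition("/")
    if not (owner and sep and name):
        raise ValueError(
            f"model ID must look like '<owner>/<model-name>', got {model_id!r}"
        )
    return ProviderRef(runtime, mode, model_id)
```

A check like this can run in CI against every provider value in a policy config before the file is deployed.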

Supported Models

Model ID	Context Window	Notes
`ibm/granite-3-3-8b-instruct`	128k	Best general-purpose IBM Granite model; strong instruction following and enterprise task performance
`ibm/granite-3-3-2b-instruct`	128k	Smallest and fastest Granite; suited for classification, extraction, and high-throughput pipelines
`meta-llama/llama-3-3-70b-instruct`	128k	Highest-quality model on WatsonX; recommended for complex reasoning and long-document tasks
`mistralai/mistral-large`	128k	Mistral's flagship commercial model hosted on WatsonX; strong coding and multilingual performance
`codellama/codellama-34b-instruct`	16k	Optimised for code generation and review; lower context than Granite alternatives
`ibm/granite-20b-multilingual`	8k	20+ language support; use for multilingual classification or extraction where Granite 3 is over-budget
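
The context windows listed above can serve as a rough client-side guard before a request is sent. The sketch below assumes that "k" means 1,000 tokens (the conservative reading) and that the prompt and the requested completion share one window. Both are assumptions, and the helper name fits_context is hypothetical.

```python
# Context windows from the table above, assuming "k" means 1,000 tokens.
CONTEXT_WINDOW = {
    "ibm/granite-3-3-8b-instruct": 128_000,
    "ibm/granite-3-3-2b-instruct": 128_000,
    "meta-llama/llama-3-3-70b-instruct": 128_000,
    "mistralai/mistral-large": 128_000,
    "codellama/codellama-34b-instruct": 16_000,
    "ibm/granite-20b-multilingual": 8_000,
}


def fits_context(model, prompt_tokens, max_tokens):
    """True if the prompt plus the requested completion fits the model's window."""
    return prompt_tokens + max_tokens <= CONTEXT_WINDOW[model]
```

Token counts depend on the model's tokenizer, so treat the result as an estimate rather than a guarantee.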

Project ID required

Unlike other providers, WatsonX requires a project ID on every API call in addition to the API key. The project ID identifies which WatsonX workspace the call is billed to and which data governance policies apply. You must supply it via accountIdentifier (literal) or accountIdentifierEnvar (env var reference) in every provider target.
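
The either/or rule for the project ID can be expressed as a small lookup. The sketch below treats a provider target as a plain dictionary and resolves the project ID from accountIdentifierEnvar (or its alias accountIdentifier_env) first, then from the literal accountIdentifier. The precedence order and the function name are assumptions made for this example; the gateway's own resolution logic is not documented here.

```python
import os


def resolve_project_id(target, env=None):
    """Return the project ID for one provider target, or raise ValueError."""
    env = os.environ if env is None else env
    # Assumed precedence: the env var reference wins over the literal value.
    env_name = target.get("accountIdentifierEnvar") or target.get(
        "accountIdentifier_env"
    )
    if env_name:
        value = env.get(env_name, "").strip()
        if not value:
            raise ValueError(f"environment variable {env_name} is not set")
        return value
    literal = str(target.get("accountIdentifier", "")).strip()
    if not literal:
        raise ValueError(
            "set accountIdentifier or accountIdentifierEnvar on the target"
        )
    return literal
```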

Client Examples

The examples below are shown for Python, Node.js, and cURL, in that order.

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="unused",  # IBM Cloud auth is handled by the gateway
)

# Chat completion with Granite 3.3 8B
response = client.chat.completions.create(
    model="ibm/granite-3-3-8b-instruct",
    messages=[
        {
            "role": "system",
            "content": "You are a precise enterprise assistant. Follow company policies.",
        },
        {
            "role": "user",
            "content": "Draft a brief executive summary of Q4 financial performance.",
        },
    ],
    max_tokens=1024,
    temperature=0.3,
)
print(response.choices[0].message.content)

# High-throughput classification with Granite 2B
items = ["Great support experience.", "The invoice was wrong again."]  # sample inputs
for item in items:
    result = client.chat.completions.create(
        model="ibm/granite-3-3-2b-instruct",
        messages=[
            {"role": "user", "content": f"Classify as POSITIVE, NEUTRAL, or NEGATIVE:\n{item}"},
        ],
        max_tokens=10,
        temperature=0.0,
    )
    print(result.choices[0].message.content.strip())

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:8080/v1",
  apiKey: "unused",
});

// Chat completion
const response = await client.chat.completions.create({
  model: "ibm/granite-3-3-8b-instruct",
  messages: [
    {
      role: "system",
      content: "You are a precise enterprise assistant. Follow company policies.",
    },
    {
      role: "user",
      content: "Draft a brief executive summary of Q4 financial performance.",
    },
  ],
  max_tokens: 1024,
  temperature: 0.3,
});
console.log(response.choices[0].message.content);

# Chat completion
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ibm/granite-3-3-8b-instruct",
    "messages": [
      {
        "role": "system",
        "content": "You are a precise enterprise assistant."
      },
      {
        "role": "user",
        "content": "Draft a brief executive summary of Q4 financial performance."
      }
    ],
    "max_tokens": 1024,
    "temperature": 0.3
  }'

Streaming

WatsonX.ai supports server-sent event (SSE) streaming. Keeptrusts forwards stream chunks after applying per-token policy checks. Set stream: true in your request:

The examples below are shown for Python, Node.js, and cURL, in that order.

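# Reuses the client created in the Client Examples section.
stream = client.chat.completions.create(
    model="meta-llama/llama-3-3-70b-instruct",
    messages=[{"role": "user", "content": "Explain the IBM Cloud Shared Responsibility Model for AI services."}],
    max_tokens=2048,
    stream=True,
)
for chunk in stream:
    # Some chunks carry no choices or an empty delta; skip them.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)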

const stream = await client.chat.completions.create({
  model: "meta-llama/llama-3-3-70b-instruct",
  messages: [
    {
      role: "user",
      content: "Explain the IBM Cloud Shared Responsibility Model for AI services.",
    },
  ],
  max_tokens: 2048,
  stream: true,
});

for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta?.content ?? "";
  process.stdout.write(delta);
}

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/llama-3-3-70b-instruct",
    "messages": [
      {"role": "user", "content": "Explain the IBM Cloud Shared Responsibility Model for AI services."}
    ],
    "max_tokens": 2048,
    "stream": true
  }'

Advanced Configuration

EU data residency

For GDPR-sensitive or EU AI Act-regulated workloads, route to a European WatsonX region. The example below uses eu-gb (London), where data is processed and stored in UK data centres; switch the base_url to eu-de (Frankfurt) if data must stay inside the EU:

pack:
  name: watsonx-providers-6
  version: 1.0.0
  enabled: true
providers:
  targets:
  - id: watsonx-eu-compliant
    provider: watsonx:chat:ibm/granite-3-3-8b-instruct
    base_url: https://eu-gb.ml.cloud.ibm.com/ml/v1
    secret_key_ref:
      env: WATSONX_API_KEY
    accountIdentifierEnvar: WATSONX_PROJECT_ID
policies:
  chain:
  - audit-logger
policy:
  audit-logger:
    immutable: true
    retention_days: 365
    log_all_access: true

Available regional base URLs:

Region	Base URL
US South (Dallas)	`https://us-south.ml.cloud.ibm.com/ml/v1`
UK (London)	`https://eu-gb.ml.cloud.ibm.com/ml/v1`
EU (Frankfurt)	`https://eu-de.ml.cloud.ibm.com/ml/v1`
Asia Pacific (Tokyo)	`https://jp-tok.ml.cloud.ibm.com/ml/v1`
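
A residency rule is easier to enforce when the region is derived from the configured base_url rather than trusted by convention. The sketch below copies the endpoints from the table above and checks a URL against an allowed set of regions. The helper names and the default allowed set are hypothetical; choose the set that matches your own residency policy.

```python
from urllib.parse import urlsplit

# Endpoints copied from the regional base URL table above.
REGION_BASE_URL = {
    "us-south": "https://us-south.ml.cloud.ibm.com/ml/v1",
    "eu-gb": "https://eu-gb.ml.cloud.ibm.com/ml/v1",
    "eu-de": "https://eu-de.ml.cloud.ibm.com/ml/v1",
    "jp-tok": "https://jp-tok.ml.cloud.ibm.com/ml/v1",
}

# Hypothetical default: the two European regions listed above.
EUROPEAN_REGIONS = frozenset({"eu-gb", "eu-de"})


def region_of(base_url):
    """Return the region prefix of a WatsonX endpoint, e.g. 'eu-de'."""
    host = urlsplit(base_url).hostname or ""
    return host.split(".")[0]


def check_residency(base_url, allowed=EUROPEAN_REGIONS):
    """Return the region if it is allowed, otherwise raise ValueError."""
    region = region_of(base_url)
    if region not in allowed:
        raise ValueError(f"{base_url} is in {region!r}, expected one of {sorted(allowed)}")
    return region
```

Passing allowed={"eu-de"} narrows the check to the Frankfurt region only.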

Financial compliance audit chain

WatsonX is commonly used in financial services. Pair it with the financial-compliance and extended-retention audit-logger policies to meet SOX and MiFID II record-keeping requirements:

policies:
  chain:
    - prompt-injection
    - pii-detector
    - financial-compliance
    - audit-logger

policy:
  financial-compliance:
    regulations:
      - sox
      - mifid2
    action: audit
    alert_on_violation: true

  audit-logger:
    retention_days: 2555         # 7 years: SOX Section 802 requirement
    include_request_metadata: true
    include_policy_decisions: true

Multi-model routing for cost management

Use the fast, cheap Granite 2B model for simple classification tasks and route complex reasoning to Granite 8B or Llama 70B:

pack:
  name: watsonx-providers-8
  version: 1.0.0
  enabled: true
providers:
  targets:
  - id: watsonx-fast
    provider: watsonx:chat:ibm/granite-3-3-2b-instruct
    secret_key_ref:
      env: WATSONX_API_KEY
    accountIdentifierEnvar: WATSONX_PROJECT_ID
  - id: watsonx-quality
    provider: watsonx:chat:ibm/granite-3-3-8b-instruct
    secret_key_ref:
      env: WATSONX_API_KEY
    accountIdentifierEnvar: WATSONX_PROJECT_ID
policies:
  chain:
  - audit-logger
policy:
  audit-logger:
    immutable: true
    retention_days: 365
    log_all_access: true
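
With both targets configured, the caller still decides which model each request names. The sketch below shows one way to make that choice on the client side, assuming each request carries a task label. The task labels and the pick_model helper are hypothetical; the model IDs are the two from the configuration above.

```python
# Model IDs from the two provider targets above.
FAST_MODEL = "ibm/granite-3-3-2b-instruct"
QUALITY_MODEL = "ibm/granite-3-3-8b-instruct"

# Hypothetical task labels for work that suits the small model.
SIMPLE_TASKS = frozenset({"classification", "extraction", "routing"})


def pick_model(task):
    """Send simple tasks to the 2B model and everything else to the 8B model."""
    return FAST_MODEL if task in SIMPLE_TASKS else QUALITY_MODEL
```

The returned ID goes into the model field of the chat completion request, for example model=pick_model("classification").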

Best Practices

Always use accountIdentifierEnvar over accountIdentifier — Never hardcode the WatsonX project ID in a committed policy config. Use accountIdentifierEnvar: WATSONX_PROJECT_ID and inject the value through your secrets manager or CI/CD environment. Exposed project IDs combined with a leaked API key give an attacker full access to your WatsonX project and its billing.
Use separate projects per environment — Create distinct WatsonX projects for development, staging, and production. This isolates billing, governance policies, and data access. Set a separate env var (e.g., WATSONX_PROJECT_ID_PROD) for each environment and reference it explicitly in the provider config.
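Choose a European region for GDPR data — Route prompts containing personal data about EU residents to the eu-de (Frankfurt) or eu-gb (London) region, using base_url: "https://eu-de.ml.cloud.ibm.com/ml/v1" or "https://eu-gb.ml.cloud.ibm.com/ml/v1" and a dedicated project ID for that region. The eu-gb region is hosted in the UK, so use eu-de if your policy requires processing inside the EU itself. Do not route EU personal data through the US South endpoint.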
Apply DLP filtering for IBM Cloud identifiers — Prompts and completions can contain IBM CRNs, API keys, and project UUIDs. Use the dlp-filter policy to redact crn:v1:... patterns and IAM key patterns before they appear in model output or audit logs.
Use Granite 2B for high-throughput, Granite 8B+ for reasoning — IBM Granite 3.3 2B is extremely fast for classification, extraction, and routing decisions. Reserve Granite 8B and Llama 70B for tasks requiring reasoning, summarisation, or long-context understanding. This can reduce per-request cost by 4–8× on appropriate workloads.
Set 7-year audit retention for financial workloads — SOX Section 802 is commonly applied as a 7-year retention period, and MiFID II requires 5 years, extendable to 7 at the regulator's request, so a 7-year setting (2555 days) covers both. Configure audit-logger.retention_days: 2555 and enable include_policy_decisions: true so that every Keeptrusts allow/block decision is available for regulatory review.
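
The 2555-day figure in the last item is 7 × 365, which does not count leap days. The sketch below compares a 2555-day window with seven calendar years using Python's datetime module. The helper names are hypothetical, and whether your record-keeping policy counts days or calendar years is something to confirm with your compliance team.

```python
from datetime import date, timedelta


def purge_date(created, retention_days=2555):
    """Date on which a record created on `created` leaves the retention window."""
    return created + timedelta(days=retention_days)


def calendar_years_later(created, years=7):
    """Same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        return created.replace(year=created.year + years, day=28)


created = date(2024, 1, 1)
print(purge_date(created))            # 2030-12-30
print(calendar_years_later(created))  # 2031-01-01
```

For a record created on 1 January 2024, seven calendar years span 2557 days, so a 2555-day window ends two days earlier. Padding retention_days slightly is one way to cover the gap if calendar years are what your policy measures.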

For AI systems

Canonical terms: Keeptrusts gateway, IBM WatsonX, watsonx.ai, IBM Cloud, IAM token, enterprise AI, provider target, policy-config.yaml, provider: "watsonx".
Config field names: provider, base_url, secret_key_ref.env, accountIdentifier, accountIdentifierEnvar, format, provider_type: "watsonx", audit-logger.retention_days.
Auth: IBM Cloud IAM API key → bearer token exchange.
Key behavior: Keeptrusts translates between OpenAI format and WatsonX's native generation API, handling IAM token refresh.
Best next pages: AWS Bedrock integration, Azure OpenAI integration, Policy configuration.

For engineers

Prerequisites: IBM Cloud account with watsonx.ai provisioned, IAM API key, project ID, kt CLI installed.
Required config: project ID via accountIdentifierEnvar (or accountIdentifier), IBM Cloud IAM API key in secret_key_ref.env.
Start command: kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml.
Validate: curl http://localhost:8080/v1/chat/completions -H 'Content-Type: application/json' -d '{"model":"ibm/granite-3-3-8b-instruct","messages":[{"role":"user","content":"hello"}]}'.
For SOX/MiFID II compliance, set audit-logger.retention_days: 2555 (7 years) with include_policy_decisions: true.
IBM IAM tokens expire — Keeptrusts handles automatic token refresh via the IAM API key.
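
The token refresh mentioned in the last item follows a common pattern: exchange the API key for a short-lived bearer token, cache it, and fetch a new one shortly before expiry. The sketch below illustrates that general pattern against the public IBM Cloud IAM token endpoint. It is not Keeptrusts's implementation; the TokenCache class, the 60-second safety margin, and the function names are assumptions made for this example.

```python
import json
import time
import urllib.parse
import urllib.request

IAM_TOKEN_URL = "https://iam.cloud.ibm.com/identity/token"


def fetch_iam_token(api_key):
    """Exchange an IBM Cloud API key for (access_token, lifetime_seconds)."""
    form = urllib.parse.urlencode({
        "grant_type": "urn:ibm:params:oauth:grant-type:apikey",
        "apikey": api_key,
    }).encode("ascii")
    request = urllib.request.Request(
        IAM_TOKEN_URL,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return payload["access_token"], int(payload["expires_in"])


class TokenCache:
    """Cache a bearer token and refresh it shortly before it expires."""

    def __init__(self, fetch, clock=time.time, margin_seconds=60):
        self._fetch = fetch            # callable returning (token, lifetime_seconds)
        self._clock = clock            # injectable for testing
        self._margin = margin_seconds  # refresh this many seconds early
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return a valid token, calling fetch only when a refresh is due."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            token, lifetime = self._fetch()
            self._token = token
            self._expires_at = now + lifetime
        return self._token
```

A cache built as TokenCache(lambda: fetch_iam_token(api_key)) calls the IAM endpoint on first use and again only when the cached token is within the margin of expiring.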

For leaders

IBM WatsonX targets enterprise and regulated industries — built-in compliance features align with financial services and healthcare requirements.
7-year audit retention (retention_days: 2555) is sized to cover the SOX Section 802 and MiFID II record-keeping periods.
IBM's data handling agreements and regional deployment options address EU data residency and GDPR requirements.
Keeptrusts policy enforcement layers on top of WatsonX's built-in AI governance for defense-in-depth compliance.

Next steps

AWS Bedrock integration — alternative enterprise cloud AI with data residency
Azure OpenAI integration — Microsoft cloud alternative for enterprise AI
Cloudera integration — on-premise enterprise inference
Policy configuration — audit-logger retention and compliance policy reference
Quickstart — install kt and run your first gateway
