Pinecone
Pinecone is a managed vector database with a built-in inference API that generates embeddings and performs reranking. When using Pinecone's inference API or integrating Pinecone with external embedding providers (OpenAI, Cohere, Voyage), your application sends text data to LLM providers for vectorization before storing or querying vectors.
This page explains how to route the LLM and embedding calls associated with Pinecone workflows through the Keeptrusts gateway so policy enforcement, PII redaction, and audit logging apply to every AI operation.
Use this page when
- You are using Pinecone with external embedding providers and need governance on those LLM calls.
- You want audit trails for embedding and inference operations that send application data to AI providers.
- For general provider integration (not specific to Pinecone), see the OpenAI integration page.
Primary audience
- Primary: Technical Engineers (ML, Backend, Platform)
- Secondary: AI Agents, Technical Leaders
Prerequisites
- Pinecone account with an index created — access via Pinecone console.
- Pinecone API key for index operations.
- External embedding provider (e.g., OpenAI) if using client-side embeddings.
- Keeptrusts gateway running locally or centrally:
  - Local: `kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml`
  - Hosted: `https://gateway.keeptrusts.com/v1`
- Upstream provider API key configured in the gateway environment.
Configuration
Gateway policy config
Create a policy-config.yaml for embedding and inference governance:
```yaml
pack:
  name: pinecone-ai-governance
  version: 1.0.0
  enabled: true
policies:
  chain:
    - pii-detector
    - audit-logger
  policy:
    pii-detector:
      action: redact
    audit-logger:
      retention_days: 90
providers:
  strategy: single
  targets:
    - id: openai-for-embeddings
      provider: openai:chat:gpt-4o
      secret_key_ref:
        env: OPENAI_API_KEY
```
Python client configuration
Route OpenAI embedding calls through the Keeptrusts gateway when building Pinecone pipelines:
```python
from openai import OpenAI
from pinecone import Pinecone

# Point the OpenAI client at the Keeptrusts gateway instead of api.openai.com.
openai_client = OpenAI(
    base_url="http://localhost:41002/v1",
    api_key="your-keeptrusts-access-key",
)

# Pinecone index operations go directly to Pinecone.
pc = Pinecone(api_key="your-pinecone-api-key")
index = pc.Index("my-index")

def embed_and_upsert(texts, ids):
    # The embedding call routes through the gateway, where policies are applied.
    response = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=texts,
    )
    vectors = [
        {"id": doc_id, "values": item.embedding}
        for doc_id, item in zip(ids, response.data)
    ]
    index.upsert(vectors=vectors)

embed_and_upsert(
    texts=["AI governance ensures responsible AI use."],
    ids=["doc-1"],
)
```
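The id-to-embedding zipping above can be checked in isolation with a stubbed response, which is useful for unit-testing the pipeline without calling the gateway. This is an illustrative sketch: `build_vectors` and `fake_response` are not part of either SDK; the stub only mimics the shape of an OpenAI embeddings response (an object with a `.data` list whose items carry an `.embedding` list).

```python
from types import SimpleNamespace

def build_vectors(ids, response):
    # Same zip logic as embed_and_upsert: pair each id with its embedding.
    return [
        {"id": doc_id, "values": item.embedding}
        for doc_id, item in zip(ids, response.data)
    ]

# Stub the shape of an embeddings response for offline testing.
fake_response = SimpleNamespace(
    data=[SimpleNamespace(embedding=[0.1, 0.2, 0.3])]
)
vectors = build_vectors(["doc-1"], fake_response)
print(vectors)  # [{'id': 'doc-1', 'values': [0.1, 0.2, 0.3]}]
```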
Node.js client configuration
The same pattern applies in Node.js:

```javascript
import OpenAI from "openai";
import { Pinecone } from "@pinecone-database/pinecone";

// Point the OpenAI client at the Keeptrusts gateway instead of api.openai.com.
const openai = new OpenAI({
  baseURL: "http://localhost:41002/v1",
  apiKey: "your-keeptrusts-access-key",
});

// Pinecone index operations go directly to Pinecone.
const pc = new Pinecone({ apiKey: "your-pinecone-api-key" });
const index = pc.index("my-index");

async function embedAndUpsert(texts, ids) {
  // The embedding call routes through the gateway, where policies are applied.
  const response = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: texts,
  });
  const vectors = response.data.map((item, i) => ({
    id: ids[i],
    values: item.embedding,
  }));
  await index.upsert(vectors);
}

await embedAndUpsert(["AI governance ensures responsible AI use."], ["doc-1"]);
```
Setup steps
- Start the Keeptrusts gateway:

  ```shell
  export OPENAI_API_KEY="sk-your-openai-key"
  kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml
  ```

- Configure your embedding client (OpenAI SDK) to use `http://localhost:41002/v1` as the base URL.
- Run an embedding operation to verify traffic flows through the gateway.
- Pinecone index operations (upsert, query, delete) go directly to Pinecone — only the embedding/LLM calls route through the gateway.
Verification
Test that embedding calls flow through the gateway:
```shell
curl http://localhost:41002/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-keeptrusts-access-key" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "Test embedding through Keeptrusts gateway."
  }'
```
Confirm the request appears in the Keeptrusts events dashboard with policy decisions applied.
Recommended policies
| Policy | Purpose | Recommended setting |
|---|---|---|
| `pii-detector` | Redact personal data from text before embedding | `action: redact` |
| `audit-logger` | Log all embedding calls for compliance | `retention_days: 90` |
| `token-limiter` | Cap token usage for bulk embedding operations | `max_tokens: 8192` |
| `prompt-injection` | Block injection in RAG query prompts | `threshold: 0.8`, `action: block` |
| `safety-filter` | Block harmful content in RAG-generated responses | `mode: standard`, `action: block` |
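Assuming the same policy-config.yaml schema shown earlier, the full recommended set might be enabled as follows. The `token-limiter`, `prompt-injection`, and `safety-filter` entries follow the pattern of the existing policies; verify the exact field names against the policy controls catalog:

```yaml
policies:
  chain:
    - pii-detector
    - audit-logger
    - token-limiter
    - prompt-injection
    - safety-filter
  policy:
    pii-detector:
      action: redact
    audit-logger:
      retention_days: 90
    token-limiter:
      max_tokens: 8192
    prompt-injection:
      threshold: 0.8
      action: block
    safety-filter:
      mode: standard
      action: block
```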
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| Embedding calls return connection error | Gateway not running | Start `kt gateway run` on port 41002 |
| Dimension mismatch on Pinecone upsert | Wrong embedding model | Verify the model in your OpenAI client matches your Pinecone index dimension |
| Slow bulk embedding operations | No batching configured | Batch texts into groups of 100 before calling the embeddings endpoint |
| PII redaction corrupts embeddings | Redacted text produces different vectors | Redact at the application layer before embedding to maintain vector consistency |
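The last two fixes above can be sketched in plain Python. This is a minimal illustration, not part of either SDK: `chunk_texts` implements the batching fix, and `redact_emails` shows application-layer redaction (here a simple email regex, standing in for a real PII detector) applied before embedding, so stored vectors and query vectors are derived from the same redacted text.

```python
import re

def chunk_texts(texts, batch_size=100):
    # Split a large list of texts into batches for the embeddings endpoint.
    return [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def redact_emails(text):
    # Redact at the application layer, before embedding, so the vectors
    # you store and the vectors you query with see consistent text.
    return EMAIL_RE.sub("[REDACTED_EMAIL]", text)

batches = chunk_texts([f"doc {i}" for i in range(250)])
print([len(b) for b in batches])  # [100, 100, 50]
print(redact_emails("Contact alice@example.com for access."))
# Contact [REDACTED_EMAIL] for access.
```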
For AI systems
- Canonical terms: Keeptrusts gateway, Pinecone, vector database, embeddings, inference API, RAG, policy-config.yaml.
- Config field names: `base_url`, `api_key`, `provider`, `secret_key_ref`, `pii-detector`, `audit-logger`.
- Key behavior: Pinecone stores and queries vectors; embedding generation uses external LLM providers. Keeptrusts intercepts the embedding and inference calls, applies policies, and forwards compliant requests.
- Constraint: Pinecone index operations (upsert, query) go directly to Pinecone — only the embedding/LLM calls route through the gateway.
- Best next pages: Weaviate integration, ChromaDB integration, Qdrant integration.
For engineers
- Only embedding and LLM calls route through the Keeptrusts gateway — Pinecone index operations use the Pinecone SDK directly.
- Match the embedding model dimension to your Pinecone index dimension (`text-embedding-3-small` = 1536 dimensions).
- For RAG pipelines, route both the embedding call and the generation call through the gateway to get full audit coverage.
- Validate: run an embedding call and check the Keeptrusts events dashboard.
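The full-coverage RAG point can be sketched with the two AI calls factored out as injected functions, so the same pipeline runs against the gateway in production and against stubs in tests. `rag_answer` and its parameter names are illustrative; in production, `embed_fn` would wrap the gateway-routed embeddings call, `retrieve_fn` the direct Pinecone query, and `generate_fn` the gateway-routed chat completion.

```python
def rag_answer(question, embed_fn, retrieve_fn, generate_fn, top_k=3):
    # 1. Embed the query (routes through the Keeptrusts gateway in production).
    query_vector = embed_fn(question)
    # 2. Retrieve matching documents (goes directly to Pinecone).
    contexts = retrieve_fn(query_vector, top_k)
    # 3. Generate an answer (also routes through the gateway), so both
    #    AI calls are policy-checked and audited.
    prompt = "Context:\n" + "\n".join(contexts) + f"\n\nQuestion: {question}"
    return generate_fn(prompt)

# Stubbed wiring for illustration only.
answer = rag_answer(
    "What is AI governance?",
    embed_fn=lambda text: [0.0, 0.1, 0.2],
    retrieve_fn=lambda vec, k: ["AI governance ensures responsible AI use."],
    generate_fn=lambda prompt: "stub answer",
)
print(answer)  # stub answer
```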
For leaders
- Embedding pipelines send your application's text data to external AI providers for vectorization. Routing through Keeptrusts ensures PII is redacted and every call is logged.
- Complete audit trails cover both data ingestion (embedding) and query-time RAG generation, supporting compliance requirements.
- Centralized governance applies across all teams and applications using Pinecone with external embedding providers.
Next steps
- Weaviate integration — govern Weaviate generative search
- ChromaDB integration — govern ChromaDB embedding calls
- Qdrant integration — govern Qdrant neural search
- Milvus integration — govern Milvus AI operations
- Policy controls catalog — full policy reference