Perplexity AI
Perplexity AI provides a family of online and offline language models purpose-built for search-augmented generation. The sonar family retrieves live web sources at inference time and returns answers with inline citations, while r1-1776 is an offline reasoning model suitable for sensitive workloads where live retrieval is not appropriate. Keeptrusts wraps the Perplexity API with policy enforcement so you can redact PII from search-grounded outputs, verify citation quality, and maintain a complete audit trail for research workflows.
Use this page when
- You need the exact command, config, API, or integration details for Perplexity AI.
- You are wiring automation or AI retrieval and need canonical names, examples, and constraints.
- If you want a guided rollout instead of a reference page, use the linked workflow pages in Next steps.
Primary audience
- Primary: AI Agents, Technical Engineers
- Secondary: Technical Leaders
Prerequisites
- A Perplexity AI API key (PERPLEXITY_API_KEY)
- kt CLI installed and authenticated (kt auth login)
Set your key before starting the gateway:
export PERPLEXITY_API_KEY="pplx-..."
Configuration
Minimal — single online model
pack:
  name: perplexity-providers-1
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: perplexity-sonar-pro
      provider: perplexity:chat:sonar-pro
      secret_key_ref:
        env: PERPLEXITY_API_KEY
policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
Full governance config
pack:
  name: perplexity-research
  version: 1.0.0
  enabled: true
policies:
  chain:
    - prompt-injection
    - pii-detector
    - citation-verifier
    - content-filter
    - audit-logger
  policy:
    pii-detector:
      action: redact
      entities:
        - PERSON
        - EMAIL_ADDRESS
        - PHONE_NUMBER
        - CREDIT_CARD
    citation-verifier:
      require_sources: true
      min_grounded_ratio: 0.8
      action_on_failure: warn
    content-filter:
      categories:
        - hate_speech
        - harassment
        - self_harm
      action: block
providers:
  targets:
    - id: perplexity-sonar-pro
      provider: perplexity:chat:sonar-pro
      secret_key_ref:
        env: PERPLEXITY_API_KEY
    - id: perplexity-sonar
      provider: perplexity:chat:sonar
      secret_key_ref:
        env: PERPLEXITY_API_KEY
    - id: perplexity-deep-research
      provider: perplexity:chat:sonar-deep-research
      secret_key_ref:
        env: PERPLEXITY_API_KEY
    - id: perplexity-reasoning-pro
      provider: perplexity:chat:sonar-reasoning-pro
      secret_key_ref:
        env: PERPLEXITY_API_KEY
    - id: perplexity-offline
      provider: perplexity:chat:r1-1776
      secret_key_ref:
        env: PERPLEXITY_API_KEY
Provider Fields
| Field | Required | Description |
|---|---|---|
| provider | Yes | "perplexity" or "perplexity:chat:{model-id}" |
| secret_key_ref | Yes | Environment variable holding the Perplexity API key (e.g. PERPLEXITY_API_KEY) |
| base_url | No | Defaults to https://api.perplexity.ai — override only for proxied or on-prem deployments |
| model | No | Model ID when using the bare "perplexity" provider |
| format | No | "openai" (Perplexity exposes an OpenAI-compatible endpoint) |
| stream_timeout_seconds | No | Increase for sonar-deep-research (300+) which performs multi-step retrieval |
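Taken together, a target written in the bare "perplexity" form with the optional fields spelled out might look like the sketch below (the id and field values are illustrative, not canonical):
providers:
  targets:
    - id: perplexity-custom
      provider: perplexity
      model: sonar-pro
      format: openai
      base_url: https://api.perplexity.ai
      stream_timeout_seconds: 60
      secret_key_ref:
        env: PERPLEXITY_API_KEY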
Supported Models
| Model | Context | Search | Input (per 1M) | Output (per 1M) | Notes |
|---|---|---|---|---|---|
| sonar-pro | 127k | Live web | $3.00 | $15.00 | Advanced reasoning + citations; recommended default |
| sonar | 127k | Live web | $1.00 | $1.00 | Fast, cost-efficient search-augmented generation |
| sonar-deep-research | 127k | Multi-step | $2.00 | $8.00 | Autonomous research; completes in 30s–5min |
| sonar-reasoning-pro | 127k | Live web | $2.00 | $8.00 | Chain-of-thought reasoning with search grounding |
| sonar-reasoning | 127k | Live web | $1.00 | $5.00 | Reasoning with citations at lower cost |
| r1-1776 | 128k | None (offline) | $2.00 | $8.00 | No web retrieval; safe for sensitive/ZDR workloads |
Note on online models and data policy — sonar, sonar-pro, sonar-reasoning, sonar-reasoning-pro, and sonar-deep-research perform live web retrieval at request time. This is incompatible with zero_data_retention: true because retrieval inherently externalises query context. Use r1-1776 when ZDR is required.
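As a rough illustration of the pricing table, a single sonar-pro call with 2,000 input tokens and 1,000 output tokens works out to about two cents (a back-of-the-envelope sketch; actual billing is determined by Perplexity):
# sonar-pro pricing from the table above: $3.00 per 1M input tokens, $15.00 per 1M output tokens
input_tokens, output_tokens = 2_000, 1_000
cost = (input_tokens / 1_000_000) * 3.00 + (output_tokens / 1_000_000) * 15.00
print(f"${cost:.4f}")  # $0.0210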
Client Examples
Start the gateway:
export PERPLEXITY_API_KEY="pplx-..."
kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml
- Python
- Node.js
- cURL
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:41002/v1",
    api_key="unused",  # auth handled by Keeptrusts
)

# Standard sonar-pro search-augmented query
response = client.chat.completions.create(
    model="sonar-pro",
    messages=[
        {
            "role": "system",
            "content": "Be precise and concise. Always cite your sources.",
        },
        {
            "role": "user",
            "content": "What EU AI Act obligations take effect in August 2025?",
        },
    ],
    max_tokens=1024,
)
print(response.choices[0].message.content)

# Offline model for sensitive input
offline = client.chat.completions.create(
    model="r1-1776",
    messages=[
        {"role": "user", "content": "Analyse the following internal policy document..."}
    ],
    max_tokens=2048,
)
print(offline.choices[0].message.content)
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:41002/v1",
  apiKey: "unused",
});

// Standard sonar-pro query
const response = await client.chat.completions.create({
  model: "sonar-pro",
  messages: [
    {
      role: "system",
      content: "Be precise and concise. Always cite your sources.",
    },
    {
      role: "user",
      content: "What EU AI Act obligations take effect in August 2025?",
    },
  ],
  max_tokens: 1024,
});
console.log(response.choices[0].message.content);

// Deep research — patience required
const research = await client.chat.completions.create({
  model: "sonar-deep-research",
  messages: [
    {
      role: "user",
      content:
        "Produce a comprehensive analysis of GDPR enforcement actions in 2024, including fines, responsible authorities, and precedents set.",
    },
  ],
  max_tokens: 4096,
});
console.log(research.choices[0].message.content);
# sonar-pro search-augmented query
curl -s http://localhost:41002/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sonar-pro",
    "messages": [
      {"role": "system", "content": "Be precise and concise. Always cite your sources."},
      {"role": "user", "content": "What EU AI Act obligations take effect in August 2025?"}
    ],
    "max_tokens": 1024
  }' | jq '.choices[0].message.content'

# Offline reasoning (sensitive data safe)
curl -s http://localhost:41002/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "r1-1776",
    "messages": [
      {"role": "user", "content": "Analyse the following internal policy document..."}
    ],
    "max_tokens": 2048
  }' | jq '.choices[0].message.content'
Streaming
All Perplexity models support streaming. For sonar-deep-research, streaming is particularly important because multi-step retrieval can take several minutes — streaming lets you display progress tokens as they arrive rather than blocking the client.
from openai import OpenAI
client = OpenAI(base_url="http://localhost:41002/v1", api_key="unused")
stream = client.chat.completions.create(
    model="sonar-deep-research",
    messages=[
        {
            "role": "user",
            "content": "Produce a comprehensive analysis of AI governance regulations enacted globally in 2024.",
        }
    ],
    max_tokens=8192,
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()  # newline after stream
Set stream_timeout_seconds appropriately per model in your config:
pack:
  name: perplexity-providers-3
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: perplexity-deep-research
      provider: perplexity:chat:sonar-deep-research
      stream_timeout_seconds: 300
      secret_key_ref:
        env: PERPLEXITY_API_KEY
    - id: perplexity-sonar-pro
      provider: perplexity:chat:sonar-pro
      stream_timeout_seconds: 60
      secret_key_ref:
        env: PERPLEXITY_API_KEY
policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true
Advanced Configuration
Routing sensitive queries to the offline model
For workflows that mix public research queries with sensitive internal analysis, use Keeptrusts's routing policy to direct queries containing classified terms to r1-1776 and public queries to sonar-pro:
policies:
  chain:
    - content-classifier
    - router
    - pii-detector
    - audit-logger
  policy:
    content-classifier:
      labels:
        sensitive:
          keywords:
            - "internal"
            - "confidential"
            - "proprietary"
            - "classified"
        public:
          default: true
    router:
      rules:
        - when_label: "sensitive"
          target: "perplexity-offline"
        - when_label: "public"
          target: "perplexity-sonar-pro"
providers:
  targets:
    - id: "perplexity-sonar-pro"
      provider: "perplexity:chat:sonar-pro"
      secret_key_ref:
        env: "PERPLEXITY_API_KEY"
    - id: "perplexity-offline"
      provider: "perplexity:chat:r1-1776"
      secret_key_ref:
        env: "PERPLEXITY_API_KEY"
Citation verification for compliance workflows
Research outputs used in compliance, legal, or regulatory filings require grounded citations. Combine citation-verifier with response auditing so rejected responses are logged:
policy:
  citation-verifier:
    require_sources: true
    require_source_match: true
    min_grounded_ratio: 0.85
  audit-logger:
    immutable: true
    retention_days: 365
    log_all_access: true
pack:
  name: perplexity-example-5
  version: 1.0.0
  enabled: true
policies:
  chain:
    - citation-verifier
    - audit-logger
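On the client side, a lightweight companion check is to pull the cited URLs out of a response for a reviewer checklist. The sketch below assumes citations appear as plain URLs in the message body; a simple regex will not catch every citation format Perplexity may use.
import re

from openai import OpenAI

client = OpenAI(base_url="http://localhost:41002/v1", api_key="unused")

response = client.chat.completions.create(
    model="sonar-pro",
    messages=[{"role": "user", "content": "Summarise 2024 GDPR enforcement actions, with sources."}],
    max_tokens=1024,
)
answer = response.choices[0].message.content

# Extract cited source URLs from the answer text.
urls = re.findall(r"https?://[^\s)\]]+", answer)
if not urls:
    print("No citations found; escalate for manual review.")
for url in urls:
    print("cited:", url)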
Cost controls with model tiering
Perplexity's sonar is 3× cheaper than sonar-pro on input and 15× cheaper on output. Use model tiering to route simple factual queries to sonar and complex multi-document research to sonar-pro or sonar-deep-research:
policy:
  cost-guard:
    tiers:
      low:
        max_tokens_per_request: 512
        target: perplexity-sonar
      medium:
        max_tokens_per_request: 2048
        target: perplexity-sonar-pro
      high:
        max_tokens_per_request: 8192
        target: perplexity-deep-research
        requires_role: researcher
providers:
  targets:
    - id: perplexity-sonar
      provider: perplexity:chat:sonar
      secret_key_ref:
        env: PERPLEXITY_API_KEY
    - id: perplexity-sonar-pro
      provider: perplexity:chat:sonar-pro
      secret_key_ref:
        env: PERPLEXITY_API_KEY
    - id: perplexity-deep-research
      provider: perplexity:chat:sonar-deep-research
      secret_key_ref:
        env: PERPLEXITY_API_KEY
pack:
  name: perplexity-example-6
  version: 1.0.0
  enabled: true
policies:
  chain:
    - cost-guard
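Assuming cost-guard selects the tier from the request's max_tokens (an assumption of this sketch; the high tier additionally requires the researcher role, which is out of scope here), client requests land on tiers like this:
from openai import OpenAI

client = OpenAI(base_url="http://localhost:41002/v1", api_key="unused")

# max_tokens <= 512 -> low tier -> perplexity-sonar
quick = client.chat.completions.create(
    model="sonar",
    messages=[{"role": "user", "content": "What year did the GDPR take effect?"}],
    max_tokens=256,
)

# max_tokens <= 2048 -> medium tier -> perplexity-sonar-pro
detailed = client.chat.completions.create(
    model="sonar-pro",
    messages=[{"role": "user", "content": "Compare GDPR and EU AI Act enforcement models."}],
    max_tokens=1536,
)
print(quick.choices[0].message.content)
print(detailed.choices[0].message.content)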
Best Practices
- Do not send sensitive data to online models — sonar, sonar-pro, sonar-reasoning, and sonar-deep-research send the query to Perplexity's retrieval infrastructure. Use r1-1776 for internal documents, PII-bearing queries, or any workload governed by a ZDR policy.
- Set stream_timeout_seconds per model tier — sonar-deep-research can run for 3–5 minutes. Without an adequate timeout, Keeptrusts will terminate the stream prematurely. Set at least 300 seconds for deep-research targets and 60 seconds for sonar-pro.
- Enable citation verification for compliance outputs — research responses used in evidence packages, regulatory filings, or legal briefs should pass through citation-verifier with min_grounded_ratio: 0.85. Log failures for reviewer escalation rather than silently accepting ungrounded answers.
- Redact PII before the query reaches the model — Perplexity online models may cite sources that include query terms. A name or email in a prompt could appear in a cited page. Apply pii-detector on the request path to remove identifiers before they become part of the search query.
- Use sonar for high-volume applications — sonar costs $1/1M tokens for both input and output, 15× cheaper than sonar-pro on output. For chatbots or assistants that don't require deep citation analysis, sonar is the right default tier.
- Log the full response including citations — Perplexity returns source URLs in the message content. Enable include_response: true in audit-logger so your audit trail captures the cited sources, not just the model's answer. This is essential for tracing back AI-generated claims in regulated industries (see the sketch after this list).
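For example, the last two practices translate into an audit-logger block like this sketch (include_response is the field named above; verify it against your Keeptrusts policy reference):
policy:
  audit-logger:
    immutable: true
    retention_days: 365
    log_all_access: true
    include_response: true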
For AI systems
- Canonical terms: Keeptrusts gateway, Perplexity AI, Perplexity, online models, real-time search, citations, provider target, policy-config.yaml, provider: "perplexity".
- Config field names: provider, model, base_url: "https://api.perplexity.ai", secret_key_ref.env: "PERPLEXITY_API_KEY", format: "openai", pricing.
- Key behavior: Perplexity returns source URLs in message content — responses include citations that can be verified.
- Best next pages: Cohere integration (RAG/citations), OpenAI integration, Policy configuration.
For engineers
- Prerequisites: Perplexity API key (PERPLEXITY_API_KEY env var from perplexity.ai), kt CLI installed.
- Start command: kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml.
- Validate: curl http://localhost:41002/v1/chat/completions -H 'Content-Type: application/json' -d '{"model":"sonar-pro","messages":[{"role":"user","content":"What is the latest news on AI regulation?"}]}'.
- Enable include_response: true in audit-logger to capture cited source URLs in the audit trail — essential for tracing AI-generated claims.
- Perplexity uses an OpenAI-compatible API — standard OpenAI SDKs work without modification.
- Online models have variable latency depending on search complexity — set an appropriate stream_timeout_seconds.
For leaders
- Perplexity's citation-backed responses provide verifiable AI outputs — critical for regulated industries where claims must be traceable.
- Real-time search means responses reflect current information — reduces hallucination risk for time-sensitive queries.
- Audit trail should capture full responses including citations (include_response: true) for compliance evidence.
- Online model latency is less predictable than offline models — set timeout and fallback strategies accordingly.
Next steps
- Cohere integration — alternative citation-aware provider with RAG support
- OpenAI integration — offline models for deterministic workloads
- Provider routing strategies — fallback from online to offline models
- Policy configuration — audit-logger and citation-verifier reference
- Quickstart — install kt and run your first gateway