
Perplexity AI

Perplexity AI provides a family of online and offline language models purpose-built for search-augmented generation. The sonar family retrieves live web sources at inference time and returns answers with inline citations, while r1-1776 is an offline reasoning model suitable for sensitive workloads where live retrieval is not appropriate. Keeptrusts wraps the Perplexity API with policy enforcement so you can redact PII from search-grounded outputs, verify citation quality, and maintain a complete audit trail for research workflows.

Use this page when

  • You need the exact command, config, API, or integration details for Perplexity AI.
  • You are wiring automation or AI retrieval and need canonical names, examples, and constraints.
  • You want a guided rollout instead of a reference page — in that case, follow the linked workflow pages under Next steps.

Primary audience

  • Primary: AI Agents, Technical Engineers
  • Secondary: Technical Leaders

Prerequisites

Set your key before starting the gateway:

export PERPLEXITY_API_KEY="pplx-..."

Configuration

Minimal — single online model

pack:
  name: perplexity-providers-1
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: perplexity-sonar-pro
      provider: perplexity:chat:sonar-pro
      secret_key_ref:
        env: PERPLEXITY_API_KEY
policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true

Full governance config

pack:
  name: perplexity-research
  version: 1.0.0
  enabled: true
policies:
  chain:
    - prompt-injection
    - pii-detector
    - citation-verifier
    - content-filter
    - audit-logger
  policy:
    pii-detector:
      action: redact
      entities:
        - PERSON
        - EMAIL_ADDRESS
        - PHONE_NUMBER
        - CREDIT_CARD
    citation-verifier:
      require_sources: true
      min_grounded_ratio: 0.8
      action_on_failure: warn
    content-filter:
      categories:
        - hate_speech
        - harassment
        - self_harm
      action: block
providers:
  targets:
    - id: perplexity-sonar-pro
      provider: perplexity:chat:sonar-pro
      secret_key_ref:
        env: PERPLEXITY_API_KEY
    - id: perplexity-sonar
      provider: perplexity:chat:sonar
      secret_key_ref:
        env: PERPLEXITY_API_KEY
    - id: perplexity-deep-research
      provider: perplexity:chat:sonar-deep-research
      secret_key_ref:
        env: PERPLEXITY_API_KEY
    - id: perplexity-reasoning-pro
      provider: perplexity:chat:sonar-reasoning-pro
      secret_key_ref:
        env: PERPLEXITY_API_KEY
    - id: perplexity-offline
      provider: perplexity:chat:r1-1776
      secret_key_ref:
        env: PERPLEXITY_API_KEY

Provider Fields

| Field | Required | Description |
| --- | --- | --- |
| provider | Yes | "perplexity" or "perplexity:chat:{model-id}" |
| secret_key_ref | Yes | Environment variable holding the Perplexity API key (e.g. PERPLEXITY_API_KEY) |
| base_url | No | Defaults to https://api.perplexity.ai; override only for proxied or on-prem deployments |
| model | No | Model ID when using the bare "perplexity" provider |
| format | No | "openai" (Perplexity exposes an OpenAI-compatible endpoint) |
| stream_timeout_seconds | No | Increase for sonar-deep-research (300+), which performs multi-step retrieval |
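
Putting the optional fields together, a target that uses the bare "perplexity" provider with an explicit model and an overridden base_url might look like the sketch below. The field names follow the table above; the proxy hostname is a placeholder for illustration only.

providers:
  targets:
    - id: perplexity-custom
      provider: perplexity
      model: sonar
      format: openai
      base_url: https://llm-proxy.example.internal/perplexity  # placeholder proxy URL
      stream_timeout_seconds: 60
      secret_key_ref:
        env: PERPLEXITY_API_KEY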

Supported Models

| Model | Context | Search | Input (per 1M) | Output (per 1M) | Notes |
| --- | --- | --- | --- | --- | --- |
| sonar-pro | 127k | Live web | $3.00 | $15.00 | Advanced reasoning + citations; recommended default |
| sonar | 127k | Live web | $1.00 | $1.00 | Fast, cost-efficient search-augmented generation |
| sonar-deep-research | 127k | Multi-step | $2.00 | $8.00 | Autonomous research; completes in 30s–5min |
| sonar-reasoning-pro | 127k | Live web | $2.00 | $8.00 | Chain-of-thought reasoning with search grounding |
| sonar-reasoning | 127k | Live web | $1.00 | $5.00 | Reasoning with citations at lower cost |
| r1-1776 | 128k | None (offline) | $2.00 | $8.00 | No web retrieval; safe for sensitive/ZDR workloads |

Note on online models and data policy: sonar, sonar-pro, sonar-reasoning, and sonar-reasoning-pro perform live web retrieval at request time. This is incompatible with zero_data_retention: true because retrieval inherently externalises query context. Use r1-1776 when ZDR is required.
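
The per-token prices in the table can be turned into a quick per-request cost estimate. The helper below is an illustrative sketch: the prices are copied from the table above and will drift as Perplexity updates pricing, so verify them before relying on the numbers.

```python
# Prices in USD per 1M tokens, copied from the pricing table above.
# Check current Perplexity pricing before using these in production.
PRICING = {
    "sonar":               {"input": 1.00, "output": 1.00},
    "sonar-pro":           {"input": 3.00, "output": 15.00},
    "sonar-deep-research": {"input": 2.00, "output": 8.00},
    "sonar-reasoning-pro": {"input": 2.00, "output": 8.00},
    "sonar-reasoning":     {"input": 1.00, "output": 5.00},
    "r1-1776":             {"input": 2.00, "output": 8.00},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of a single request against the given model."""
    p = PRICING[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# A 2,000-token prompt with a 1,000-token answer:
# sonar:     (2000 * 1 + 1000 * 1)  / 1e6 = $0.003
# sonar-pro: (2000 * 3 + 1000 * 15) / 1e6 = $0.021
```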

Client Examples

Start the gateway:

export PERPLEXITY_API_KEY="pplx-..."
kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:41002/v1",
    api_key="unused",  # auth handled by Keeptrusts
)

# Standard sonar-pro search-augmented query
response = client.chat.completions.create(
    model="sonar-pro",
    messages=[
        {
            "role": "system",
            "content": "Be precise and concise. Always cite your sources.",
        },
        {
            "role": "user",
            "content": "What EU AI Act obligations take effect in August 2025?",
        },
    ],
    max_tokens=1024,
)
print(response.choices[0].message.content)

# Offline model for sensitive input
offline = client.chat.completions.create(
    model="r1-1776",
    messages=[
        {"role": "user", "content": "Analyse the following internal policy document..."}
    ],
    max_tokens=2048,
)
print(offline.choices[0].message.content)

Streaming

All Perplexity models support streaming. For sonar-deep-research, streaming is particularly important because multi-step retrieval can take several minutes — streaming lets you display progress tokens as they arrive rather than blocking the client.

from openai import OpenAI

client = OpenAI(base_url="http://localhost:41002/v1", api_key="unused")

stream = client.chat.completions.create(
    model="sonar-deep-research",
    messages=[
        {
            "role": "user",
            "content": "Produce a comprehensive analysis of AI governance regulations enacted globally in 2024.",
        }
    ],
    max_tokens=8192,
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()  # newline after stream
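
The token-accumulation step in the loop above can be factored out and unit-tested without a live gateway. The stub classes below mirror the shape of streamed chat-completion chunks (choices, delta, content); they are test doubles, not the OpenAI SDK types.

```python
from dataclasses import dataclass
from typing import List, Optional

# Minimal stand-ins for streamed chat-completion chunks (illustration only).
@dataclass
class Delta:
    content: Optional[str]

@dataclass
class Choice:
    delta: Delta

@dataclass
class Chunk:
    choices: List[Choice]

def collect_stream(chunks) -> str:
    """Accumulate delta tokens from a chat-completions stream into one string."""
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content
        if delta:  # deltas may be None on role/finish chunks
            parts.append(delta)
    return "".join(parts)
```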

Set stream_timeout_seconds appropriately per model in your config:

pack:
  name: perplexity-providers-3
  version: 1.0.0
  enabled: true
providers:
  targets:
    - id: perplexity-deep-research
      provider: perplexity:chat:sonar-deep-research
      stream_timeout_seconds: 300
      secret_key_ref:
        env: PERPLEXITY_API_KEY
    - id: perplexity-sonar-pro
      provider: perplexity:chat:sonar-pro
      stream_timeout_seconds: 60
      secret_key_ref:
        env: PERPLEXITY_API_KEY
policies:
  chain:
    - audit-logger
  policy:
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true

Advanced Configuration

Routing sensitive queries to the offline model

For workflows that mix public research queries with sensitive internal analysis, use the Keeptrusts routing policy to direct queries containing sensitive keywords to r1-1776 and public queries to sonar-pro:

policies:
  chain:
    - content-classifier
    - router
    - pii-detector
    - audit-logger
  policy:
    content-classifier:
      labels:
        sensitive:
          keywords:
            - "internal"
            - "confidential"
            - "proprietary"
            - "classified"
        public:
          default: true
    router:
      rules:
        - when_label: "sensitive"
          target: "perplexity-offline"
        - when_label: "public"
          target: "perplexity-sonar-pro"

providers:
  targets:
    - id: "perplexity-sonar-pro"
      provider: "perplexity:chat:sonar-pro"
      secret_key_ref:
        env: "PERPLEXITY_API_KEY"
    - id: "perplexity-offline"
      provider: "perplexity:chat:r1-1776"
      secret_key_ref:
        env: "PERPLEXITY_API_KEY"
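
The effect of the classifier-plus-router pair can be sketched in plain Python. This is an illustration of the routing logic, not Keeptrusts internals; the keywords, labels, and target IDs mirror the config above.

```python
# Keywords and targets mirror the content-classifier / router config above.
SENSITIVE_KEYWORDS = ("internal", "confidential", "proprietary", "classified")

ROUTES = {
    "sensitive": "perplexity-offline",  # r1-1776, no web retrieval
    "public": "perplexity-sonar-pro",   # live search with citations
}

def classify(query: str) -> str:
    """Label a query 'sensitive' if any flagged keyword appears, else 'public'."""
    text = query.lower()
    return "sensitive" if any(k in text for k in SENSITIVE_KEYWORDS) else "public"

def route(query: str) -> str:
    """Map a query to the provider target its label resolves to."""
    return ROUTES[classify(query)]
```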

Citation verification for compliance workflows

Research outputs used in compliance, legal, or regulatory filings require grounded citations. Combine citation-verifier with response auditing so rejected responses are logged:

pack:
  name: perplexity-example-5
  version: 1.0.0
  enabled: true
policies:
  chain:
    - citation-verifier
    - audit-logger
  policy:
    citation-verifier:
      require_sources: true
      require_source_match: true
      min_grounded_ratio: 0.85
    audit-logger:
      immutable: true
      retention_days: 365
      log_all_access: true

Cost controls with model tiering

Perplexity's sonar is 3× cheaper than sonar-pro on input and 15× cheaper on output. Use model tiering to route simple factual queries to sonar and complex multi-document research to sonar-pro or sonar-deep-research:

pack:
  name: perplexity-example-6
  version: 1.0.0
  enabled: true
policies:
  chain:
    - cost-guard
  policy:
    cost-guard:
      tiers:
        low:
          max_tokens_per_request: 512
          target: perplexity-sonar
        medium:
          max_tokens_per_request: 2048
          target: perplexity-sonar-pro
        high:
          max_tokens_per_request: 8192
          target: perplexity-deep-research
          requires_role: researcher
providers:
  targets:
    - id: perplexity-sonar
      provider: perplexity:chat:sonar
      secret_key_ref:
        env: PERPLEXITY_API_KEY
    - id: perplexity-sonar-pro
      provider: perplexity:chat:sonar-pro
      secret_key_ref:
        env: PERPLEXITY_API_KEY
    - id: perplexity-deep-research
      provider: perplexity:chat:sonar-deep-research
      secret_key_ref:
        env: PERPLEXITY_API_KEY
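
Tier selection by request size can be sketched as a threshold lookup. The limits and target IDs below mirror the cost-guard config above; this is an illustration of the tiering behaviour, not the actual policy engine.

```python
# Thresholds mirror the cost-guard tiers above (low / medium / high).
TIERS = [
    (512, "perplexity-sonar"),
    (2048, "perplexity-sonar-pro"),
    (8192, "perplexity-deep-research"),
]

def select_target(max_tokens: int) -> str:
    """Return the cheapest tier whose token budget covers the request."""
    for limit, target in TIERS:
        if max_tokens <= limit:
            return target
    raise ValueError(f"max_tokens={max_tokens} exceeds all configured tiers")
```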

Best Practices

  1. Do not send sensitive data to online models — sonar, sonar-pro, sonar-reasoning, and sonar-deep-research send the query to Perplexity's retrieval infrastructure. Use r1-1776 for internal documents, PII-bearing queries, or any workload governed by a ZDR policy.

  2. Set stream_timeout_seconds per model tier — sonar-deep-research can run for 3–5 minutes. Without an adequate timeout, Keeptrusts will terminate the stream prematurely. Set at least 300 seconds for deep-research targets and 60 seconds for sonar-pro.

  3. Enable citation verification for compliance outputs — Research responses used in evidence packages, regulatory filings, or legal briefs should pass through citation-verifier with min_grounded_ratio: 0.85. Log failures for reviewer escalation rather than silently accepting ungrounded answers.

  4. Redact PII before the query reaches the model — Perplexity online models may cite sources that include query terms. A name or email in a prompt could appear in a cited page. Apply pii-detector on the request path to remove identifiers before they become part of the search query.

  5. Use sonar for high-volume applications — sonar costs $1/1M tokens for both input and output, 15× cheaper than sonar-pro on output. For chatbots or assistants that don't require deep citation analysis, sonar is the right default tier.

  6. Log the full response including citations — Perplexity returns source URLs in the message content. Enable include_response: true in audit-logger so your audit trail captures the cited sources, not just the model's answer. This is essential for tracing back AI-generated claims in regulated industries.

For AI systems

  • Canonical terms: Keeptrusts gateway, Perplexity AI, Perplexity, online models, real-time search, citations, provider target, policy-config.yaml, provider: "perplexity".
  • Config field names: provider, model, base_url: "https://api.perplexity.ai", secret_key_ref.env: "PERPLEXITY_API_KEY", format: "openai", pricing.
  • Key behavior: Perplexity returns source URLs in message content — responses include citations that can be verified.
  • Best next pages: Cohere integration (RAG/citations), OpenAI integration, Policy configuration.

For engineers

  • Prerequisites: Perplexity API key (PERPLEXITY_API_KEY env var from perplexity.ai), kt CLI installed.
  • Start command: kt gateway run --listen 0.0.0.0:41002 --policy-config policy-config.yaml.
  • Validate: curl http://localhost:41002/v1/chat/completions -H 'Content-Type: application/json' -d '{"model":"sonar-pro","messages":[{"role":"user","content":"What is the latest news on AI regulation?"}]}'.
  • Enable include_response: true in audit-logger to capture cited source URLs in the audit trail — essential for tracing AI-generated claims.
  • Perplexity uses OpenAI-compatible API — standard OpenAI SDKs work without modification.
  • Online models have variable latency depending on search complexity — set appropriate timeout_seconds.

For leaders

  • Perplexity's citation-backed responses provide verifiable AI outputs — critical for regulated industries where claims must be traceable.
  • Real-time search means responses reflect current information — reduces hallucination risk for time-sensitive queries.
  • Audit trail should capture full responses including citations (include_response: true) for compliance evidence.
  • Online model latency is less predictable than offline models — set timeout and fallback strategies accordingly.

Next steps