Shell Command Risk Classification

When a task on a hosted gateway requests to execute a shell command, Keeptrusts classifies the command into one of four risk levels before deciding whether to proceed, require approval, or deny execution outright.

Use this page when

You need to understand how the gateway classifies shell commands into risk levels (safe, moderate, destructive, critical).
You are tuning classification rules to reduce false positives for your workflow.
You want to add custom classification patterns for domain-specific tooling.

Primary audience

Primary: Technical Engineers
Secondary: AI Agents, Technical Leaders

Risk Levels

Safe

Safe commands read state without modification. They execute immediately without approval.

Examples:

git status, git log, git diff
ls, find, cat, head, tail, wc
echo, date, whoami, pwd
cargo check, npm run typecheck
grep, rg, ag (search tools)

Safe commands produce no side effects on the file system, network, or system state.

Moderate

Moderate commands may modify local state but the changes are recoverable or scoped to the task workspace.

Examples:

git add, git commit (local only)
mkdir, touch, cp (within allowed roots)
npm install (within project directory)
cargo build, make

Whether moderate commands require approval depends on your approval_required_risk_levels configuration. By default, they execute without approval.

Destructive

Destructive commands delete data, overwrite files, change permissions, mutate network state, or interact with credentials. They require approval by default.

Categories of destructive commands:

Category	Examples
Delete operations	`rm`, `rm -rf`, `rmdir`, `shred`
Overwrite operations	`mv` (overwriting target), `dd`, `truncate`
Permission changes	`chmod`, `chown`, `chgrp`, `setfacl`
Package installation	`apt install`, `brew install`, `pip install`
Network mutation	`iptables`, `ufw`, `curl -X POST/PUT/DELETE`
Credential operations	`ssh-keygen`, `gpg --delete-key`, `passwd`
Database mutation	`psql -c "DROP"`, `mysql -e "DELETE"`, `redis-cli FLUSHALL`
Process control	`kill`, `killall`, `pkill`, `systemctl stop`

Critical

Critical commands affect system-level resources, can compromise host integrity, or have irreversible consequences at scale.

Examples:

rm -rf / or broad recursive deletions
mkfs, fdisk, mount (filesystem operations)
reboot, shutdown, init
docker rm, docker system prune
iptables -F (flush all firewall rules)
Anything running as root or with sudo

Critical commands always require approval regardless of configuration.

Classification Process

When a task requests command execution, the gateway classifies the command through this sequence:

Parse command → extract binary name and arguments
Check blocked_commands → DENY if matched (no approval possible)
Check allowed_commands → DENY if configured and not matched (no approval possible)
Classify risk level → safe / moderate / destructive / critical
Check approval_required_risk_levels → EXECUTE, or PAUSE for approval
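
The sequence above can be sketched as a single decision function. This is a minimal illustration, not the gateway's implementation: the function name `decide`, the outcome labels, the stand-in classifier `toy_classify`, and the choice to match lists on the executable name only are all assumptions made for the example.

```python
import shlex
from typing import Callable, Optional, Set

# Hypothetical outcome labels for this sketch; not Keeptrusts identifiers.
DENY, EXECUTE, PAUSE = "deny", "execute", "pause"


def decide(
    command: str,
    blocked: Set[str],
    allowed: Optional[Set[str]],
    classify: Callable[[str], str],
    approval_levels: Set[str],
) -> str:
    """Walk the five steps in order and return an outcome label."""
    # Step 1: parse. Assumption: lists are matched on the executable name only.
    binary = shlex.split(command)[0]

    # Step 2: blocked commands are denied with no approval path.
    if binary in blocked:
        return DENY

    # Step 3: if an allowlist is configured, anything outside it is denied.
    if allowed is not None and binary not in allowed:
        return DENY

    # Step 4: classify into safe / moderate / destructive / critical.
    level = classify(command)

    # Step 5: critical always pauses; other levels follow configuration.
    if level == "critical" or level in approval_levels:
        return PAUSE
    return EXECUTE


def toy_classify(command: str) -> str:
    """Stand-in classifier for the demo; the real rules are built in."""
    return "destructive" if command.startswith("rm ") else "safe"
```

Passing `allowed=None` models an unconfigured allowlist, so step 3 is skipped entirely. Note that both deny checks run before classification, which is why approval can never unblock a blocked command.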

Step 1: Parse Command

The gateway parses the command string to extract the executable name, arguments, flags, and any pipe chains. Each segment of a piped command is classified independently, and the overall command inherits the highest risk level.
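
As a rough sketch of what segment extraction might look like, the example below uses Python's standard `shlex` lexer to split a command line on pipe and chaining operators. The function name `split_segments` and the separator set are assumptions for illustration; the gateway's actual parser is not described on this page.

```python
import shlex
from typing import List

# Operators that end one segment and start the next (assumed set).
SEPARATORS = {"|", "||", "&&", ";"}


def split_segments(command: str) -> List[List[str]]:
    """Split a command line into argv lists, one per pipe/chain segment."""
    # punctuation_chars=True makes operators such as | and && separate tokens.
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True  # keep flags and paths as whole tokens (Python 3.8+)
    lexer.commenters = ""          # do not treat '#' as a comment start

    segments: List[List[str]] = []
    current: List[str] = []
    for token in lexer:
        if token in SEPARATORS:
            if current:
                segments.append(current)
            current = []
        else:
            current.append(token)
    if current:
        segments.append(current)
    return segments
```

One limitation of this sketch: a quoted operator such as `"|"` loses its quotes during lexing and would be mistaken for a real pipe, so a production parser would need to track quoting explicitly.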

Step 2: Check Blocked Commands

If the command matches any entry in blocked_commands, execution is denied immediately. No approval flow is triggered — the command simply cannot run.

Step 3: Check Allowed Commands

If allowed_commands is configured and the command does not match any entry, execution is denied immediately. The allowed list acts as a positive-security allowlist.

Step 4: Classify Risk Level

The gateway analyzes the command against built-in classification rules. Classification considers:

The executable name and its known behavior
Flags and arguments that modify behavior (e.g., -f for force, -r for recursive)
Target paths (system paths increase risk)
Piped destinations (piping to rm elevates risk)
Compound commands (;, &&, || chains inherit the highest risk)
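
To make these factors concrete, here is a toy classifier for a single, already-tokenized command segment. Everything in it is hypothetical: the lookup table is a small sample drawn from the examples on this page, the list of system paths is assumed, and the escalation rules (including treating unknown executables as destructive) are invented for illustration. The built-in rules are more extensive and are not documented here.

```python
from typing import List

ORDER = ["safe", "moderate", "destructive", "critical"]

# Small illustrative table built from examples on this page (not the real rule set).
BASE_LEVEL = {
    "ls": "safe", "cat": "safe", "grep": "safe",
    "mkdir": "moderate", "cp": "moderate", "make": "moderate",
    "rm": "destructive", "chmod": "destructive", "kill": "destructive",
    "mkfs": "critical", "reboot": "critical", "shutdown": "critical",
}
SYSTEM_ROOTS = ("/etc", "/usr", "/var", "/boot")  # assumed list of system paths


def bump(level: str) -> str:
    """Raise a level by one step, capped at critical."""
    return ORDER[min(ORDER.index(level) + 1, len(ORDER) - 1)]


def is_system_path(arg: str) -> bool:
    return arg == "/" or any(arg == r or arg.startswith(r + "/") for r in SYSTEM_ROOTS)


def classify_segment(argv: List[str]) -> str:
    """Classify one command segment from its executable, flags, and targets."""
    if not argv:
        raise ValueError("empty command segment")
    if argv[0] == "sudo":
        return "critical"  # this page lists anything run with sudo as critical
    # Assumption: unknown executables are treated conservatively.
    level = BASE_LEVEL.get(argv[0], "destructive")
    args = argv[1:]
    short_flags = {c for a in args
                   if a.startswith("-") and not a.startswith("--")
                   for c in a[1:]}
    forceful = bool(short_flags & {"f", "r", "R"})
    targets = [a for a in args if not a.startswith("-")]
    if level == "moderate" and forceful:
        level = bump(level)  # e.g. cp -f may overwrite existing files
    if level != "safe" and any(is_system_path(t) for t in targets):
        level = bump(level)  # system paths increase risk
    return level
```

Under these assumed rules, `rm file.txt` stays destructive while `rm -rf /` escalates to critical, and `grep -r` stays safe because read-only commands are never escalated by flags.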

Step 5: Check Approval Requirements

The gateway checks whether the classified risk level appears in approval_required_risk_levels. By default, destructive and critical require approval. You can customize this:

hosted_gateway:
  shell:
    approval_required_risk_levels:
      - destructive
      - critical
      # Add "moderate" here if you want approval for moderate commands too
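
The interaction between this setting and the always-on rule for critical commands can be sketched as follows. The function name `effective_approval_levels` is hypothetical, and the config is assumed to be already parsed into a nested dictionary that mirrors the YAML above.

```python
from typing import Any, Dict, Set

# Default stated on this page: destructive and critical require approval.
DEFAULT_LEVELS = ("destructive", "critical")


def effective_approval_levels(config: Dict[str, Any]) -> Set[str]:
    """Return the risk levels that pause for approval under a parsed config."""
    shell = config.get("hosted_gateway", {}).get("shell", {})
    configured = shell.get("approval_required_risk_levels", DEFAULT_LEVELS)
    # Critical always requires approval, even if the config omits it.
    return set(configured) | {"critical"}
```

For example, a config that lists only `moderate` yields `{"moderate", "critical"}` in this sketch: critical is added back because it cannot be switched off.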

Classification Examples

Command	Risk Level	Reason
`git status`	safe	Read-only state query
`ls -la /srv/project`	safe	Directory listing
`cat README.md`	safe	File read
`npm run build`	moderate	Modifies local build output
`git commit -m "fix"`	moderate	Local repository mutation
`rm -rf node_modules`	destructive	Recursive deletion
`chmod 777 deploy.sh`	destructive	Permission change
`pip install requests`	destructive	System package installation
`curl -X DELETE https://api.example.com/resource`	destructive	Network mutation
`docker system prune -af`	critical	Irreversible system cleanup
`sudo rm -rf /var/log`	critical	Privileged recursive deletion

Compound Commands

When a command uses pipes, semicolons, or logical operators, each segment is classified independently:

# Overall: destructive (highest of safe + destructive)
git log --oneline | head -5 && rm -rf tmp/

# Overall: safe (all segments are safe)
grep -r "TODO" src/ | wc -l

# Overall: critical (sudo elevates to critical)
npm run build && sudo systemctl restart app
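
The "highest level wins" rule amounts to taking a maximum over an ordered scale. A minimal sketch, assuming each segment has already been classified (the function name `overall_risk` is hypothetical):

```python
from typing import Iterable

# Position in this mapping defines severity, lowest to highest.
SEVERITY = {"safe": 0, "moderate": 1, "destructive": 2, "critical": 3}


def overall_risk(segment_levels: Iterable[str]) -> str:
    """Return the most severe level among the classified segments."""
    return max(segment_levels, key=SEVERITY.__getitem__)
```

Applied to the three commands above, the per-segment levels `["safe", "safe", "destructive"]`, `["safe", "safe"]`, and `["moderate", "critical"]` reduce to destructive, safe, and critical respectively.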

Customizing Classification

You influence classification through configuration rather than modifying built-in rules:

Allowed commands — only permit commands you expect tasks to need
Blocked commands — explicitly deny commands that should never run
Approval levels — choose which risk levels require human approval
Command timeout — limit how long any command can run

The built-in classification rules are not user-editable. They represent a security baseline that your allow/deny lists and approval policies build upon.

What Happens After Classification

No approval required (safe and moderate by default): Command executes immediately. Result is returned to the task. Audit event is emitted.
Approval required (destructive and critical by default): Task pauses. A pending action is created. An approval card appears in chat. The command waits for explicit user approval before executing.
Denied (blocked or not allowed): Command never executes. Task receives a denial response. Audit event records the denial with the reason.

Security Considerations

Classification runs on the gateway, not in the task or agent context
The gateway never executes a command before classification completes
Blocked commands cannot be circumvented through approval — they are unconditionally denied
Command arguments are analyzed, not just the binary name — rm file.txt and rm -rf / receive different classifications
Shell expansion and variable substitution are evaluated before classification where possible
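
To illustrate why expansion matters, the sketch below substitutes known variables into a command string before it would be classified, so that a harmless-looking `rm -rf $TARGET` is judged by its real target. This is an assumption-laden illustration using Python's `string.Template`; the function name `expand_variables` is hypothetical and the gateway's actual expansion logic is not described on this page.

```python
from string import Template
from typing import Mapping


def expand_variables(command: str, env: Mapping[str, str]) -> str:
    """Substitute $VAR and ${VAR} from env; leave unknown variables untouched."""
    # safe_substitute never raises on missing names, which matches the
    # "where possible" behavior: unresolved variables stay as written.
    return Template(command).safe_substitute(env)
```

This sketch ignores shell quoting rules (single-quoted text would not be expanded by a real shell) and treats `$$` as an escaped dollar sign, so it only approximates what a shell would do.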

For AI systems

Canonical terms: Keeptrusts, shell command risk classification, safe, moderate, destructive, critical, risk levels, pipe chain analysis, hosted gateway.
Exact feature/config names: risk levels (safe, moderate, destructive, critical), hosted_gateway.shell.approval_required_risk_levels, allowed_commands, blocked_commands, command_timeout_seconds.
Best next pages: Shell Allow and Deny Lists, Destructive Action Approval, Pending Action Lifecycle.

For engineers

Four risk levels: safe (read-only, executes immediately), moderate (recoverable local changes), destructive (deletes/overwrites/permissions), critical (system-level, irreversible).
Classification order: parse command → check blocked_commands → check allowed_commands → classify risk → check approval requirements.
Each segment of a pipe chain or compound command (|, &&, ||, ;) is classified independently, and the overall command inherits the highest risk level.
Arguments and flags are analyzed — rm file.txt differs from rm -rf /; -f (force) and -r (recursive) elevate risk.
You cannot modify built-in classification rules — use allow/deny lists and approval levels to tune behavior.

For leaders

Risk classification runs on the gateway before execution — agents cannot bypass or override the classification engine.
The four-level model provides granular control: allow safe commands to run unimpeded while ensuring destructive/critical operations require human review.
Critical commands (system-level, sudo, filesystem formatting) always require approval regardless of configuration — this is a non-overridable safety baseline.
Combined with allow/deny lists, classification provides defense in depth: blocked commands never reach classification, and classified commands still require appropriate approval.

Next steps

Shell Allow and Deny Lists — pre-filter commands before classification
Destructive Action Approval — the approval flow for classified destructive/critical commands
Pending Action Lifecycle — what happens after a command pauses for approval
