Hot-Reloading Policy Configs Without Downtime
Keeptrusts hot-reloads policy configs without downtime by letting a managed gateway load a new validated config while in-flight requests finish under the old one. The operational path is explicit: install the managed gateway, start it, bump the config version, lint it, reload it with kt gateway reload, and revert if activation fails or the new policy behaves badly.
Use this page when
- You need to change policy behavior or provider settings on a live gateway without restarting the process.
- You want the documented reload and revert workflow, not an informal “edit the YAML and hope” process.
- You need a safer alternative to maintenance windows for routine policy changes.
Primary audience
- Primary: Technical Engineers
- Secondary: DevOps and SRE teams
The problem
Configuration changes are operationally awkward when the only deployment tool is a restart. Even a brief restart creates a visible interruption, and the interruption is least welcome when the change is urgent.
Policy changes also deserve better history than “someone edited a file on the server.” When you are tuning prompt-injection behavior, adding PII redaction, or tightening provider-routing rules, you need a clean answer to three questions: what changed, when did it activate, and how do we go back?
Without hot reload, those questions get mixed together. Process restarts become change deployment, activation verification, and rollback all at once. That is harder to reason about and harder to automate safely.
The solution
Keeptrusts documents a managed gateway workflow that separates those steps.
The managed instance keeps track of desired and observed config versions in local supervisor state. A reload asks the running gateway to load the new config over its admin endpoint. Requests already in flight continue on the old config. New requests switch only after activation succeeds. If activation fails, the previous known-good config remains available for revert.
That model changes the operational question from “how fast can we restart?” to “is the next config valid, observable, and reversible?” That is a much better question for production systems.
Implementation
Start with a baseline managed gateway config and a versioned pack block:
pack:
  name: local-dev
  version: 0.1.0
  enabled: true
providers:
  targets:
    - id: openai-primary
      provider: openai
      model: gpt-5.4-mini
      base_url: https://api.openai.com
      secret_key_ref:
        env: OPENAI_API_KEY
policies:
  chain:
    - prompt-injection
    - audit-logger
  policy:
    prompt-injection:
      response:
        action: block
        message: "Request blocked: potential prompt injection detected"
    audit-logger: {}
Install and start the managed instance:
kt gateway install \
  --name local-dev \
  --listen 0.0.0.0:41002 \
  --policy-config policy-config.yaml
kt gateway start --name local-dev
kt gateway status --name local-dev
When you need a change, edit the config and bump pack.version. For example, add pii-detector and move from 0.1.0 to 0.1.1. Then validate first and reload second:
kt policy lint --file policy-config.yaml
kt gateway reload \
  --name local-dev \
  --gateway-url http://localhost:41002 \
  --config-path policy-config.yaml
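The lint-then-reload ordering is easy to skip under pressure, so it can help to wrap both commands in one function. The following is a minimal sketch, not an official helper: the function name lint_then_reload is hypothetical, and it assumes that kt policy lint exits non-zero when the config is invalid.

```shell
# Sketch: refuse to reload unless lint passes.
# Assumption: `kt policy lint` exits non-zero on an invalid config.
lint_then_reload() {
  local name="$1" gateway_url="$2" config_path="$3"

  # Gate: stop here if the config does not lint cleanly.
  if ! kt policy lint --file "$config_path"; then
    echo "lint failed; not reloading $name" >&2
    return 1
  fi

  # Only reached when lint succeeded.
  kt gateway reload \
    --name "$name" \
    --gateway-url "$gateway_url" \
    --config-path "$config_path"
}
```

With the function sourced, a call such as lint_then_reload local-dev http://localhost:41002 policy-config.yaml runs both steps and stops before the reload if lint reports a problem.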
Two details from the docs are easy to miss and worth enforcing in review.
The first is that --config-path must stay relative to the gateway working directory. Absolute paths and paths containing .. are rejected.
The second is that versioning matters. If you do not bump pack.version, the reload may still succeed, but the history no longer shows which change each reload activated.
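Both rules can be checked before the reload command runs. The sketch below is illustrative only: the function names are hypothetical, the path check takes "containing .." literally and so is stricter than a per-component check, and the version helpers assume an indented top-level pack block with an unquoted numeric MAJOR.MINOR.PATCH version.

```shell
# Reject empty paths, absolute paths, and any path containing "..".
is_safe_config_path() {
  case "$1" in
    ""|/*) return 1 ;;
    *..*)  return 1 ;;
  esac
  return 0
}

# Print the `version:` value found inside the top-level `pack:` block.
# Assumption: keys under `pack:` are indented and the value is unquoted.
pack_version() {
  awk '
    /^pack:/                    { in_pack = 1; next }
    /^[^[:space:]#]/            { in_pack = 0 }
    in_pack && $1 == "version:" { print $2; exit }
  ' "$1"
}

# Succeed when NEW is strictly greater than OLD.
# Usage: version_gt NEW OLD   (numeric MAJOR.MINOR.PATCH only)
version_gt() {
  local IFS=.
  # Split both versions on "." into six positional fields.
  set -- $1 $2
  [ "$1" -gt "$4" ] && return 0
  [ "$1" -lt "$4" ] && return 1
  [ "$2" -gt "$5" ] && return 0
  [ "$2" -lt "$5" ] && return 1
  [ "$3" -gt "$6" ]
}
```

One way to use these in review or CI is to keep a copy of the last reloaded file and require version_gt "$(pack_version policy-config.yaml)" "$(pack_version last-reloaded.yaml)" to succeed before reloading; the name last-reloaded.yaml is a placeholder for wherever that copy lives.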
After the reload, inspect the recorded operation history:
kt gateway status --name local-dev --history-json --history-action reload
If the new config misbehaves or activation fails, the documented rollback path is straightforward:
kt gateway revert \
  --name local-dev \
  --gateway-url http://localhost:41002
That is the practical zero-downtime loop: lint, reload, observe, revert if necessary. It is also a better review model because the change is now explicit and versioned instead of being hidden inside a process restart.
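The whole loop can be sketched as a single function. This is an illustration under stated assumptions rather than a documented Keeptrusts tool: reload_with_revert is a hypothetical name, the fourth argument is a caller-supplied check command standing in for the "observe" step, and the kt subcommands are assumed to exit non-zero on failure.

```shell
# Sketch of the loop: lint, reload, observe, revert if necessary.
# Assumptions: kt subcommands exit non-zero on failure; check_cmd is a
# caller-supplied command or function that returns 0 when the new
# policy behaves as expected.
reload_with_revert() {
  local name="$1" gateway_url="$2" config_path="$3" check_cmd="$4"

  # Never attempt a reload with a config that fails lint.
  kt policy lint --file "$config_path" || return 1

  # Reload, then run the post-reload check only if the reload succeeded.
  if kt gateway reload \
       --name "$name" \
       --gateway-url "$gateway_url" \
       --config-path "$config_path" \
     && "$check_cmd"; then
    return 0
  fi

  # Either activation failed or the check rejected the new behavior.
  echo "reload or post-reload check failed; reverting $name" >&2
  kt gateway revert --name "$name" --gateway-url "$gateway_url"
  return 1
}
```

The check is deliberately left to the caller, because what counts as "behaves badly" depends on the policy being changed. A failed lint returns early without a revert, since nothing was activated.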
This is not a substitute for rollout discipline. You should still follow the guidance in Managing Policy Changes and Policy Testing in CI/CD. Hot reload removes the maintenance window problem. It does not remove the need to validate the rule itself.
Results and impact
The first result is fewer operational interruptions. Routine policy changes no longer need a full process bounce just to get a new config live.
The second result is better auditability of changes. Reload and revert history make it possible to answer when a config changed and what version was active during an incident window.
The third result is faster recovery when a change goes wrong. Revert becomes an explicit operation rather than an improvised rollback.
There is also a process benefit: once teams adopt lint-gated reloads, policy changes start looking more like controlled releases and less like emergency edits. That usually improves both change quality and reviewer confidence.
Key takeaways
- Use kt gateway reload, not restarts, for managed gateway config changes.
- Always lint before reload and bump pack.version for readable history.
- In-flight requests continue under the old config while the new one activates.
- kt gateway status --history-json gives you an audit trail of reload and revert operations.
- Hot reload reduces downtime, but it still needs normal rollout discipline.