Rollback Procedures¶
IEC 62443-4-1 SUM-4 requires documented rollback procedures for all deployment artifacts.
Kubernetes Deployments (Helm)¶
Chart Rollback¶
```bash
# List release history
helm history rune -n rune

# Roll back to a previous revision
helm rollback rune <revision> -n rune

# Verify the rollback
kubectl get pods -n rune -w
helm status rune -n rune
```
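When scripting a rollback, the target revision can be derived from `helm history -o json` instead of being chosen by hand. A minimal sketch, assuming Helm 3's JSON history format; the sample history below is illustrative, not taken from a real cluster:

```python
import json

def previous_good_revision(history_json: str) -> int:
    """Given `helm history <release> -o json` output, return the most
    recent successfully superseded revision before the current one."""
    entries = json.loads(history_json)
    deployed = [e["revision"] for e in entries if e["status"] == "deployed"]
    if not deployed:
        raise ValueError("no currently deployed revision found")
    current = max(deployed)
    candidates = [e["revision"] for e in entries
                  if e["status"] == "superseded" and e["revision"] < current]
    if not candidates:
        raise ValueError("no earlier successful revision to roll back to")
    return max(candidates)

# Hypothetical history for the `rune` release.
sample = json.dumps([
    {"revision": 1, "status": "superseded", "chart": "rune-0.1.0"},
    {"revision": 2, "status": "superseded", "chart": "rune-0.2.0"},
    {"revision": 3, "status": "deployed",  "chart": "rune-0.3.0"},
])
print(previous_good_revision(sample))  # 2
```

The returned revision number can then be passed to `helm rollback rune <revision> -n rune`.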
Operator Rollback¶
```bash
helm rollback rune-operator <revision> -n rune-system
kubectl get pods -n rune-system -w
```
UI Rollback¶
```bash
helm rollback rune-ui <revision> -n rune
```
Container Image Rollback¶
If a new image introduces a regression:
```bash
# Pin to the previous known-good tag
helm upgrade rune rune/rune \
  --set image.tag=v0.0.0a4 \
  -n rune

# Verify pods are running the correct image
kubectl get pods -n rune -o jsonpath='{.items[*].spec.containers[*].image}'
```
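The jsonpath check above can be automated for use in a rollback script. A small sketch that parses `kubectl get pods -o json` output; the `ghcr.io/example/rune` image reference and the sample payload are placeholders, not the real RUNE registry path:

```python
import json

def images_match(pods_json: str, expected_image: str) -> bool:
    """Check every container in `kubectl get pods -o json` output
    against the expected image reference."""
    pods = json.loads(pods_json)["items"]
    images = [c["image"] for p in pods for c in p["spec"]["containers"]]
    return bool(images) and all(img == expected_image for img in images)

# Illustrative pod list, as kubectl would emit it.
sample = json.dumps({"items": [
    {"spec": {"containers": [{"image": "ghcr.io/example/rune:v0.0.0a4"}]}},
    {"spec": {"containers": [{"image": "ghcr.io/example/rune:v0.0.0a4"}]}},
]})
print(images_match(sample, "ghcr.io/example/rune:v0.0.0a4"))  # True
```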
Database Schema Rollback¶
RUNE uses SQLite for job persistence. The database file is stored in a persistent volume.
Pre-upgrade Backup¶
Before any upgrade that modifies the SQLite schema:
```bash
# Back up the database
kubectl exec -n rune deploy/rune-api -- cp /data/jobs.db /data/jobs.db.bak

# Verify the backup
kubectl exec -n rune deploy/rune-api -- ls -la /data/jobs.db.bak
```
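Note that a plain `cp` is only guaranteed consistent when no writers are active; SQLite's online backup API produces a consistent copy even while the database is in use. A minimal sketch using the stdlib `sqlite3` module (the `jobs` table schema here is hypothetical, for illustration only):

```python
import os
import sqlite3
import tempfile

def backup_sqlite(src_path: str, dst_path: str) -> None:
    """Copy a SQLite database via the online backup API, which stays
    consistent even if another connection is writing."""
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    src.backup(dst)
    src.close()
    dst.close()

# Demo with a throwaway database.
tmp = tempfile.mkdtemp()
db = os.path.join(tmp, "jobs.db")
bak = os.path.join(tmp, "jobs.db.bak")
conn = sqlite3.connect(db)
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO jobs (name) VALUES ('nightly-bench')")
conn.commit()
backup_sqlite(db, bak)
rows = sqlite3.connect(bak).execute("SELECT name FROM jobs").fetchall()
print(rows)  # [('nightly-bench',)]
conn.close()
```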
Restore from Backup¶
```bash
kubectl exec -n rune deploy/rune-api -- cp /data/jobs.db.bak /data/jobs.db
kubectl rollout restart deploy/rune-api -n rune
```
Release Rollback (PyPI)¶
PyPI does not support true rollback — only yanking. If a bad release is published:
- Yank the bad version from the project's release management page on pypi.org (twine does not provide a `yank` command).
- Users can pin to the previous known-good version:

```bash
pip install rune-bench==<good-version>
```
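Whether a version has been yanked can also be checked programmatically. A sketch assuming PyPI's `/pypi/<project>/json` response shape, where each release file carries a `yanked` flag; the sample payload below is illustrative, not real rune-bench metadata:

```python
import json

def yanked_versions(metadata_json: str) -> set:
    """Given PyPI project JSON metadata, return the versions whose
    files are all marked as yanked."""
    releases = json.loads(metadata_json)["releases"]
    return {version for version, files in releases.items()
            if files and all(f.get("yanked", False) for f in files)}

# Illustrative metadata for a project with one yanked release.
sample = json.dumps({"releases": {
    "1.0.0": [{"yanked": False}],
    "1.0.1": [{"yanked": True, "yanked_reason": "regression"}],
}})
print(sorted(yanked_versions(sample)))  # ['1.0.1']
```

Pip skips yanked files by default unless the version is pinned exactly, which is why users pinning the known-good version are unaffected.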
Air-Gapped Environment Rollback¶
See rune-airgapped upgrade/rollback scripts (Issue #13).
For air-gapped clusters:
- Keep the previous OCI bundle tarball on the bootstrap node.
- Re-run bootstrap.sh with the previous bundle.
- Helm releases are rolled back automatically when the previous chart versions are applied.
Emergency Shutdown¶
If a rollback is insufficient and the system must be stopped immediately:
```bash
# Suspend all scheduled benchmarks
kubectl patch runebenchmark --all -n rune --type merge -p '{"spec":{"suspend":true}}'

# Scale down all RUNE workloads
kubectl scale deploy --all -n rune --replicas=0
kubectl scale deploy --all -n rune-system --replicas=0
```
Verification After Rollback¶
After any rollback, run the health check script:
```bash
./scripts/health-check.sh --namespace rune
```
Confirm:
- All pods are Running with zero recent restarts
- /healthz endpoints return 200 OK
- Job history is intact (no data loss)
- Scheduled benchmarks resume correctly (if unsuspended)
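The first checklist item can be verified from `kubectl get pods -o json` output. A minimal sketch (the payload below is illustrative; it checks total restart count rather than a "recent" window, which would require comparing timestamps):

```python
import json

def pods_healthy(pods_json: str) -> bool:
    """Check `kubectl get pods -o json` output: every pod is Running
    and no container has restarted."""
    pods = json.loads(pods_json)["items"]
    for pod in pods:
        if pod["status"]["phase"] != "Running":
            return False
        for c in pod["status"].get("containerStatuses", []):
            if c["restartCount"] != 0:
                return False
    return bool(pods)

# Illustrative pod list with one healthy pod.
sample = json.dumps({"items": [
    {"status": {"phase": "Running",
                "containerStatuses": [{"restartCount": 0}]}},
]})
print(pods_healthy(sample))  # True
```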