# zuzu-system-backup

Python-based backup tool for Linux hosts that pushes snapshots and live mirrors to a remote NAS over SSH/rsync.
- Snapshots are compressed archives (tar + optional zstd) created on the host, then uploaded.
- Mirrors are "single-copy" rsync trees kept in sync with `--delete`.
- Configuration is per-host, in YAML.
- Scheduling is done via systemd timers.

All hostnames, users, and paths in this README use fictional example values, so it is safe to publish as-is. Replace them with your own.
## 1. High-level architecture

### 1.1 Components

- Host script: `zuzu-system-backup.py`
  - Runs on each Linux host.
  - Reads per-host `backup.yaml`.
  - Builds local temp trees, compresses them, and uploads archives.
  - Manages retention (max snapshot count).
- Remote backup share (NAS / backup server):
  - Exposed via SSH (and optionally a mounted share).
  - Receives snapshot archives and sync mirrors.
### 1.2 Snapshot vs mirror

The script supports two complementary backup modes:

- Snapshots (versioned)
  - Controlled by: `system.include_paths`, `user.include_dirs`, `user.include_files`.
  - On each run:
    - rsync selected paths into a local temp tree per category.
    - compress each category into a single `*.tar` or `*.tar.zst` archive.
    - upload archives to a timestamped directory on the backup share.
- Single-copy mirrors (no history)
  - Controlled by `single_copy_mappings`.
  - Each mapping is `"LOCAL_SRC|REMOTE_DEST"`.
  - rsync with `--delete` keeps the remote tree in sync with local.
  - Intended for large data (databases, model stores, etc.).
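To make the mirror mapping format concrete, here is a hedged sketch of how a `"LOCAL_SRC|REMOTE_DEST"` entry could be turned into an rsync invocation. The function name and flag set are illustrative assumptions, not the script's actual API:

```python
# Illustrative sketch: parse one single_copy_mappings entry and build an
# rsync --delete command. build_mirror_command is a hypothetical name.
import shlex

def build_mirror_command(mapping: str, ssh_user: str, ssh_host: str) -> list[str]:
    """Split a "LOCAL_SRC|REMOTE_DEST" mapping into an rsync command."""
    local_src, remote_dest = mapping.split("|", 1)
    # Trailing slash on the source: sync the *contents* of the directory.
    return [
        "rsync", "-a", "--delete",
        local_src.rstrip("/") + "/",
        f"{ssh_user}@{ssh_host}:{remote_dest}",
    ]

cmd = build_mirror_command(
    "/srv/data/postgres|/srv/backup/automated/sync/system-orion-postgres",
    "backupuser", "backup-nas.local",
)
print(shlex.join(cmd))
```

The `|` separator keeps each mapping a single YAML string while still being unambiguous, since `|` is not legal in the paths being mirrored here.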
## 2. Data-flow diagrams

### 2.1 Backup flow on a single host

```mermaid
flowchart LR
    subgraph Host["Linux host (example: orion.example.net)"]
        A[Snapshot sources\nsystem + user] --> B[Local temp trees\nsystem / user-dirs / user-files]
        B --> C[Tar plus optional zstd\ncreate archives]
        D[Mirror sources\nsingle_copy_mappings] --> E[rsync --delete]
    end
    subgraph NAS["Backup share (example: backup-nas.local)"]
        C --> F[Snapshots directory\n/daily/system-orion/<timestamp>/]
        E --> G[Sync mirrors\n/sync/system-orion-*]
    end
```
- Snapshots: archives are compressed locally and only the archives are sent.
- Mirrors: rsync goes directly from host paths to remote paths, with `--delete`.
### 2.2 Control flow inside the script

```mermaid
flowchart TD
    Start([systemd timer or manual run])
        --> LCFG[Load backup.yaml]
        --> PREP[Compute paths and compression root]
        --> SNAPROOT[Ensure remote snapshots root exists]
        --> SNAPNAME[Generate snapshot name\nYYYY-MM-DD_HH-MM-SS]
        --> MKDIR_REMOTE[Create remote snapshot directory]
    MKDIR_REMOTE --> TMPDIR[Create local temp root]
    TMPDIR --> RSYNC_SYS[rsync system.include_paths\ninto temp/system]
    TMPDIR --> RSYNC_UDIR[rsync user.include_dirs\ninto temp/user-dirs]
    TMPDIR --> RSYNC_UFILE[rsync user.include_files\ninto temp/user-files]
    RSYNC_SYS --> COMP_SYS[Compress system category\ncreate system archive]
    RSYNC_UDIR --> COMP_UDIR[Compress user-dirs category\ncreate user-dirs archive]
    RSYNC_UFILE --> COMP_UFILE[Compress user-files category\ncreate user-files archive]
    COMP_SYS --> UPLOAD_SYS[Upload archives to remote snapshot directory]
    COMP_UDIR --> UPLOAD_UDIR
    COMP_UFILE --> UPLOAD_UFILE
    UPLOAD_UFILE --> MIRRORS[Process mirrors\nsingle_copy_mappings with rsync --delete]
    MIRRORS --> PRUNE[Prune old snapshots\nkeep at most N]
    PRUNE --> CLEANUP[Remove temp tree and excludes file]
    CLEANUP --> Done([Exit])
```
## 3. Backup share layout

This project assumes a structured backup share on the NAS or backup server, for example:

```text
/srv/backup/
    automated/
        daily/
            system-orion/
                2025-01-01_03-00-01/
                    system.tar.zst
                    user-dirs.tar.zst
                    user-files.tar.zst
                2025-01-02_03-00-01/
                    ...
            system-pegasus/
            system-hera/
        sync/
            system-orion-databases/
                ... live rsync mirror ...
            system-orion-models/
                ...
            system-pegasus-luns/
                ...
    manual/
        installer-isos/
        http-bookmarks/
        license-keys/
        ...
```
Typical patterns:

- `automated/daily/system-<host>/`: date-based snapshot directories, each containing only a few archives.
- `automated/sync/system-<host>-<label>/`: one directory per mirror, updated in place.
- `manual/`: hand-managed backups, not touched by this tool.

You can adapt the root (`/srv/backup`) and naming (`system-orion`, etc.) to match your environment.
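The timestamped directory names shown above can be generated with a one-line `strftime` call; this sketch uses an illustrative function name, but the format string matches the `YYYY-MM-DD_HH-MM-SS` names used throughout this README:

```python
# Sketch: generate a sortable snapshot directory name like 2025-01-02_03-00-01.
# snapshot_name is an illustrative helper name, not the script's real API.
from datetime import datetime

def snapshot_name(now=None):
    """Return a snapshot directory name for the given (or current) time."""
    now = now or datetime.now()
    return now.strftime("%Y-%m-%d_%H-%M-%S")

print(snapshot_name(datetime(2025, 1, 2, 3, 0, 1)))  # 2025-01-02_03-00-01
```

Because every field is zero-padded, lexical sort order equals chronological order, which is exactly what count-based retention (section 8) relies on.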
## 4. Features

- Python 3 script, no shell gymnastics.
- YAML configuration, per-host.
- Snapshot categories:
  - `system.include_paths` (root-owned config and service dirs)
  - `user.include_dirs` (full user trees, relative to `user.home`)
  - `user.include_files` (one-off important files)
- Compression modes:
  - `high` (zstd -19)
  - `light` (zstd -3)
  - `none` (plain `.tar`)
- Local compression only:
  - No dependency on the NAS having zstd or GNU tar.
- Single-copy mirrors:
  - Declarative `"LOCAL_SRC|REMOTE_DEST"` mappings.
- Retention:
  - Keep at most `retention.snapshots` snapshot directories per host.
- Systemd integration:
  - One-shot service + timer per host.
- Logging:
  - Structured, timestamped logs via `journalctl`.
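The three compression modes map naturally onto zstd level flags. The following is a hedged sketch of that mapping, assuming the script shells out to `zstd`; the helper name and exact flags are illustrative, not the script's real code:

```python
# Illustrative mapping of compression.mode to an archiving command.
# archive_command is a hypothetical helper name; -T0 (all cores) is an
# assumption about how the script might call zstd.
def archive_command(mode: str, tar_path: str) -> list[str]:
    """Build the command that compresses a category tarball, or [] for none."""
    if mode == "high":
        return ["zstd", "-19", "-T0", tar_path]  # best ratio, slow
    if mode == "light":
        return ["zstd", "-3", "-T0", tar_path]   # fast, decent ratio
    if mode == "none":
        return []                                # keep the plain .tar
    raise ValueError(f"unknown compression mode: {mode}")
```

Keeping the mode-to-flags mapping in one place makes it easy to audit what `high` actually costs in CPU time versus `light`.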
## 5. Dependencies

On each host:

- `python` (3.x)
- `python-yaml` (PyYAML)
- `rsync`
- `openssh`
- `tar`
- `zstd` (optional, but strongly recommended if using `compression.mode: high|light`)

Example (Arch-based host):

```shell
sudo pacman -S python python-yaml rsync openssh tar zstd
```
## 6. Configuration (backup.yaml)

Each host has its own `backup.yaml` next to the script.

### 6.1 Schema overview

- `remote`: where to send backups
- `retention`: how many snapshots to keep
- `compression`: how to compress snapshot archives
- `rsync`: extra rsync flags
- `system`: system-level include paths
- `user`: user home and per-user include paths/files
- `exclude_patterns`: rsync-style excludes
- `single_copy_mappings`: one-way mirrors (no history)
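A quick way to catch typos in this schema is to check the top-level keys right after `yaml.safe_load()`. This is a minimal sketch under the assumption that the keys above are the full schema; `check_config` is an illustrative name, not part of the actual script:

```python
# Sketch: validate the top-level keys of a parsed backup.yaml dict.
# The key sets mirror the schema overview above; check_config is hypothetical.
REQUIRED_KEYS = {"remote", "retention", "compression", "system", "user"}
OPTIONAL_KEYS = {"rsync", "exclude_patterns", "single_copy_mappings"}

def check_config(cfg: dict) -> list[str]:
    """Return a list of schema problems; an empty list means the config looks sane."""
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - cfg.keys())]
    unknown = cfg.keys() - REQUIRED_KEYS - OPTIONAL_KEYS
    problems += [f"unknown key: {k}" for k in sorted(unknown)]
    return problems
```

Failing fast on an unknown key (for example a misspelled `exclude_paterns`) is cheaper than silently backing up less than intended.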
### 6.2 Example backup.yaml (for host orion)

```yaml
remote:
  user: backupuser
  host: backup-nas.local
  port: 22
  key: /home/backupuser/.ssh/id_ed25519-orion
  base: /srv/backup/automated
  host_dir: system-orion

retention:
  # Max number of snapshot directories to keep on NAS
  snapshots: 7

compression:
  # high | light | none
  mode: high
  # Optional: where local temp trees and archives live
  path: /srv/tmp/backups

rsync:
  extra_opts:
    - --numeric-ids
    - --info=progress2
    - --protect-args

system:
  include_paths:
    - /etc/nftables.conf
    - /etc/snapper/configs
    - /etc/NetworkManager/system-connections
    - /etc/chromium/policies/managed
    - /etc/fstab
    - /etc/systemd/system/*.mount
    - /etc/systemd/system/*.automount
    - /etc/nut/nut.conf
    - /etc/nut/upsmon.conf

user:
  home: /home/devuser
  include_dirs:
    - .ssh
    - .gnupg
    - .local/share/wallpapers
    - projects
    - pkgbuilds
    - venvs
  include_files:
    - .config/chromium/Default/Preferences
    - .config/chromium/Default/Bookmarks
    - .config/vlc/vlcrc
    - .gitconfig
    - .bashrc
    - .bash_profile
    - .local/share/user-places.xbel

exclude_patterns:
  # Caches (generic)
  - "**/Cache/**"
  - "**/GPUCache/**"
  - "**/shadercache/**"
  - "**/ShaderCache/**"
  - "**/Code Cache/**"
  # SSH ControlMaster sockets
  - "${USER_HOME}/.ssh/ctl-*"
  - "**/.ssh/ctl-*"
  # JetBrains bulk (plugins + Toolbox app bundles)
  - "${USER_HOME}/.local/share/JetBrains/**/plugins/**"
  - "${USER_HOME}/.local/share/JetBrains/Toolbox/apps/**"
  - "${USER_HOME}/.cache/JetBrains/**"
  # Chromium bulk (we include only specific files above)
  - "${USER_HOME}/.config/chromium/**"

single_copy_mappings:
  # Example mirrors:
  - "/srv/data/postgres|/srv/backup/automated/sync/system-orion-postgres"
  - "/srv/data/models|/srv/backup/automated/sync/system-orion-models"
```
Notes:

- `user.include_dirs` and `user.include_files` are relative to `user.home` unless they start with `/`.
- `${USER_HOME}` and `${HOME}` in `exclude_patterns` are expanded to `user.home` by the script.
- `single_copy_mappings` paths are not expanded; use absolute paths.
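The placeholder expansion described in the notes amounts to a simple string substitution before the patterns are handed to rsync. A minimal sketch, assuming plain text replacement (the function name is illustrative):

```python
# Sketch: expand ${USER_HOME} / ${HOME} placeholders in exclude_patterns
# to the configured user.home. expand_patterns is a hypothetical name.
def expand_patterns(patterns: list[str], user_home: str) -> list[str]:
    """Replace home placeholders so patterns can be passed as rsync --exclude values."""
    out = []
    for pattern in patterns:
        pattern = pattern.replace("${USER_HOME}", user_home)
        pattern = pattern.replace("${HOME}", user_home)
        out.append(pattern)
    return out

print(expand_patterns(["${USER_HOME}/.cache/JetBrains/**"], "/home/devuser"))
# ['/home/devuser/.cache/JetBrains/**']
```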
## 7. Script usage

### 7.1 Manual run

From the directory where the script lives, or via its full path:

```shell
sudo /usr/local/sbin/zuzu-system-backup/zuzu-system-backup.py
```

(or whatever path you install it to)

Logs go to stderr; under systemd, they land in `journalctl`.
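Stderr logging with timestamps is easy to get from the standard library. A sketch of the kind of setup implied here; the exact format string is an assumption, not the script's actual configuration:

```python
# Sketch: timestamped, leveled logging to stderr, which journald captures
# when the script runs under systemd. The format string is illustrative.
import logging
import sys

logging.basicConfig(
    stream=sys.stderr,
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
logging.info("snapshot %s: starting system category", "2025-01-02_03-00-01")
```

Note that journald adds its own timestamps as well, so a bare `%(levelname)s %(message)s` format also works if you only ever read logs via `journalctl`.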
### 7.2 Systemd service & timer

Example service:

```ini
# /etc/systemd/system/host-backup.service
[Unit]
Description=Host backup to NAS via zuzu-system-backup
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
ExecStart=/usr/local/sbin/zuzu-system-backup/zuzu-system-backup.py
Nice=10
IOSchedulingClass=best-effort
IOSchedulingPriority=7
```

Example timer:

```ini
# /etc/systemd/system/host-backup.timer
[Unit]
Description=Nightly host backup to NAS

[Timer]
OnCalendar=*-*-* 03:15:00
RandomizedDelaySec=20min
Persistent=true

[Install]
WantedBy=timers.target
```

Enable and start:

```shell
sudo systemctl daemon-reload
sudo systemctl enable --now host-backup.timer
```

Check status:

```shell
systemctl list-timers 'host-backup*'
journalctl -u host-backup.service -n 50
```
## 8. Retention policy

Retention is implemented as "keep at most N snapshots" for each host:

- `retention.snapshots: 7` → keep the newest 7 snapshot directories under `REMOTE_BASE/host_dir/snapshots`.
- Snapshot directory names are timestamps: `YYYY-MM-DD_HH-MM-SS`.
- Older snapshot dirs are deleted entirely (`rm -rf` on the NAS via SSH).

No time math on mtime, just count-based retention by sorted timestamp name; simple and predictable.
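Count-based retention by sorted name can be sketched in a few lines; because the timestamp names sort lexically in chronological order, no date parsing is needed. The function name is illustrative, not the script's real API:

```python
# Sketch: pick which snapshot directories to delete under "keep at most N".
# Relies on YYYY-MM-DD_HH-MM-SS names sorting lexically == chronologically.
# prune_candidates is a hypothetical helper name.
def prune_candidates(snapshot_names: list[str], keep: int) -> list[str]:
    """Return the snapshot directory names that fall outside the newest `keep`."""
    newest_first = sorted(snapshot_names, reverse=True)
    return newest_first[keep:]

names = ["2025-01-01_03-00-01", "2025-01-03_03-00-02", "2025-01-02_03-00-01"]
print(prune_candidates(names, keep=2))  # ['2025-01-01_03-00-01']
```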
## 9. Restore basics

### 9.1 Restoring from a snapshot archive

On the NAS or after copying archives locally:

```shell
# Example: restore system snapshot for 2025-01-02 from host "orion"
cd /restore/target

# If compressed
zstd -d /srv/backup/automated/daily/system-orion/2025-01-02_03-00-01/system.tar.zst -o system.tar
tar -xf system.tar

# If compression.mode was "none"
tar -xf /srv/backup/automated/daily/system-orion/2025-01-02_03-00-01/system.tar
```

Repeat for `user-dirs.tar(.zst)` and `user-files.tar(.zst)` as needed.
### 9.2 Restoring from a mirror

Mirrors are just rsynced trees; you can restore them with rsync or cp:

```shell
# rsync mirror back to host
rsync -aHAX --numeric-ids \
    backupuser@backup-nas.local:/srv/backup/automated/sync/system-orion-postgres/ \
    /srv/data/postgres/
```

Always test restores on a non-production target first.
## 10. Safety notes

- Mirrors are destructive:
  - `single_copy_mappings` use rsync `--delete`.
  - Deletes on the host will remove files on the backup side in the next run.
- Snapshots are immutable per run:
  - Each run creates a new directory, writes archives, and then retention may remove older snapshot dirs.
- Local compression uses space:
  - `compression.path` should point at a filesystem with enough free space to hold a full snapshot's uncompressed temp trees plus the compressed archives.
- Permissions:
  - The script expects to be run as root (or with enough privileges) to read system paths and user homes.
- SSH keys:
  - Use dedicated SSH keys per host with restricted accounts on the NAS where possible.
## 11. Contributing

Contributions are very welcome, especially around:

- additional backup backends or layout conventions,
- smarter snapshot/mirror strategies (e.g., per-path compression settings),
- restore helpers and verification tooling,
- better safety guards around destructive operations (`--delete`, pruning),
- distro packaging (Arch, Debian, containers, etc.).

Basic guidelines:

- Treat this as infrastructure code:
  - avoid surprises in defaults (compression, retention, paths),
  - keep the YAML schema stable and well-documented,
  - make new features opt-in whenever they could delete or overwrite data.
- Be conservative with `rsync --delete`:
  - mirrors are intentionally destructive, but code paths that trigger deletion should be obvious and well-commented.
- Keep logs readable and actionable:
  - clear "what is happening" messages,
  - explicit summary per run (snapshot name, mirrors processed, pruning done).

Bug reports and pull requests are preferred over vibes and interpretive dance.
## 12. License

This project is licensed under the MIT License.
See the LICENSE file in this repository for the full text.
## 13. Author & acknowledgements

**Author / Maintainer**

- Name: Peter Knauer (zuzu@quantweave.ca)

**Acknowledgements**

- Inspired by earlier shell-based backup scripts and ad-hoc rsync one-liners that deserved a nicer life.
- Thanks to everyone who runs this on real systems, weird filesystems, and "creative" NAS setups and reports back what explodes.

The goal of this project is to make Linux backups boring, predictable, and inspectable, for both humans and tools trying to reason about how and where data is stored.