Please note: This article is more than 3 years old.
Its content, source code or links may no longer be correct.
Installing a Nextcloud instance is a snap. There are numerous tutorials and, on better NAS systems or at hosting providers, even ready-to-use appliances that can be clicked together with ease. When it comes to data backup, however, my observation is that the sources are rather scarce, and I would rather not use the few manuals that are available on the net.
This blog post aims to help improve data protection with an automated, daily and complete backup of your Nextcloud. It is aimed at private users. The focus is not just on the script itself, but rather on the thoughts and considerations around it.
There are numerous backup solutions, starting with bare-metal backups of entire servers or partitions, tar, rsync or borg for complete or incremental data backups, up to 7zip for encrypted transfer of data to third parties. These are only a few of the free solutions. And I don't even want to mention the many non-free ones like Acronis, Veeam & Co.
My ideal backup strategy is not just one tool, but several coordinated ones. For the requirements and data volumes of my Nextcloud instance I have chosen GNU tar. The benefits:
The backup destination is an SMB share on another machine, in my case a FreeNAS. The script assumes that this share is directly reachable. If your Nextcloud runs in a separate network segment, you have to allow this direct connection in your firewall and/or routing tables.
Everything here is written from a Debian perspective; keep this in mind when applying it to the various derivatives. The script is started manually at first and later executed via cronjob as the user root. sudo does not exist by default on a freshly installed Debian server and should therefore be installed along with cifs-utils:
# apt install cifs-utils sudo
Furthermore, I assume a working mail server that can send status messages via "sendmail". To check whether this is set up properly, use the following one-liner. If everything works, an empty e-mail should end up in your mailbox:
# sendmail "email@domain.tld" < /dev/null
Moreover, I assume that the MySQL/MariaDB database and the Nextcloud /data directory are on the same host. As far as the /data directory is concerned, please also read the section under "Optimizations".
Most of my data is stored on a FreeNAS and is mounted as "external storage". This keeps the Nextcloud itself quite lightweight. In my example we are talking about "just" 31 GB of data to be backed up, which takes about 30 minutes. That is an acceptable downtime for the Nextcloud, which is put into maintenance mode during the full backup and cannot be used during this time.
From a bird's eye view, there are basically two strategies for data backups: push and pull.
With a push backup, the host being backed up connects to a (usually central) backup server and copies its data over there. The opposite case is called a pull backup: a central backup server connects to the host to be backed up and pulls the data according to a predefined logic.
At first glance, a pull backup seems more secure. In practice, the situation is a little more complicated. Here are some arguments against a pull-backup strategy:
As usual, it is a trade-off between advantages and disadvantages, and also based on existing circumstances.
The script shown here uses the push strategy, i.e. the Nextcloud host mounts an SMB share, copies the data over and then unmounts it again.
Apart from the fact that tar does not offer encryption, encryption is not desirable in our scenario for other reasons as well:
Data should preferably only be encrypted where it is transferred or shipped from your own secure environment to an insecure one (e.g. when backups are stored on hard disks at a third party).
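Should you later ship a finished tarball off-site, a symmetric GPG pass over the archive would be one straightforward option. This is just a sketch and not part of the script; the filename is an example:
# gpg --symmetric --cipher-algo AES256 /root/backup/20200101-cloud.tar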
The credentials required to mount the SMB network share should be stored in a text file in the root home directory, unreadable for other users. The name of this file can be chosen freely; we will name it ".smbcredentials". The content consists of only three text lines, into which the credentials have to be entered:
username=
password=
domain=
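Since the file contains a password in plain text, it should be readable by root only. Tightening the permissions is a small extra step beyond the original setup, but cheap insurance:
# chmod 600 /root/.smbcredentials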
With mysqldump we will also make a full backup of the Nextcloud database, and we need credentials for this as well. Hence, we store the access data in the file ".my.cnf" in the root home directory. This filename is predefined and must be used exactly as written:
[mysqldump]
user=
password=
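The ".my.cnf" file deserves the same permission treatment, and a quick dry run shows whether mysqldump can actually read it. The database name "nextcloud" is the example used throughout this post; adapt it if yours differs:
# chmod 600 /root/.my.cnf
# mysqldump --no-data nextcloud > /dev/null && echo "credentials OK"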
In order to mount the SMB network drive, a suitable folder is required. The script expects it below /root, so simply create a subfolder called "backup" there:
# mkdir /root/backup
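Before relying on the script, it may be worth a quick manual test that the share actually mounts with the stored credentials. The share path is the placeholder used later in the script:
# mount -t cifs -o credentials=/root/.smbcredentials //remotehost/share /root/backup
# ls /root/backup
# umount /root/backup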
Once the credentials are stored and the share mounts cleanly, we can turn to the script itself.
Using a text editor of your choice, create a text file named "backupscript" in the root home directory and insert the following content into it.
The script consists of two parts. In the first one all required parameters are set and must be customized to the individual use case.
The second part contains the main script. You should not change it unless you know what you are doing. For better understanding, I have commented each line.
Two important notes: at the end of the script there is a cleanup of all created temporary files. Anyone who needs the logfile can remove it from this cleanup.
There is also no check of the available disk space on the target drive. I assume that this will be checked and cleaned from time to time.
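If you would rather have the script guard itself, a minimal sketch of such a check could be inserted right after the mount command. The 50 GB threshold is an arbitrary example, not a recommendation:
REQUIRED_KB=$((50 * 1024 * 1024))
AVAIL_KB=$(df --output=avail "$MOUNT_DIR" | tail -n 1 | tr -d ' ')
if [ "$AVAIL_KB" -lt "$REQUIRED_KB" ]; then
    echo "Not enough free space on $MOUNT_DIR, backup aborted." >> "$THE_LOG"
    sendmail "$SEND_REP" < "$THE_LOG"
    umount "$MOUNT_DIR"
    exit 1
fi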
#!/bin/bash
# Date prefix for the file names, YYYYMMDD
DATE_PREFIX=$(date +%Y%m%d)
# SMB network share
SMB_DIR="//remotehost/share"
# Mount point of the SMB network share
MOUNT_DIR="/root/backup"
# Credentials file with the SMB access data
SMB_CRED_FILE="/root/.smbcredentials"
# The Nextcloud directory
NEXTCLOUD_DIR="/var/www/cloud.domain.tld"
# The database name
NEXTCLOUD_DB=nextcloud
# Name and location of the status file, YYYYMMDD-backuplog.txt
THE_LOG="/root/$DATE_PREFIX-backuplog.txt"
# Name and location of the database dump
THE_DUMP="/root/$DATE_PREFIX-cloud-DB.sql"
# Recipient of the status file
SEND_REP="email@domain.tld"
# Files or folders to exclude (an array, so the patterns reach tar unmangled)
EXCLUDE_PATTERN=(--exclude="$NEXTCLOUD_DIR/data/updater*" --exclude="$NEXTCLOUD_DIR/data/*.log")
# -- Please do not change anything manually below this line --
# Create the status file and record the start time
echo "Starting backup (start time: $(date +%T))..." > "$THE_LOG"
# Mount the SMB share
mount -t cifs -o credentials="$SMB_CRED_FILE" "$SMB_DIR" "$MOUNT_DIR"
# Activate Nextcloud maintenance mode
sudo -u www-data php "$NEXTCLOUD_DIR/occ" maintenance:mode --on
# Create the MySQL dump
mysqldump "$NEXTCLOUD_DB" > "$THE_DUMP"
# Back up everything into the target directory with tar
tar "${EXCLUDE_PATTERN[@]}" -cvpf "$MOUNT_DIR/$DATE_PREFIX-cloud.tar" "$NEXTCLOUD_DIR" "$THE_DUMP"
# Deactivate maintenance mode
sudo -u www-data php "$NEXTCLOUD_DIR/occ" maintenance:mode --off
# Check the target directory and record the stop time
echo "...backup finished (stop time: $(date +%T))" >> "$THE_LOG"
ls -lh "$MOUNT_DIR" >> "$THE_LOG"
# Send the status file
sendmail "$SEND_REP" < "$THE_LOG"
# Garbage collection
rm "$THE_LOG" "$THE_DUMP" &
umount "$MOUNT_DIR"
Once the file is created, it must be made executable:
# chmod +x /root/backupscript
That's it! The script can now be started manually or automated, e.g. with crontab at night or on weekends. The only requirement: it must run in root context.
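One possible crontab entry for root, assuming a nightly run at 02:30 (adjust the schedule to taste):
# crontab -e
30 2 * * * /root/backupscript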
Depending on the number of users and the amount of data, a backup may take its time. Since the script intentionally makes complete backups, it contains some small but effective optimizations:
The first optimization is to skip the usually very large log files. The updater directory with the backups of previous versions can also be left out. This is what the EXCLUDE_PATTERN line is for:
EXCLUDE_PATTERN=(--exclude="$NEXTCLOUD_DIR/data/updater*" --exclude="$NEXTCLOUD_DIR/data/*.log")
Another boost is cleaning up the trashbins prior to the backup. I did not include it in the script, but if you want to, you may add the following line between switching on the maintenance mode and creating the database dump:
sudo -u www-data php $NEXTCLOUD_DIR/occ trashbin:cleanup --all-users
The script backs up without compression to keep the downtime as short as possible. If you don't care about that and prefer small, compressed tarballs, you can activate the -j option (for bzip2) and of course adjust the file extension:
tar "${EXCLUDE_PATTERN[@]}" -cvjpf "$MOUNT_DIR/$DATE_PREFIX-cloud.tar.bz2" "$NEXTCLOUD_DIR" "$THE_DUMP"
In my case compression is not economical: the benefit is only 5% (2 GB) of saved disk space at the cost of a 4x longer downtime (2 h instead of 30 min). Your mileage may vary, and of course there are other compression methods besides bzip2 with different compression ratios.
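For instance, gzip via tar's -z option generally compresses less thoroughly than bzip2 but is considerably faster; a variant worth benchmarking against your own data:
tar "${EXCLUDE_PATTERN[@]}" -cvzpf "$MOUNT_DIR/$DATE_PREFIX-cloud.tar.gz" "$NEXTCLOUD_DIR" "$THE_DUMP"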
The biggest gains, not only in terms of security but also in terms of performance, come from the correct location of the /data directory. If possible, it should be outside the Nextcloud folder, ideally on a different hard disk.
The script assumes the Nextcloud installation defaults without a relocated /data directory. If you have put it somewhere else, then you should add an additional variable NEXTCLOUD_DATA_DIR:
# The Nextcloud /data directory
NEXTCLOUD_DATA_DIR="/var/data"
The exclude list must be modified too:
# Files or folders to exclude
EXCLUDE_PATTERN=(--exclude="$NEXTCLOUD_DATA_DIR/updater*" --exclude="$NEXTCLOUD_DATA_DIR/*.log")
Finally, the line with the main tar command should look like this:
# Back up everything into the target directory with tar
tar "${EXCLUDE_PATTERN[@]}" -cvpf "$MOUNT_DIR/$DATE_PREFIX-cloud.tar" "$NEXTCLOUD_DIR" "$NEXTCLOUD_DATA_DIR" "$THE_DUMP"
What’s the difference between
mysqldump $NEXTCLOUD_DB > $THE_DUMP
and
mysqldump $NEXTCLOUD_DB > $THE_DUMP &
Well, normally instructions in bash scripts are processed sequentially. The ampersand "&" at the end of a line means that the command is processed in the background, parallel to the next line(s) of the script. Depending on the size of the database, this may save a few minutes of runtime.
But this is only safe if the database dump is guaranteed to be finished before tar tries to put it into the tarball. Under normal circumstances this should be the case, but I recommend checking it in advance.
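If you want the parallelism without the race, bash's built-in wait blocks until all background jobs have finished. A minimal sketch of how the relevant part of the script could look:
# Start the database dump in the background
mysqldump "$NEXTCLOUD_DB" > "$THE_DUMP" &
# ...other preparation steps may run here in the meantime...
# Block until the background dump is complete before archiving
wait
tar "${EXCLUDE_PATTERN[@]}" -cvpf "$MOUNT_DIR/$DATE_PREFIX-cloud.tar" "$NEXTCLOUD_DIR" "$THE_DUMP"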
There is neither mercy nor excuse for a lack of responsibility, especially not if you share a Nextcloud with others. Even a private "family cloud" usually represents an important part of personal life, with all the pictures, conversations and a shared data pool ranging from important documents to shared memories.
This blog post proves: A complete and automated backup is no magic.
Have fun
Your Tomas Jakobs