Backup Operations (206.2)

The candidate should be able to create an off-site backup storage plan.

References: LinuxRef07, Wirzenius98.

Why?

Everyone (more or less) knows that, as a system administrator, it is vital to make backups. Most also know why. Your data is valuable. It will cost you time and effort to re-create it, and that costs money or at least personal grief and tears. Sometimes it can't even be re-created, such as the results of some experiments. Since it is an investment, you should protect it and take steps to avoid losing it.

There are four main reasons why you may lose data: human error, software bugs, hardware failure and natural disasters. Humans are quite unreliable: they might make a mistake, or be malicious and destroy data on purpose. Modern software does not even pretend to be reliable. A rock-solid program is an exception, not a rule. Hardware is more reliable, but may break seemingly spontaneously, often at the worst possible time. Nature may not be evil, but it can nevertheless be very destructive.

What?

In general, you want to back up as much as possible. A major exception is the /proc filesystem. Since this only contains data that the kernel generates automatically, it is never a good idea to back it up. The /proc/kcore file is especially unnecessary, since it is just an image of your current physical memory; it's pretty large as well. Some special files that are constantly changed by the operating system (e.g. /etc/mtab) should not be restored, and hence need not be backed up. There may be others on your system.

Gray areas include the news spool, log files and many other things in /var. You must decide what you consider important. Also, consider what to do with the device files in /dev. Most backup solutions can back up and restore these special files, but you may want to re-generate them with a script.

The obvious things to back up are user files (/home) and system configuration files (/etc, but possibly other things scattered all over the filesystem).
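The selection rules above can be written down as an exclusion filter. The following is a minimal sketch using Python's standard tarfile module, not a complete backup tool. The EXCLUDED list only repeats the examples from the text, and the names is_excluded, tar_filter and create_backup are hypothetical helpers chosen for illustration.

```python
import tarfile
from pathlib import PurePosixPath

# Assumption: this list only mirrors the examples named in the text.
# Extend it with whatever should be skipped on your own system.
EXCLUDED = ("/proc", "/etc/mtab")


def is_excluded(path, excluded=EXCLUDED):
    """Return True if path is an excluded entry or lies beneath one."""
    # tarfile stores member names without a leading slash, so normalise first.
    candidate = PurePosixPath("/" + path.lstrip("/"))
    for entry in excluded:
        root = PurePosixPath(entry)
        if candidate == root or root in candidate.parents:
            return True
    return False


def tar_filter(tarinfo):
    """Filter for TarFile.add(): returning None leaves the member out."""
    return None if is_excluded(tarinfo.name) else tarinfo


def create_backup(archive_path, sources):
    """Write a gzip-compressed tar archive of sources, minus exclusions."""
    with tarfile.open(archive_path, "w:gz") as archive:
        for source in sources:
            archive.add(source, filter=tar_filter)
```

A call such as create_backup("/mnt/backup/system.tar.gz", ["/etc", "/home"]) would archive the obvious candidates; the destination path here is only an example. Whether /var and /dev belong in the list of sources remains the judgment call described above.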

When?

Depending on how quickly the data changes, this may be anything from daily to almost never. The latter applies to firewall systems, for example, where only the log files change daily (and logging should happen on a different system anyway). On such a system, a backup is only necessary after the system has been changed, for instance after a security update.

On normal systems, a daily backup is often best.

Backups can take several hours to complete, but, for a successful backup strategy, human interaction should be minimal, preferably just a matter of placing a tape in the tape device.

How?

While the particular brand or technology of the hardware and software used for backups is not important, there are nevertheless important considerations in selecting them. Imagine, for example, that the restore software breaks and the publisher has since gone out of business.

No matter how you create your backups, the two most important parts in a backup strategy are:

verifying the backup

The safest method is to read back the entire backup and compare it with the original files. This is very time-consuming and often not an option. A faster and relatively safe method is to create a table of contents (which should contain a checksum per file) during the backup. Afterwards, read the contents of the tape and compare the two.

testing the restore procedure

This means that you must have a restore procedure. This restore procedure has to specify how to restore anything from a single file to the whole system. Every few months, you should test this procedure by doing a restore.
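The table-of-contents approach to verification can be sketched as follows. This is an illustration under stated assumptions, not a prescribed method: SHA-256 is an arbitrary choice of checksum, the table of contents is kept as a plain dictionary, and sha256_of, build_manifest and compare_manifests are hypothetical names.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Checksum one file, reading it in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each regular file under root (by relative path) to its checksum."""
    root = Path(root)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file() and not path.is_symlink()
    }


def compare_manifests(expected, actual):
    """Report files that are missing, unexpected or changed in actual."""
    shared = expected.keys() & actual.keys()
    return {
        "missing": sorted(expected.keys() - actual.keys()),
        "unexpected": sorted(actual.keys() - expected.keys()),
        "changed": sorted(k for k in shared if expected[k] != actual[k]),
    }
```

The same comparison can support a restore test: build the manifest at backup time, restore into a scratch directory some months later, build a second manifest from the restored tree and compare the two. Three empty lists mean that every file came back with the checksum it had when it was backed up.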

Where?

If something fails during a backup, the medium may not contain anything useful. If this was your only medium, you are screwed. So you should have at least two sets of backup media. But if you store both sets in the same building, the first disaster that comes along will destroy all your precious backups along with the running system.

So you should have at least one set stored at a remote site. Depending on the nature of your data, you could store weekly or daily sets remotely.
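One way to make sure a set is always off-site is to rotate the sets on a fixed schedule. The sketch below assumes a weekly rotation starting on Monday and two sets labelled "A" and "B"; the schedule, the labels and the function name media_plan are all illustrative assumptions, not something the text prescribes.

```python
from datetime import date


def media_plan(day, sets=("A", "B")):
    """Return (set in use on-site this week, sets that belong off-site)."""
    if len(sets) < 2:
        raise ValueError("at least two sets of backup media are required")
    # Ordinal day 1 is a Monday, so this index changes every Monday.
    week_index = (day.toordinal() - 1) // 7
    in_use = sets[week_index % len(sets)]
    off_site = [name for name in sets if name != in_use]
    return in_use, off_site


if __name__ == "__main__":
    in_use, off_site = media_plan(date.today())
    print("Write tonight's backup to set", in_use)
    print("Sets that should be at the remote site:", ", ".join(off_site))
```

With more than two sets, the same function keeps all but the current set at the remote site. A daily rotation would only need a different divisor in the index calculation.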

Do not forget to store a copy of the backup plan along with the backups at the remote site. Otherwise you cannot guarantee that the system will be back up and running in the shortest possible time.

Copyright Snow B.V. The Netherlands