Image-based vs. File-based Backup
Michael Bose, posted on May 15, 2017
Image-based and file-based backups pursue different goals. Although they can sometimes be used interchangeably, it is worth understanding the purpose of each.
When it comes to backing up separate files, the first thing that comes to mind is popular services like Dropbox, Google Drive, Box, OneDrive, or iCloud. People and organizations across the globe use these solutions to keep, synchronize, and share files through reliable infrastructure. They are fast and easy to use: you just put a file into a specific folder, and it becomes accessible from any other device that is authorized to access your account. These are examples of file-based backup.
This approach works well when you need to store relatively small amounts of data, which may be platform-independent: for instance, you can open photos shared via Dropbox on any computer with a GUI-based operating system. However, you usually need a specific piece of software called an ‘agent’ on the source computer to perform such a backup, and another copy of the agent on the target host (regardless of whether it is a home PC or a server somewhere in a cloud).
Thus, a file-based backup operates at the file system level, identifying files and folders and processing them for the backup. When you need to back up a complete system, such as a whole operating system or a virtual machine, a file-based backup might not be enough. The operating system, regardless of whether it is installed on a home PC or a virtual machine, is more complex than just a set of files: it has a lot of state in flight, such as data held in RAM and pending input and output operations.
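To make "operating at the file system level" concrete, here is a minimal sketch in Python of what a file-based backup does at its core: it walks a folder tree and copies each file it finds. The function name `file_based_backup` and the folder layout are assumptions made for illustration; real backup products add scheduling, deduplication, encryption, and transport on top of this idea.

```python
import shutil
from pathlib import Path


def file_based_backup(source: Path, target: Path) -> list[Path]:
    """Copy every regular file under `source` into `target`, keeping the layout.

    Returns the relative paths of the files that were copied.
    """
    copied = []
    for path in sorted(source.rglob("*")):
        if path.is_file():
            relative = path.relative_to(source)
            destination = target / relative
            # Recreate the folder structure on the target side.
            destination.parent.mkdir(parents=True, exist_ok=True)
            # copy2 copies the content and, where possible, the timestamps.
            shutil.copy2(path, destination)
            copied.append(relative)
    return copied
```

Note that this sketch only sees what the file system exposes as files. Anything that lives in memory or in pending I/O at the moment of the copy is invisible to it.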
Technically, it is possible to back up the operating system using a file-based backup because the configuration of the OS is stored in files. However, while you are copying the content of one file, another one might change, so the copies end up reflecting different moments in time and you do not get a complete picture of the system. This is where image-based backup helps.
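The consistency problem with file-by-file copying can be illustrated with a small Python sketch. It copies files one at a time, remembers each file's modification time and size at the moment it was copied, and afterwards reports which source files no longer match. All names here (`fingerprint`, `copy_and_find_changed`, the `on_file_copied` hook) are hypothetical and exist only to simulate a system that keeps writing during the backup.

```python
import shutil
from pathlib import Path


def fingerprint(path: Path) -> tuple[int, int]:
    """Cheap change indicator: modification time (ns) and size."""
    stat = path.stat()
    return (stat.st_mtime_ns, stat.st_size)


def copy_and_find_changed(source: Path, target: Path, on_file_copied=None) -> list[Path]:
    """Copy files one by one, then list the files that changed during the run."""
    files = sorted(p for p in source.rglob("*") if p.is_file())
    seen = {}
    for path in files:
        seen[path] = fingerprint(path)  # state at the moment of copying
        destination = target / path.relative_to(source)
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, destination)
        if on_file_copied is not None:
            on_file_copied(path)  # hook used to simulate a busy system

    # A file is stale if it was removed or modified after we copied it.
    return [
        p.relative_to(source)
        for p in files
        if not p.exists() or fingerprint(p) != seen[p]
    ]
```

Any file in the returned list exists in the backup in an older state than the files copied after it, which is exactly the "incomplete picture" described above.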
Before the backup itself starts, a backup application makes an image, or snapshot, of the operating system. For a fraction of a second, all processes on the operating system freeze, its state is captured, and then it continues running. This happens so quickly that you don’t even notice. Once the backup application has such an image, it can send it to a backup repository while the source system continues running.
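One common way to get a point-in-time view without stopping the system for long is copy-on-write: the snapshot itself is nearly instant, and old data is preserved only when something overwrites it later. The following Python sketch is a toy in-memory model under that assumption. The `Volume` and `Snapshot` classes are hypothetical, and this is not how any particular operating system, hypervisor, or backup product implements snapshots.

```python
class Snapshot:
    """Point-in-time view of a Volume, filled in lazily on overwrite."""

    def __init__(self, volume):
        self.volume = volume
        self.saved = {}  # block index -> content at snapshot time

    def preserve(self, index, old_data):
        # Keep only the first (original) version of each block.
        self.saved.setdefault(index, old_data)

    def read(self, index):
        # Overwritten blocks come from the saved copy, the rest from the volume.
        return self.saved.get(index, self.volume.blocks[index])

    def read_all(self):
        return [self.read(i) for i in range(len(self.volume.blocks))]


class Volume:
    """A 'disk' modelled as a list of blocks."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def snapshot(self):
        # Taking the snapshot copies no data, so it is nearly instant.
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Copy-on-write: save the old block before it is overwritten.
        for snap in self.snapshots:
            snap.preserve(index, self.blocks[index])
        self.blocks[index] = data
```

For example, after `snap = volume.snapshot()`, the system can keep calling `volume.write(...)` while the backup application reads `snap.read_all()` at its own pace and still sees the state from the moment the snapshot was taken.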
Currently, most image-based solutions are agentless, though legacy ones use agents. Most operating systems have a built-in mechanism for taking snapshots, for example, the Volume Shadow Copy Service (VSS) in Microsoft Windows. The backup application can therefore use the snapshots made by this built-in service without affecting the source machine. This approach also takes load off the source machine, as the backup application can be installed on a dedicated server.
An image-based backup can provide file-based features as well. For example, NAKIVO Backup & Replication allows you to recover individual files, as well as Exchange and Active Directory objects, from a backup repository.
At the same time, a file-based backup cannot fully replace an image-based one. Even if all of the files from a source VM are copied, the backup will not be crash-consistent, because the files are captured at different points in time rather than at a single moment.
A file-based backup is reasonable when you need to copy separate files so that they can be recovered to any other system. An image-based backup works best when you need to back up a VMware or Hyper-V virtual machine.