June 9, 2017
VM Backup vs. VM Replication
VM backup and replication are both essential parts of a data protection plan. At first glance, they seem similar and interchangeable, and this is partly true. However, they pursue different goals, so you need to understand the difference between them and consider both when setting up a data protection plan.
VM Backup and Replication at a Glance
Fundamentally, both backup and replication keep a copy of a source virtual machine’s data so you can restore it on demand. However, their objectives differ. VM backups are intended to store your data for as long as deemed necessary, so you can go back in time and restore what was lost. VM replicas (the result of replication) are intended to bring VMs back online as soon as possible. That is why the two rely on different technologies.
As the main objective of backups is long-term data storage, various data reduction techniques are used by backup software to reduce the backup size and fit the data into the smallest amount of disk space possible. This includes skipping unnecessary swap data, data compression, and data deduplication (which removes the duplicate blocks of data and replaces them with references to the existing ones).
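To make the deduplication idea concrete, here is a minimal sketch in Python. It assumes fixed-size blocks, SHA-256 block fingerprints, and zlib compression. These are illustrative choices, not a description of how any particular backup product works, and the names `dedup_and_compress`, `restore`, and `BLOCK_SIZE` are hypothetical.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


def dedup_and_compress(data: bytes, block_size: int = BLOCK_SIZE):
    """Store each unique block once (compressed) and keep an ordered
    list of block hashes that acts as the references to those blocks."""
    store = {}   # block hash -> compressed block contents
    recipe = []  # ordered block hashes needed to rebuild the data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:
            # First time this block is seen: compress and store it.
            store[digest] = zlib.compress(block)
        # Duplicate blocks only add another reference to the recipe.
        recipe.append(digest)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data by following the references in order."""
    return b"".join(zlib.decompress(store[digest]) for digest in recipe)


if __name__ == "__main__":
    # Five blocks of "disk" data, four of which are identical.
    disk = b"A" * 4096 * 3 + b"B" * 4096 + b"A" * 4096
    store, recipe = dedup_and_compress(disk)
    assert restore(store, recipe) == disk
    print(len(recipe), "blocks referenced,", len(store), "unique blocks stored")
```

In this toy example, five blocks are referenced but only two are actually stored, which is the effect that lets backups fit into a small amount of disk space.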
Because VM backups are compressed and deduplicated to save storage space, they no longer look like VMs and are stored in a special format that backup software can read. And because a VM backup is just a set of files, the backup repository is just a folder, which can be located anywhere: on a dedicated server, on a NAS, or even in the cloud.
Modern backup software allows for various types of recovery from backup: you can near-instantly restore individual files, application objects, or even entire VMs directly from compressed and deduplicated backups, without running the full VM restore process first.
As critical as backups of a virtual infrastructure are, it is important to note that if something happens to multiple virtual machines, or perhaps to an entire site, you have to spend the time it takes to restore those virtual machines to a production or standby environment. Technologies like Flash VM Boot can restore services extremely quickly for a single virtual machine from backup and produce excellent RTO (recovery time objective) results. However, in the case of a site failure due to a disaster, or multiple VM failures due to the loss of a datastore, restoring many VMs from backups of any type is not practical. This is the main weakness of restoring from backups.
In short, backups are necessary and extremely important because they provide the mechanism to go back in time and restore data either in part or in full. As noted, though, backups pull data from the production environment and store it in a backup repository. To bring that data back, you have to either restore it to the original virtual machine or recreate the entire virtual machine from the backup data. This is sufficient for a single VM or a few VMs. How, though, do you avoid the time it takes to bring data or services back online when the primary site fails entirely or is severely impaired, whether by hardware failure, natural disaster, malware, or self-inflicted mistakes?
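A rough back-of-the-envelope estimate shows why a full restore of many VMs takes so long. The sketch below assumes that restore time is dominated by moving data out of the repository at a fixed overall throughput. The function name and every number in it are hypothetical and are not measurements of any product.

```python
def estimate_restore_hours(vm_sizes_gb, throughput_mb_s):
    """Rough wall-clock hours needed to copy all VM data back from the
    backup repository, assuming one fixed overall throughput (MB/s)."""
    total_mb = sum(vm_sizes_gb) * 1024   # GB -> MB
    seconds = total_mb / throughput_mb_s
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical figures: 200 GB VMs restored at 200 MB/s.
    one_vm = estimate_restore_hours([200], throughput_mb_s=200)
    whole_site = estimate_restore_hours([200] * 50, throughput_mb_s=200)
    print(f"1 VM:   {one_vm:.2f} hours")
    print(f"50 VMs: {whole_site:.2f} hours")
```

Under these assumed numbers, a single VM comes back in well under an hour, while 50 VMs need more than 14 hours of pure data transfer before the last service is back online.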
VM replication creates exact copies of source VMs and puts those copies (called VM replicas) on a target VMware ESXi or Hyper-V host/storage. Each time a replication job runs, the VM replicas are updated so that they are identical to the source VMs.
VM replicas are fully functional VMs that are stored in the powered-off state, so they do not consume compute resources. If a disaster strikes, all you have to do to recover is power on the VM replicas!
The downsides of this approach are that a) you need to invest in additional DR infrastructure, and b) VM replicas take up far more space than VM backups, so you won’t be able to store as many recovery points or go as far back in time should you need to restore your data.
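One way to picture a replication job run is as a block-by-block comparison that copies only what differs. The sketch below is a simplified illustration under that assumption. The article does not describe the actual transfer mechanism, and the function name `replicate` and the in-memory "disks" are hypothetical.

```python
def replicate(source: bytearray, replica: bytearray, block_size: int = 4096) -> int:
    """Update the replica in place so it matches the source.
    Returns the number of blocks that had to be copied."""
    # Match the replica's size to the source (truncate or pad with zeros).
    if len(replica) > len(source):
        del replica[len(source):]
    else:
        replica.extend(bytes(len(source) - len(replica)))

    copied = 0
    for offset in range(0, len(source), block_size):
        end = offset + block_size
        if source[offset:end] != replica[offset:end]:
            # Only blocks that differ are transferred to the replica.
            replica[offset:end] = source[offset:end]
            copied += 1
    return copied


if __name__ == "__main__":
    source = bytearray(b"x" * 8192 + b"y" * 4096)  # three 4 KB blocks
    replica = bytearray(source)                    # initial full copy
    source[4096:4100] = b"zzzz"                    # the source VM changes
    print("blocks copied:", replicate(source, replica))
    assert replica == source                       # replica is identical again
```

After the run, the replica is again byte-for-byte identical to the source, which is the property that makes it possible to simply power it on.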
An offsite DR environment is an investment, but most consider it smaller than the potential cost of lost business, a damaged brand reputation, and other harm that may result from having the infrastructure down due to a disaster at the primary location. As noted above, replication gives you a replica of your virtual machines at another site. This provides site resiliency and geographic diversity for your business operations, so you can withstand a total site failure and still have resources available to bring online without restoring from backup. This greatly strengthens your DR strategy, because you are protected from data loss due to a disaster at a particular location. Since your virtual machines are replicated to a different facility, either close by or in another geographic location altogether, you greatly reduce the chance of catastrophic data loss even if both your production environment and your production backup environment are destroyed.
By using the capabilities of both the backup and the replication processes, you are well on your way to a well-rounded DR plan that you can build upon with other backup solution capabilities. In a good DR plan, backups do not replace replication and vice versa; they complement each other. Each has its own strengths, weaknesses, and use cases:
VM backup:
- Keeps many recovery points
- Compressed and deduplicated, which results in a small footprint
- Retains data for long periods of time
- Relatively cheap
- Even with technologies such as Flash VM Boot, recovery from backup is not practical for site-wide disaster recovery
VM replication:
- Required for quick VM recovery
- An exact copy of the source virtual machine
- Fast recovery, especially after a primary site failure
- Expensive to implement an offsite DR facility as a replication target
- Can keep local replicas to cover datastore and other local failures