February 18, 2019
NAS Backup Strategies
NAS (Network Attached Storage) devices are becoming increasingly popular and are typically used as file servers, as database storage, and for storing virtual machines running on hypervisors. Compared with SAN (Storage Area Network) devices, NAS devices are more affordable and easier to configure. Most NAS models include a RAID (Redundant Array of Independent Disks) controller that provides data redundancy and can help preserve data should any of the disks in the array fail. However, RAID cannot preserve your data if multiple disks are damaged or if files are affected by a virus attack. That’s why data stored on NAS should always be backed up. Today’s blog post covers NAS backup and explores how you can back up data stored on NAS devices, including virtual machines (VMs).
General NAS Backup Recommendations
Backing up data from directly attached storage (DAS) is a process that is familiar to many people. Directly attached storage can take the form of a single hard disk attached via a SATA, SAS, SCSI or IDE interface to the disk controller of a computer. Data can be copied manually to an external USB or eSATA hard disk drive, or automatically to a network share or backup repository if the appropriate backup software is used.
Backing up data from NAS has several particularities. Instead of a general-purpose Linux or Windows operating system, NAS usually runs a proprietary OS that is managed through a web interface. This can make it difficult to install traditional backup software on NAS devices. Let’s start by looking over some general recommendations for backing up data from NAS.
Backup to the Directly Attached USB Hard Disk
Some models of NAS devices have USB or eSATA ports in addition to RJ-45 Ethernet ports used for connecting NAS to the network. You can connect an external hard disk drive to a USB or eSATA port of the NAS device and copy the necessary files from NAS to the external USB or eSATA disk drive. You can manage and copy files with a web interface provided by the NAS vendor.
The disadvantage of this backup method is that it is not automated: you have to attach the disk and start the copy operation manually. If any of the files that you need to copy are open and being modified by an application that accesses them over the network, the copies of those files may be inconsistent, and some files may not be copied at all. In other words, the backup is not application-aware in this case.
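The risk of copying files while they are being modified can be made visible with a small script. The following is a minimal sketch, not a vendor tool: it assumes the NAS share and the external disk are both mounted on a computer that runs Python, and the function names are hypothetical. It copies every file, then flags files whose size or modification time changed during the copy, or whose checksum differs from the copy.

```python
import hashlib
import shutil
from pathlib import Path


def file_fingerprint(path: Path) -> tuple:
    """Return (size, mtime in ns) as a cheap 'did this file change?' signal."""
    st = path.stat()
    return (st.st_size, st.st_mtime_ns)


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_tree_checked(source: Path, target: Path) -> list:
    """Copy all files under source to target; return source files that look suspect."""
    suspect = []
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        dst = target / src.relative_to(source)
        dst.parent.mkdir(parents=True, exist_ok=True)
        before = file_fingerprint(src)
        shutil.copy2(src, dst)  # copies content and timestamps
        changed_during_copy = file_fingerprint(src) != before
        if changed_during_copy or sha256_of(src) != sha256_of(dst):
            suspect.append(src)
    return suspect
```

A call such as `copy_tree_checked(Path("/mnt/nas_share"), Path("/mnt/usb_backup"))` (both paths are placeholders) returns the list of files that should be copied again once the application has closed them. This check only detects changes; it does not make the backup application-aware.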
Backup to a Different NAS
You can back up your data from one NAS to another via the network. If you use a third-party machine connected to shares on both NAS devices to copy data from the source NAS to the destination NAS, the network can become overloaded and the data transfer speed can drop considerably. Check whether your NAS supports mounting a remote shared folder, which allows data to be copied directly from one NAS to another over the network. If your NAS supports scheduled tasks, you can schedule periodic backups of your data. The disadvantage here is the same as in the previous case: because there is no application awareness, data in the backup can be inconsistent if the backed-up files are open in applications.
Backup to the Cloud
Backing up data from NAS to the cloud is a good choice if you have an account for cloud services such as Amazon cloud, Microsoft Azure/OneDrive cloud, Baidu cloud etc., and if your security policy allows you to store data in a public cloud. If you have a backup in the cloud, your data is protected against disasters that may occur in your physical datacenter. Some NAS vendors provide a built-in feature that allows you to synchronize files and directories on NAS with the cloud storage.
Using the 3-2-1 Backup Strategy
Having a backup stored on external disks, NAS devices or in a public cloud can help you restore your data. You significantly reduce the probability of data loss, but this probability is not zero, because the backup itself can still become corrupted. To reduce the probability of data loss further, follow the 3-2-1 backup rule. The rule recommends having at least three copies of your data, two of which are stored on different media, and one of which is located offsite. Imagine that a disaster such as a typhoon destroys your primary site, including the production servers and the primary storage attached to them. The secondary storage that holds local backups (NAS, for example) is also located onsite and is destroyed too, so both onsite copies of your data are lost. But, according to the 3-2-1 rule, you also have a backup stored remotely in a region that is not affected by the typhoon. This third copy, the offsite backup, allows you to recover your data. This is how the 3-2-1 backup rule protects your data.
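The rule is simple enough to check mechanically against an inventory of your copies. The sketch below is only an illustration: the `Copy` record and the `check_3_2_1` function are hypothetical names, and the media labels are free-form strings that you choose yourself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    media: str     # free-form label, e.g. "server-disk", "nas", "cloud", "tape"
    offsite: bool  # True if the copy is stored away from the primary site


def check_3_2_1(copies) -> list:
    """Return the list of 3-2-1 violations; an empty list means the rule is met."""
    problems = []
    if len(copies) < 3:
        problems.append(f"need at least 3 copies, found {len(copies)}")
    if len({c.media for c in copies}) < 2:
        problems.append("need at least 2 different media types")
    if not any(c.offsite for c in copies):
        problems.append("need at least 1 offsite copy")
    return problems


# The typhoon scenario: production data, a local NAS backup and a remote copy.
inventory = [
    Copy("production", media="server-disk", offsite=False),
    Copy("local backup", media="nas", offsite=False),
    Copy("remote backup", media="cloud", offsite=True),
]
violations = check_3_2_1(inventory)  # empty list: the rule is satisfied
```

Removing the remote copy from the inventory makes the function report both the missing third copy and the missing offsite copy.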
Features of Protecting NAS Devices
These are some important features that you should be aware of before you perform NAS backup:
Using the NDMP Protocol
NDMP (Network Data Management Protocol) is intended for copying data over the network from source storage to destination storage with a lighter network load than legacy copy methods. Management commands and data movement are separated: management sessions are TCP/IP based, while data transfer sessions can be TCP/IP or SAN based. Many NAS devices support NDMP because the protocol is standardized and supports multivendor software integration. The NDMP architecture includes the following main components: the NDMP agent (client), NDMP services, and NDMP sessions.
NDMP is an open protocol that lets backup applications, backup devices and servers communicate with each other. The idea is the following: imagine that you have a server with an application used for copying data from one NAS to another, and the backup process is managed on this server. Without NDMP, data is generally copied from the source NAS to the destination NAS via the management server, which causes a high network load and a low data transfer speed. NDMP is meant to resolve this issue by transferring data from the source NAS to the destination NAS directly over the network, bypassing the management server. The server with the NDMP agent only manages the backup process and is not a transit node for the data. NDMP-capable NAS devices must be used in this case.
While in most cases you cannot install backup software directly on NAS (the NAS manufacturer installs a proprietary operating system on NAS devices), you can install a backup application that supports NDMP on a third-party server and back up data from one NDMP-compatible NAS device to another NDMP-compatible storage device. This approach can also be used for backing up data from NAS to tape.
Block-Level Replication of NAS Devices
One more method of protecting data stored on NAS is array-based replication of NAS devices. Many storage arrays offer replication options at no additional charge, and NAS vendors provide special software for replicating the disk arrays of NAS devices. With array-based replication, data is usually replicated at the block level from the source NAS located at the production (primary) site to a second NAS located at the disaster recovery (DR) site. Identical or near-identical NAS devices must be used at both sites.
The advantages of array-based replication are the following: Data is replicated from one NAS to another, bypassing the servers that use NAS devices as a datastore, thereby offloading the replication work from servers to storage devices. Servers (hosts) are not required at the DR site because data can be transferred directly from one NAS to another over the network.
This method also has disadvantages. When files are open in applications and data is being written to them, new blocks are written or existing blocks are overwritten. Software that performs array-based replication at the block level copies only the blocks changed since the previous replication job (for asynchronous replication) and is not application-aware. If some blocks that belong to a file are copied at the beginning of the replication process and other blocks of the same file are changed before replication finishes, the file at the destination ends up broken. The replication software is not aware that the blocks belonging to that file were changed after being partially copied. Hence, you cannot be sure that your data will be consistent after recovery. This issue is critical for running databases, running virtual machines and other open files.
You need an identical NAS device at the remote site even if you don’t need the performance that device provides. As a result, you pay more for hardware when deploying an array-based replication solution, and costs per device may be high. Replication allows you to restore workloads faster, but the data cannot be compressed, which also increases costs.
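The consistency problem can be illustrated with a toy model in which a "disk" is a byte string split into fixed-size blocks. This is a deliberately simplified sketch under stated assumptions: real arrays track changed blocks internally rather than comparing data, the block size of 4 bytes is chosen only for readability, and both function names are hypothetical.

```python
BLOCK_SIZE = 4  # tiny block size so the example is easy to follow


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list:
    """Split a byte string into consecutive blocks of the given size."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def changed_blocks(old: bytes, new: bytes, size: int = BLOCK_SIZE) -> list:
    """Indexes of blocks that differ between two versions of the data."""
    old_b, new_b = split_blocks(old, size), split_blocks(new, size)
    changed = []
    for i in range(max(len(old_b), len(new_b))):
        a = old_b[i] if i < len(old_b) else None
        b = new_b[i] if i < len(new_b) else None
        if a != b:
            changed.append(i)
    return changed


def replicate_with_midway_write(v1: bytes, v2: bytes, size: int = BLOCK_SIZE) -> bytes:
    """Simulate a replica torn by a write that lands halfway through the copy.

    The first half of the blocks is copied while the file still holds v1;
    the application then rewrites the file to v2 (same length assumed),
    and the remaining blocks are copied from v2.
    """
    b1, b2 = split_blocks(v1, size), split_blocks(v2, size)
    half = len(b1) // 2
    return b"".join(b1[:half] + b2[half:])


v1 = b"AAAABBBBCCCCDDDD"  # file content before the application writes
v2 = b"aaaaBBBBCCCCdddd"  # file content after the application writes

delta = changed_blocks(v1, v2)             # blocks 0 and 3 differ
torn = replicate_with_midway_write(v1, v2)
is_consistent = torn in (v1, v2)           # False: replica matches neither version
```

The replica mixes block 0 from the old version with block 3 from the new one, so it corresponds to no state the application ever wrote. Application-aware backups avoid this by quiescing or snapshotting the data before copying it.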
Today’s blog post has covered NAS backup and general NAS backup methods, and has explained how to back up NAS devices used for storing virtual machines, using a VMware vSphere virtual environment as an example. The best recommendation is to use special backup solutions optimized for backing up VM data stored on NAS. NAKIVO Backup & Replication can be installed on NAS, can back up your VMs stored on NAS, and can create copies of VM backups located on NAS.