July 31, 2020
How Amazon Storage Gateway Works: Complete Walkthrough
Amazon Web Services (AWS) is used by many users and organizations given its scalability, reliability, and other advantages. When migrating data to the AWS cloud, you should take into account certain features. By default, Amazon provides a web interface for managing the cloud environment and uploading or downloading files. However, using the web interface to regularly upload large amounts of data can be inconvenient. If you use Amazon S3 as the cloud storage for your data, you can mount an S3 bucket as a network disk in your operating system.
Mounting a bucket as a disk can significantly simplify copying files to the Amazon cloud. This solution is suitable for small companies and individual users, but if you are going to use Amazon cloud storage in large enterprise environments, you need a more scalable solution. Another consideration is that rebuilding an entire infrastructure is not easy if most of your processes depend on physical servers in your company's data centers. Fortunately, Amazon provides a special tool that allows you to use your traditional physical infrastructure to copy data to and from the Amazon cloud. This tool is called AWS Storage Gateway, and this blog post explains how to configure and use it.
What Is AWS Storage Gateway?
AWS Storage Gateway is a special solution that acts as a bridge between your traditional physical or virtual machines and cloud storage in AWS. It provides seamless integration between on-premises and cloud environments. AWS Storage Gateway provides access to virtually unlimited Amazon S3 storage, Amazon S3 Glacier, Amazon S3 Glacier Deep Archive, and Amazon EBS (Elastic Block Store). This concept of integrating on-premises storage and cloud storage is also called hybrid storage. An internet connection is required for Amazon Storage Gateway because a connection to AWS servers must be established.
Types of Storage Gateway
There are three types of AWS Storage Gateway: File Gateway, Volume Gateway and Tape Gateway.
File Gateway. This storage gateway type provides access to files that are stored as objects in an Amazon S3 bucket by using SMB shares (protocol versions 2 and 3) and NFS shares (protocol versions 3 and 4.1). An SMB (Server Message Block) or NFS (Network File System) mount point must be configured in your operating system to access files/objects in an S3 bucket.
The File Gateway supports the following Amazon S3 storage classes: S3 Standard, S3 Standard-Infrequent Access (IA), and S3 One Zone-IA. Versioning is supported – you can edit, delete, and rename files by accessing them via the NFS or SMB protocols and each file modification is stored as a new version in an S3 bucket. The main advantage of using versioning for a file (object) share is extended recovery capabilities. In addition to versioning, you can enable lifecycle management and cross-region replication for objects stored in Amazon S3.
You can deploy one Storage Gateway VM on server 1 of data center 1 and one Storage Gateway VM on server 2 of data center 2. If both gateways are connected to the same bucket and both servers are connected to the appropriate storage gateway, then you can upload a file to the S3 bucket from server 1 and see that file on server 2 by using the NFS or SMB share. This is possible thanks to the RefreshCache API call that initiates a re-inventory on File Gateway 2.
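The refresh described above can also be requested from the command line. The following is a minimal sketch, assuming the AWS CLI is installed and configured with credentials that are allowed to call RefreshCache. The function name and the ARN in the example call are placeholders; the real file share ARN is shown in the details of the file share on the second gateway.

```shell
# Ask a file gateway to re-inventory the objects behind one of its file shares.
# $1: ARN of the file share (placeholder here; copy the real one from the AWS console).
refresh_share_cache() {
    share_arn="$1"
    # --folder-list "/" together with --recursive refreshes the whole share.
    aws storagegateway refresh-cache \
        --file-share-arn "$share_arn" \
        --folder-list "/" \
        --recursive
}

# Example call (hypothetical ARN):
# refresh_share_cache "arn:aws:storagegateway:us-east-1:111122223333:share/share-EXAMPLE"
```

The refresh runs asynchronously, so a file uploaded through the first gateway may take a moment to become visible on the second one.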
Volume Gateway allows your servers and applications running on-premises to connect to cloud-backed block storage volumes by using the iSCSI protocol (Internet Small Computer Systems Interface). Volume data is stored in Amazon S3, and point-in-time backups of volumes are saved as Amazon EBS snapshots. While SMB and NFS used by a file gateway are file-level sharing protocols, iSCSI works at the block level. There are two types of Volume Gateway: Stored Volumes and Cached Volumes.
Stored Volumes. Your local storage, such as a hard disk drive on a physical server or a virtual disk of a virtual machine, is used as the main data storage, so you can access data with low latency. Data is backed up asynchronously to Amazon S3 as EBS snapshots. The size of a stored volume can be between 1 GiB and 16 TiB. A stored volume is mounted as an iSCSI device.
Cached Volumes (Cached Gateway). The primary copy of your data is stored in Amazon S3, and frequently accessed data is retained in a local cache on the gateway for low-latency access. This approach reduces the amount of on-premises storage you need because only the cache and an upload buffer are kept locally. The maximum size of a volume is 32 TiB. A volume is mounted as an iSCSI device.
When using a Volume Gateway for block storage, volumes can be attached to or detached from the Volume Gateway. This feature allows you to migrate volumes between gateways for upgrading storage hardware on local servers (on-premises), for example.
Tape Gateway is used to back up data for long term archival to Amazon Glacier and store that data on virtual tapes. In fact, data is stored in Amazon S3 Glacier or Amazon S3 Glacier Deep Archive. In this case, the physical interface used to write data on tapes by connecting tape drives and tape libraries is replaced with a compatible Tape Gateway Library interface that allows you to store data in the Amazon cloud. The iSCSI protocol is used to connect existing backup devices to the Tape Gateway. Existing backup configuration and workflow can be preserved. You can save data to the cloud directly via the Tape Gateway or by using specialized data backup applications.
Tape gateways can be used to back up data without making significant changes to an existing backup configuration or as an alternative to physical tape drives and libraries (which are not cost-effective).
Supported Host Platforms
AWS Storage Gateway is provided as a virtual appliance (a virtual machine image/template) that can be deployed on different platforms.
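Supported host virtualization platforms include VMware ESXi, Microsoft Hyper-V, and Linux KVM. The gateway can also run as an Amazon EC2 instance.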
However, there is also a hardware appliance that can be used if an organization doesn’t have any hypervisors in its infrastructure. You can purchase the hardware Amazon Storage Gateway appliance on the Amazon website and it will be delivered to you.
Amazon charges for its cloud services on a pay-for-what-you-use basis, and Amazon Storage Gateway is no exception. Charges depend on the type of storage (Amazon S3 or EBS) and the AWS Region. If you store data in Amazon S3, the price depends on the S3 storage class and the number of requests, and storage is billed per GB per month. If you store data in EBS volumes, snapshots are billed if they are taken. As for Amazon Storage Gateway itself, you are charged for gateway usage (per gateway per month). You can check the current prices on the Amazon website at any time.
Advantages of AWS Storage Gateway
The main advantages of using Amazon Storage Gateway are:
- Integration of hardware and software configurations with no hardware changes.
- The ability to use on-premises storage and cloud storage in Amazon (the hybrid storage concept).
- Smooth migration from physical infrastructure to the AWS cloud.
How to Deploy AWS Storage Gateway?
Let’s find out how to deploy AWS Storage Gateway to access files stored as objects in Amazon S3 by connecting to a file gateway via NFS. You should have an AWS account and an ESXi host to run the Storage Gateway VM.
Downloading the image
Open the web interface of the AWS console.
Click Services and select Storage Gateway in the Storage category.
On the Storage Gateway page, click the Create Gateway button in the Gateways section of the navigation pane. On this page you can find previously created storage gateways if they are present.
The Create gateway wizard is opened.
Select gateway type. Select File gateway to store files as objects in Amazon S3. Click Next at each step of the wizard to continue.
Select host platform. There are five supported host platforms to deploy. Select VMware ESXi and click the Download image button.
Save the file. In our case the name of the downloaded file is aws-storage-gateway-latest.ova and we save this file to D:\virtual\ on a local machine. Don’t close the current browser tab with the Create gateway wizard displayed in the web interface of AWS because you will need to continue configuring the storage gateway from this step later.
Deploying the virtual appliance on an ESXi host
Now you have to deploy the downloaded aws-storage-gateway-latest.ova template file on an ESXi host. In our example ESXi hosts are managed by vCenter and we will use the web interface of VMware vSphere Client to deploy the AWS Storage Gateway virtual appliance.
Connect to your vCenter, go to Hosts and Clusters and select the needed ESXi host that has enough free resources.
Requirements: The File Gateway requires 16 GB of RAM, 4 virtual processors (vCPUs), one 80-GB virtual disk and one additional 150-GB virtual disk for storage cache.
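Before deploying, it can help to compare the resources you plan to give the VM against these minimums. The snippet below is a small sketch of such a check. The function and variable names are made up for illustration, and the thresholds are simply the RAM, vCPU, and cache disk figures listed above.

```shell
# Minimums for a File Gateway VM, taken from the requirements listed above.
MIN_RAM_GB=16
MIN_VCPUS=4
MIN_CACHE_GB=150

# Usage: check_gateway_sizing RAM_GB VCPUS CACHE_GB
# Prints each shortfall and returns non-zero if any minimum is missed.
check_gateway_sizing() {
    ram_gb="$1"; vcpus="$2"; cache_gb="$3"; rc=0
    if [ "$ram_gb" -lt "$MIN_RAM_GB" ]; then
        echo "RAM: ${ram_gb} GB is below the ${MIN_RAM_GB} GB minimum"; rc=1
    fi
    if [ "$vcpus" -lt "$MIN_VCPUS" ]; then
        echo "vCPUs: ${vcpus} is below the minimum of ${MIN_VCPUS}"; rc=1
    fi
    if [ "$cache_gb" -lt "$MIN_CACHE_GB" ]; then
        echo "Cache disk: ${cache_gb} GB is below the ${MIN_CACHE_GB} GB minimum"; rc=1
    fi
    return "$rc"
}
```

For example, `check_gateway_sizing 16 4 150` returns success without printing anything, while a plan with less RAM reports the shortfall.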
In our example, we select the host with IP address 10.10.10.90. After selecting the ESXi host in vSphere Client, click Actions > Deploy OVF Template.
The Deploy OVF Template wizard opens.
1. Select an OVF template. Select Local file and click Browse to select the downloaded ova file. In this case we select aws-storage-gateway-latest.ova in D:\virtual\. Hit Next at each step of the wizard to continue.
2. Select a name and folder. Enter a virtual machine name, for example, aws-storage-gateway and select a location for the VM in vCenter.
3. Select a compute resource. Select an ESXi host where the Storage Gateway VM will run. We select 10.10.10.90 in this example.
4. Review details. Verify and review the template details of the AWS Storage Gateway virtual appliance you are about to deploy.
5. Select storage. Select the data store with enough free space to store virtual disk files and other VM files. Select the virtual disk format. It is recommended to select Thick Provisioned as a virtual disk format because all storage space needed for a virtual disk to function is allocated immediately. Read more about thick and thin provisioning in this blog post.
6. Select networks. Select a vSwitch that is connected to a router and provides an internet connection. A virtual network adapter of the virtual machine will be connected to this vSwitch and the appropriate network after deployment.
7. Ready to complete. Review the configuration of the VM that will be deployed from the template and hit Finish to start VM creation. Read the blog post about VM templates to learn more.
Wait until the Storage Gateway VM is deployed from the template. You can see the job status in the Recent Tasks toolbar in vSphere Client.
Once the VM is deployed, you can see the VM name you have defined before in the list of VMs of the appropriate ESXi host (10.10.10.90 in our case).
Right click the VM (aws-storage-gateway is the name of the Storage Gateway VM deployed from the template in this example) and in the context menu hit Edit Settings.
Now you have to add a new virtual hard disk for cache. This virtual disk is used to store recently accessed files and files that are accessed frequently to reduce latency when accessing that data.
In the Virtual Hardware tab of the Edit Settings window click Add new device and select Hard disk.
The recommended minimum size for a virtual hard disk used to store cache by AWS Storage Gateway is 150 GB. You should create a Thick Provisioned virtual disk for cache. In the New Hard disk field, type 150 GB. In the Disk Provisioning field, select one of the Thick Provisioning options. Hit OK to save settings and create a virtual disk.
Make sure that time is set correctly on the Storage Gateway VM, ESXi hosts, and vCenter servers. Time on the VM must be synchronized to avoid issues and for successful gateway activation.
Select your aws-storage-gateway VM in the list of VMs, click Edit Settings. On the Edit Settings screen select the VM Options tab, click VMware Tools to expand settings, and select the “Synchronize guest time with host” checkbox. Hit OK to save settings.
Testing network connectivity
It is recommended to test the network connection between the locally running Storage Gateway VM and AWS cloud storage.
Power on the Storage Gateway VM.
Log into the AWS Appliance VM by using the default credentials.
You can check the IP address. If there is a DHCP server in your network, the IP configuration is obtained automatically. It is recommended to set a static IP address for long-term usage of the Amazon Storage Gateway.
If you want to set the static IP address, press 2:
2: Network Configuration
Then enter 3:
3: Configure Static IP
Follow the recommendations to set the static IP address.
In our example, the IP address of the Storage Gateway virtual appliance is 192.168.17.122 and the netmask is 255.255.255.0.
After configuring the network settings, you should test the network connectivity.
In the main menu select 3:
3: Test Network Connectivity
Then select 1:
Select endpoint type:
The test passes in our case, as you can see in the screenshot below.
Creating a bucket
Before you can continue, ensure that a bucket has been created in Amazon S3 for your account. You can use this link to create a bucket. The name of the bucket used in this walkthrough is blog-bucket01. You must have enough permissions for your AWS account. AWS access keys should be generated if you are going to use other applications to access the bucket. You can get the AWS access key ID and a secret access key for your account on this AWS page.
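If you prefer the command line to the web interface, the bucket can also be prepared with the AWS CLI. This is a minimal sketch, assuming the AWS CLI is installed and configured for your account. The function name and the Region in the example call are assumptions, and because bucket names are globally unique you may need a name other than blog-bucket01.

```shell
# Create the bucket only if it is not already accessible from this account.
# $1: bucket name, $2: AWS Region (both supplied by the caller).
ensure_bucket() {
    bucket="$1"; region="$2"
    # head-bucket returns non-zero if the bucket does not exist or is not accessible.
    if aws s3api head-bucket --bucket "$bucket" 2>/dev/null; then
        echo "Bucket $bucket already exists and is accessible"
    else
        aws s3 mb "s3://$bucket" --region "$region"
    fi
}

# Example call (the Region is an assumption):
# ensure_bucket blog-bucket01 us-east-1
```

If the name is already taken by another account, the create step fails and you have to choose a different bucket name.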
File gateway activation
Now you have to define the IP address of Amazon Storage Gateway and activate the File Gateway.
Go back to the AWS console web interface. As you recall we stopped at the second step of the Create gateway wizard. If you have closed that page, open the AWS console, go to Services > Storage Gateway, click Create Gateway. On the first step of the wizard (Select gateway type) select File gateway. These steps are explained above in the beginning of the walkthrough and are complemented with screenshots.
Select host platform. Select VMware ESXi and click Next.
Select service endpoint. Select Public as the endpoint type, then hit Next. Your web browser must be allowed to reach TCP port 80 on the Storage Gateway VM.
Check the IP address of the Gateway VM (VA). You can find the IP address of the VM in the interface of VMware vSphere Client by selecting the needed virtual machine. In this example, the internal IP address of the AWS Storage Gateway VM is 192.168.17.122.
Enter the IP address of the VM (the Storage Gateway virtual appliance), not the external (WAN) IP of your router.
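Because the browser has to reach the gateway VM on TCP port 80, it can be worth checking that the port is reachable before you continue. The function below is a sketch of such a check, assuming bash and the coreutils timeout command are available on the machine where the browser runs. The function name is made up for illustration.

```shell
# Return 0 if a TCP connection to HOST:PORT succeeds within WAIT seconds.
# Usage: port_open HOST PORT [WAIT]
port_open() {
    host="$1"; port="$2"; wait_s="${3:-5}"
    # bash's /dev/tcp pseudo-device opens a TCP connection;
    # timeout bounds the attempt so a filtered port does not hang the check.
    timeout "$wait_s" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Example call with the gateway address used in this walkthrough:
# port_open 192.168.17.122 80 && echo "port 80 is reachable"
```

A non-zero return value usually points to a firewall rule or a routing problem between your machine and the gateway VM.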
Click Connect to gateway.
Activate gateway. Activation of the gateway securely associates your gateway with your AWS account.
Select the gateway time zone.
Enter the gateway name, for example Storage Gateway AWS. The name can be different from the name of the VM and the DNS name of the VM (appliance). Remember that TCP port 80 must be open on the gateway VM.
Click Activate gateway and wait until the cache disks are identified.
Configure local disks. Ensure that your 150-GB virtual disk is allocated to cache. Then hit Configure logging.
Configure logging. Logging provides additional capabilities for troubleshooting and auditing. Select Create a new log group, then click Verify VMware HA to go to the next step.
Verify VMware HA. Run the verification only if your Storage Gateway VM is running in a VMware High Availability cluster. We hit Exit because the Amazon Storage Gateway virtual appliance is not deployed in a VMware HA cluster in our example.
Now the File Gateway has been successfully created and it is running.
Creating a file share
It’s time to create a file share in order to connect to a bucket by using standard NFS or SMB (CIFS) protocols. Let’s configure the connection to an Amazon S3 bucket via NFS.
Select your File Gateway and click Create file share (see the screenshot above).
Enter the S3 bucket name. The name of the bucket used in this example is blog-bucket01.
Access objects using: Network File System (NFS).
Gateway: Select the deployed S3 Storage Gateway in the drop-down list.
Adding tags is optional and can be skipped. Hit Next to continue.
Storage. Configure how files will be stored in Amazon S3.
Amazon S3 bucket name: blog-bucket01
Storage class for new objects: S3 Standard
Object metadata: Guess MIME type and Give bucket owner full control checkboxes must be selected.
Access to your S3 bucket. Create a new IAM role.
Encryption: S3 managed keys (SSE-S3).
Review. You can leave the default values for the IAM role and other values except the following values.
Allowed clients: 0.0.0.0/0 – by default access from any IP address is allowed. It is recommended to define custom allowed IP addresses for security reasons.
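Squash level. Click Edit in the Mount options and select All squash. This option maps every client user and group to the anonymous ID 65534 (nobody), which avoids ownership conflicts when files are written from different local accounts.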
Click Create file share.
If the Create file share button is not active, click Previous and then click Next again.
The NFS file share is created on your file gateway. At the bottom of the File shares page you can see examples of commands that can be used to mount your file share on Linux, Windows, and macOS.
Connecting to the file share
Let’s create a directory that will be used as the mount point on a Linux machine (Ubuntu 18.04) and set the needed permissions. In this example, the name of our Linux user account is user1.
sudo mkdir -p /mnt/s3-gateway
Set the owner and permissions for the created directory:
sudo chown user1:user1 /mnt/s3-gateway
sudo chmod 0775 /mnt/s3-gateway
Mount the NFS share provided by the AWS Storage Gateway:
sudo mount -t nfs -o nolock,hard 192.168.17.122:/blog-bucket01 /mnt/s3-gateway
An error can occur:
mount: /mnt/s3-gateway: bad option; for several filesystems (e.g. nfs, cifs) you might need a /sbin/mount.<type> helper program
In this case try to install the nfs-common package:
sudo apt install nfs-common
Then run the command:
sudo mount -t nfs -o nolock,hard 192.168.17.122:/blog-bucket01 /mnt/s3-gateway
You can check whether the S3 bucket (blog-bucket01) is mounted as the NFS share provided by AWS Storage Gateway to /mnt/s3-gateway/ with the commands:
mount | grep gateway
ls -al /mnt/s3-gateway/
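The same check can be scripted, which is handy in backup jobs that should not start copying data when the share is missing. This is a minimal sketch that reads the kernel mount table. The function name is made up for illustration, and the optional second argument exists only so the check can be tried against a sample file.

```shell
# Return 0 if MOUNT_POINT is listed as an NFS mount in the mounts table.
# Usage: is_nfs_mounted MOUNT_POINT [MOUNTS_FILE]   (MOUNTS_FILE defaults to /proc/mounts)
is_nfs_mounted() {
    mount_point="${1%/}"              # drop a trailing slash, if any
    mounts_file="${2:-/proc/mounts}"
    # Fields in the mounts table: device, mount point, filesystem type, options, ...
    # The type can be nfs or nfs4 depending on the negotiated protocol version.
    awk -v mp="$mount_point" \
        '$2 == mp && $3 ~ /^nfs/ { found = 1 } END { exit !found }' \
        "$mounts_file"
}

# Example call:
# is_nfs_mounted /mnt/s3-gateway && echo "share is mounted"
```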
As you can see in the screenshot below, the bucket is mounted successfully.
The same content of the bucket is displayed in the web interface of AWS. To make sure that everything is working correctly, copy a file to the mounted directory in the Linux console and then check the contents of the bucket in the web interface of AWS.
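This round trip can also be checked without the web interface. The sketch below copies a file to the mount point and then lists the bucket with the AWS CLI. It assumes the AWS CLI is installed and configured on the Linux machine, and the function name is made up for illustration.

```shell
# Copy a file to the mounted share, then list the bucket to see whether the object arrived.
# Usage: copy_and_list FILE MOUNT_POINT BUCKET
copy_and_list() {
    src_file="$1"; mount_point="$2"; bucket="$3"
    cp "$src_file" "$mount_point/" || return 1
    # The gateway uploads data asynchronously, so the object may take a moment to appear.
    aws s3 ls "s3://$bucket/"
}

# Example call with the names used in this walkthrough (the file name is hypothetical):
# copy_and_list ./test.txt /mnt/s3-gateway blog-bucket01
```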
You can configure auto mount on Linux boot by editing /etc/fstab.
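As an illustration, the following sketch builds an fstab line that mirrors the options of the mount command used earlier and appends it only once. The function names are made up for this example, and the _netdev option (which delays mounting until the network is up) is an addition that the mount command above does not use. Writing to /etc/fstab requires root privileges, so consider trying the functions on a copy of the file first.

```shell
# Print an /etc/fstab line for an NFS share exported by the gateway.
# Usage: fstab_entry GATEWAY_IP BUCKET MOUNT_POINT
fstab_entry() {
    # nolock,hard mirror the mount command; _netdev waits for the network at boot.
    printf '%s:/%s %s nfs nolock,hard,_netdev 0 0\n' "$1" "$2" "$3"
}

# Append the entry unless the mount point is already listed in the file.
# Usage: add_fstab_entry FSTAB_FILE GATEWAY_IP BUCKET MOUNT_POINT
add_fstab_entry() {
    fstab_file="$1"; shift
    if awk -v mp="$3" '$2 == mp { found = 1 } END { exit !found }' "$fstab_file"; then
        echo "$3 is already listed in $fstab_file"
    else
        fstab_entry "$1" "$2" "$3" >> "$fstab_file"
    fi
}

# Example call (run as root):
# add_fstab_entry /etc/fstab 192.168.17.122 blog-bucket01 /mnt/s3-gateway
```

After adding the entry, running sudo mount -a is a quick way to confirm that the line is valid before the next reboot.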
Similarly, you can configure an SMB (CIFS) share on your AWS Storage Gateway and mount that share in different operating systems. If you select the SMB option for a file gateway, it is possible to add your AWS Storage Gateway to an Active Directory domain.
Amazon Storage Gateway is a hybrid cloud solution that allows you to use your current physical and virtual infrastructure with Amazon cloud storage without significant changes to your current hardware and software configuration. Standard storage protocols are used: SMB and NFS provide file-level access to files stored as objects in Amazon S3, and iSCSI provides access to cloud-backed block storage volumes. You can connect to virtual tape libraries in the Amazon cloud via iSCSI instead of using physical tape libraries. This blog post has covered the working principle of Amazon Storage Gateway and explained how you can deploy the File Gateway on VMware ESXi and connect to an Amazon S3 bucket through the File Gateway via NFS from Ubuntu Linux.
Amazon Storage Gateway can be used to copy your data backups to AWS manually or with special backup solutions that can work with NFS, SMB or iSCSI protocols. NAKIVO Backup & Replication is a universal data protection solution that allows you to back up data to Amazon S3 and Amazon EBS. NAKIVO Backup & Replication can back up data to Amazon S3 directly without using AWS Storage Gateway. Download the free trial and perform AWS EC2 backup and backup to Amazon S3 in your organization.