Azure Storage Account Customer Failover

Last year Microsoft announced that failover for GRS, RA-GRS, GZRS, and RA-GZRS storage accounts can be controlled by the customer instead of depending on Microsoft. As of June 2020, this feature is generally available across all regions. Before this update, storage account failover was controlled by Microsoft.

When a failover is initiated on a storage account, the secondary replica becomes the new primary, and the DNS records for all Storage service endpoints (blob, file, queue, and table) are updated to point to it. Once the failover is complete, clients automatically begin reading from and writing to the storage account in the new primary region, with no code changes.

How does it work?

Microsoft initiates a regional failover when a region is lost in a significant disaster. With customer-managed failover, you can fail over the entire storage account to the secondary region yourself. The secondary region is determined by Azure paired regions; each pair consists of two regions within the same geography. Customers cannot choose a different secondary region. For the complete list of Azure paired regions, see the Microsoft article.

When data is written to an Azure storage account, it is written to the primary region first and then copied asynchronously to the secondary region. If the primary endpoint is unavailable, clients can no longer write to the storage account. When we initiate a failover, the process updates the DNS entries to point to the secondary endpoint, which becomes the new primary endpoint.

Please refer to the Microsoft diagram below for the failover process:

Unsupported Features

  • Storage accounts used by Azure File Sync do not support failover; failing over will cause sync to stop working and can eventually lead to data loss.
  • Premium storage accounts and accounts with ADLS Gen2 enabled are not supported at this time.
  • A storage account that contains any container with a WORM immutability policy enabled cannot be failed over.
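The restrictions above can be turned into a pre-flight checklist. The following is a minimal sketch, assuming you fill in the account details by hand; `AccountInfo` and `failover_blockers` are hypothetical names used for illustration and are not part of any Azure SDK.

```python
from dataclasses import dataclass
from typing import List

# Redundancy options this article lists as supporting customer-managed failover.
GEO_REDUNDANT_SKUS = {"GRS", "RA-GRS", "GZRS", "RA-GZRS"}


@dataclass
class AccountInfo:
    """Hypothetical, hand-filled description of a storage account (not an SDK type)."""
    redundancy: str
    is_premium: bool = False
    adls_gen2_enabled: bool = False
    used_by_file_sync: bool = False
    has_worm_containers: bool = False


def failover_blockers(info: AccountInfo) -> List[str]:
    """Return the reasons (if any) why the account should not be failed over."""
    blockers = []
    if info.redundancy.upper() not in GEO_REDUNDANT_SKUS:
        blockers.append("account is not geo-redundant")
    if info.is_premium:
        blockers.append("premium accounts are not supported")
    if info.adls_gen2_enabled:
        blockers.append("ADLS Gen2 accounts are not supported")
    if info.used_by_file_sync:
        blockers.append("Azure File Sync would stop working")
    if info.has_worm_containers:
        blockers.append("WORM immutability policy is enabled on a container")
    return blockers


if __name__ == "__main__":
    candidate = AccountInfo(redundancy="RA-GRS", used_by_file_sync=True)
    for reason in failover_blockers(candidate):
        print("Blocked:", reason)
```

An empty list means none of the restrictions in this section apply; it does not guarantee that Azure will accept the failover request.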

Steps to Failover:

Create a storage account with the replication type set to RA-GRS.

Upload a few files to a container in the newly created storage account. After uploading the files, wait until replication between the primary and secondary regions has completed. The status can be viewed in the storage account properties, as shown in the screenshot below.

Before failover, verify the endpoints available for the storage account.
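As a rough illustration of what to expect when checking the endpoints, the sketch below builds the endpoint URLs for an account. It assumes the public Azure cloud suffix `core.windows.net` and the `-secondary` host suffix used for the read-access secondary; the account name `mystorageacct` is a placeholder. The secondary list omits the file service on the assumption that Azure Files has no readable secondary endpoint, which is worth verifying for your own account.

```python
from typing import Dict

PRIMARY_SERVICES = ("blob", "file", "queue", "table")
# Assumption: only these services expose a readable secondary endpoint.
SECONDARY_SERVICES = ("blob", "queue", "table")


def service_endpoints(account_name: str, secondary: bool = False) -> Dict[str, str]:
    """Build the expected endpoint URL for each storage service."""
    host = f"{account_name}-secondary" if secondary else account_name
    services = SECONDARY_SERVICES if secondary else PRIMARY_SERVICES
    return {svc: f"https://{host}.{svc}.core.windows.net" for svc in services}


if __name__ == "__main__":
    for label, flag in (("primary", False), ("secondary", True)):
        for svc, url in service_endpoints("mystorageacct", secondary=flag).items():
            print(f"{label:9} {svc:5} {url}")
```

After a failover the account is locally redundant, so only the primary-style endpoints remain until geo-redundancy is configured again.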

Once the storage account is ready for failover, open the Geo-replication settings of the storage account and click Prepare for failover. A confirmation screen appears. Type Yes and click Failover.

Azure will prepare for the failover and update DNS to point to the secondary region. After a few minutes, the storage account will be available in the secondary region and will be locally redundant.
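The same steps can be scripted with the Azure CLI instead of the portal. The sketch below only builds the command lines as argument lists and prints them; nothing is executed. It assumes the `az storage account show` and `az storage account failover` commands with the options shown, which should be checked against your installed CLI version. The account and resource group names are placeholders.

```python
from typing import List


def last_sync_time_command(account: str, resource_group: str) -> List[str]:
    """Command that reads the last sync time from the geo-replication stats."""
    return [
        "az", "storage", "account", "show",
        "--name", account,
        "--resource-group", resource_group,
        "--expand", "geoReplicationStats",
        "--query", "geoReplicationStats.lastSyncTime",
        "--output", "tsv",
    ]


def failover_command(account: str, resource_group: str, skip_prompt: bool = False) -> List[str]:
    """Command that initiates the customer-managed failover."""
    cmd = [
        "az", "storage", "account", "failover",
        "--name", account,
        "--resource-group", resource_group,
    ]
    if skip_prompt:
        # Assumption: --yes suppresses the interactive confirmation prompt.
        cmd.append("--yes")
    return cmd


if __name__ == "__main__":
    # Placeholder names; replace with your own before running anything.
    for cmd in (
        last_sync_time_command("mystorageacct", "my-resource-group"),
        failover_command("mystorageacct", "my-resource-group"),
    ):
        print(" ".join(cmd))
```

Once reviewed, each list can be passed to `subprocess.run(cmd, check=True)` on a machine where the Azure CLI is installed and logged in.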

After failover, the account's endpoints are available in only one region (the former secondary region).

Things to note:

  • The failover can be initiated from the Azure portal, PowerShell, Azure CLI, or the API.
  • The potential data loss can be estimated from the Last Sync Time property. It is shown when the failover is initiated and can also be viewed with PowerShell or CLI commands.
  • After failover, the storage account is converted to locally redundant storage (LRS). Customers can convert it back to GRS, after which data will begin replicating again.
  • After a failover of a storage account containing archived blobs, the blobs need to be rehydrated to an online tier before the account can be configured for geo-redundancy.
  • Storage accounts that contain VM unmanaged disks support failover. Because the disks are leased to the virtual machines, stop the virtual machines, delete them without deleting the disks, and then initiate the failover.
  • A storage account configured with read access to the secondary region can be used to copy data to an alternate location by using AzCopy or PowerShell.
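To make the Last Sync Time note concrete: data written before the last sync time has been replicated, while anything written after it may be lost in a failover. The sketch below estimates that window. It assumes the timestamp is an ISO 8601 string in UTC without fractional seconds; the function names and the sample timestamps are illustrative only.

```python
from datetime import datetime, timedelta
from typing import Iterable, List


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' is treated as UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def data_loss_window(last_sync: datetime, failover_at: datetime) -> timedelta:
    """Span of time whose writes may not have reached the secondary region."""
    return max(failover_at - last_sync, timedelta(0))


def writes_at_risk(write_times: Iterable[datetime], last_sync: datetime) -> List[datetime]:
    """Writes made after the last sync time, which may be lost on failover."""
    return [t for t in write_times if t > last_sync]


if __name__ == "__main__":
    last_sync = parse_timestamp("2020-06-01T10:00:00Z")
    failover_at = parse_timestamp("2020-06-01T10:12:30Z")
    writes = [
        parse_timestamp("2020-06-01T09:58:00Z"),
        parse_timestamp("2020-06-01T10:05:00Z"),
    ]
    print("Potential data loss window:", data_loss_window(last_sync, failover_at))
    print("Writes at risk:", len(writes_at_risk(writes, last_sync)))
```

Checking this window right before initiating the failover shows how much recent data would need to be written again to the new primary.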