Active Geo Replication is a new feature, and not many of you are aware of its internals. It’s not hard to configure it in the Azure portal or to initiate a failover; however, there are quite a few things you must be aware of before using it for your most critical databases. Lacking this knowledge might result in data loss. So let’s take a deep dive into Active Geo Replication.
Active Geo Replication deepdive
As soon as you create a secondary database using Active Geo Replication, it is seeded (initialized) with the current state of the primary database. Once the secondary is in sync with the primary, Active Geo Replication starts replicating committed transactions from primary to secondary. Only after this initial seeding is the secondary ready to fail over and switch roles with the primary. Secondary databases are also protected locally by the normal HA system, just like your standard databases. Because Active Geo Replication is asynchronous, the primary doesn’t wait for the secondary to commit a transaction before committing it and acknowledging the end user. In other words, the primary is never blocked waiting on the secondary. Changes are buffered, which makes the replication system resilient to temporary connection problems and to high latency when replicating to distant locations.
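To make the asynchronous behaviour concrete, here is a minimal Python sketch. It is a toy model, not Azure's implementation: the `Primary`, `Secondary` and `replicate` names are made up for illustration. It shows the primary acknowledging commits without consulting the secondary, and the buffer absorbing a temporary connection problem.

```python
from collections import deque


class Primary:
    """Toy primary database (hypothetical model, not Azure's implementation)."""

    def __init__(self):
        self.committed = []         # transactions acknowledged to the client
        self.send_buffer = deque()  # committed but not yet shipped to the secondary

    def commit(self, txn):
        # Commit locally and acknowledge right away; the secondary is not consulted.
        self.committed.append(txn)
        self.send_buffer.append(txn)
        return "ack"


class Secondary:
    """Toy secondary database that applies transactions in the order received."""

    def __init__(self):
        self.applied = []

    def apply(self, txn):
        self.applied.append(txn)


def replicate(primary, secondary, link_up=True):
    """Ship buffered transactions in commit order; do nothing while the link is down."""
    if not link_up:
        return 0
    shipped = 0
    while primary.send_buffer:
        secondary.apply(primary.send_buffer.popleft())
        shipped += 1
    return shipped


primary, secondary = Primary(), Secondary()
for txn in ("t1", "t2", "t3"):
    primary.commit(txn)  # each commit returns "ack" immediately

replicate(primary, secondary, link_up=False)  # temporary connection problem
lag_during_outage = len(primary.send_buffer)  # transactions waiting in the buffer

replicate(primary, secondary, link_up=True)   # link restored, buffer drains
```

While the link is down, the commits still succeed and simply wait in the buffer. Those waiting transactions are exactly the ones at risk if the primary is lost before the buffer drains.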
Secondary databases can be either readable or non-readable; you choose the readability while configuring Active Geo Replication. A readable secondary lets you serve independent read-only workloads. You can also use it to load balance complex query workloads across multiple databases, or to provide lower-latency data access to applications in other parts of the world. That is not the only thing you need to know, though; there is more to it.
Replication relationships are managed manually. It’s up to you to decide when to terminate the relationship, and you can do it at any point. If you terminate from the primary, you can choose whether to terminate immediately and lose any pending transactions, or to terminate after all pending transactions have been applied. Many people think this works just like AlwaysOn, so failover will happen automatically if there is a datacenter outage. Unfortunately, this is not true. If a datacenter outage affects your primary database, failover is still a manual task and you need to initiate it yourself. Keep in mind that in this case you terminate the relationship from the secondary database, because the primary database is not available. Terminating from the secondary is always immediate, and you will lose all the transactions that had not yet been replicated from primary to secondary.
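The difference between the two ways of terminating can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `terminate` function and the mode names `"planned"` and `"forced"` are hypothetical, and the secondary's log is assumed to be a prefix of the primary's log.

```python
def terminate(primary_log, secondary_log, mode):
    """Return (secondary contents after termination, transactions lost).

    Assumes secondary_log is a prefix of primary_log, i.e. the secondary
    has applied the primary's transactions in commit order.
    """
    # Transactions committed on the primary but not yet applied on the secondary.
    pending = primary_log[len(secondary_log):]
    if mode == "planned":
        # Terminate after applying all pending transactions: nothing is lost.
        return secondary_log + pending, []
    if mode == "forced":
        # Immediate termination: whatever was still pending is lost.
        return list(secondary_log), pending
    raise ValueError(f"unknown mode: {mode!r}")


primary_log = ["t1", "t2", "t3", "t4", "t5"]
secondary_log = ["t1", "t2", "t3"]  # t4 and t5 have not been replicated yet

planned_state, planned_lost = terminate(primary_log, secondary_log, "planned")
forced_state, forced_lost = terminate(primary_log, secondary_log, "forced")
```

In this example the planned termination leaves the secondary with all five transactions, while the forced one keeps only the first three and loses `t4` and `t5`. During an outage only the forced path is available, because the primary cannot be reached to drain the pending transactions.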
So the question arises: how much data will be lost?
The answer is that it depends on how many transactions were running on the primary database just before the outage. You keep every transaction that had already been sent across the connection before the outage; anything committed on the primary but not yet replicated is lost. The decision to terminate the replication should balance your concern about possible data loss against your desire to get the application back up again. Don’t forget, once you terminate the relationship, the secondary database becomes a normal read-write database. You can then fail over your applications, which will have complete read-write access to this database. To do that, you must update the connection string with the new server name and database name.
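As a rough sketch of the connection string change, the Python snippet below builds an ODBC-style string for each role. Everything specific here is an assumption for illustration: the server names `contoso-east` and `contoso-west`, the database name `orders`, the user `app_user` and the helper functions are all hypothetical, and the string leaves out the driver name and password.

```python
# Hypothetical server and database names, used only for illustration.
REPLICAS = {
    "primary": {"server": "contoso-east", "database": "orders"},
    "secondary": {"server": "contoso-west", "database": "orders"},
}


def build_connection_string(server, database, user):
    """Build an ODBC-style connection string fragment (driver and password omitted)."""
    return (
        f"Server=tcp:{server}.database.windows.net,1433;"
        f"Database={database};"
        f"Uid={user};"
        "Encrypt=yes;"
    )


def connection_string_for(role, user="app_user"):
    """Look up the server that currently plays the given role and build its string."""
    target = REPLICAS[role]
    return build_connection_string(target["server"], target["database"], user)


before_failover = connection_string_for("primary")
after_failover = connection_string_for("secondary")  # the former secondary is now read-write
```

Keeping the server names in one lookup table means the failover becomes a single configuration change instead of a hunt through the application code.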
Since the application is critical and the new primary is no longer protected by Active Geo Replication after the failover, you might want to rebuild the same Active Geo Replication setup you were using before the disaster, this time with the new primary database as the source. There are, however, other steps involved, such as managing security (logins, firewall rules, etc.). We’ll discuss all of these in our next blog post.
Hope you got answers to your questions about how Active Geo Replication works.
Happy Learning! Feel free to leave a comment. 🙂