The Cost Of Synchronous Mirroring

About a month ago, I started working on a customer’s performance issues.  When I checked the wait stats using Glenn Berry’s fantastic set of DMV queries, I noticed that 99% of the wait time was related to mirroring.  In other words, 99% of the time SQL Server spent waiting was the primary instance waiting for the secondary instance to synchronize changes.
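To make that 99% figure concrete, here is a minimal Python sketch of the arithmetic behind it.  It is not Glenn Berry’s query (his version also filters out benign wait types); it just assumes you have pulled (wait_type, wait_time_ms) pairs from sys.dm_os_wait_stats and that mirroring waits can be recognized by the DBMIRROR prefix.  The function name and the sample numbers are made up for illustration.

```python
from typing import Iterable, Tuple

# Rows are (wait_type, wait_time_ms) pairs, for example the result of
#   SELECT wait_type, wait_time_ms FROM sys.dm_os_wait_stats;
# The figures below are hypothetical, not the customer's real numbers.
SAMPLE_ROWS = [
    ("DBMIRROR_SEND", 9_900_000),
    ("PAGEIOLATCH_SH", 60_000),
    ("WRITELOG", 30_000),
    ("CXPACKET", 10_000),
]


def wait_share(rows: Iterable[Tuple[str, int]], prefix: str = "DBMIRROR") -> float:
    """Return the percentage of total wait time on wait types starting with prefix."""
    rows = list(rows)
    total = sum(ms for _, ms in rows)
    if total == 0:
        return 0.0
    matching = sum(ms for wait_type, ms in rows if wait_type.startswith(prefix))
    return 100.0 * matching / total


if __name__ == "__main__":
    print(f"{wait_share(SAMPLE_ROWS):.1f}% of wait time is mirroring-related")
```

If the share for the DBMIRROR prefix dominates like this, mirroring is the first place to look.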

Mirroring waits were that high because my customer is using Standard Edition of SQL Server, and Standard Edition only allows synchronous mirroring.  Now, I know that mirroring is deprecated, but my customer didn’t, and until SQL Server 2016 comes out and we get asynchronous (or synchronous) availability groups in Standard Edition, they don’t have many high availability options.

Because this customer was having performance problems, we ended up breaking the mirror.  We did this after I discussed their Recovery Time Objectives and Recovery Point Objectives (that is, how long they can afford to be down and how much data they can afford to lose), and it turned out that synchronous mirroring just wasn’t necessary given the company’s business model and RTO/RPO requirements.  Instead, I bumped up backup frequency and have a medium-term plan to introduce log shipping to reduce recovery time in the event of failure.

But let’s say that this option wasn’t available to me.  Here are other things you can do to improve mirroring performance:

  1. Switch to asynchronous mode.  If you’re using Enterprise Edition, you can switch mirroring to asynchronous mode, which improves performance considerably.  Of course, this comes at the risk of data loss in the event of failure—a transaction can commit on the primary node before it commits on the secondary, so in the event of primary failure immediately after a commit, it’s possible that the secondary doesn’t have that transaction.  If you need your secondary to be synchronous, this isn’t an option.
  2. Improve storage and network subsystems.  In my customer’s case, they’re using a decent NAS.  They’re a small company and don’t need SANs with racks full of SSDs or on-board flash storage, and there’s no way they could afford that.  But if they needed synchronous mirroring, getting those writes to the secondary more quickly would help performance.
  3. Review mirroring.  In an interesting blog post on mirroring, Graham Kent looks at the kind of information he wants when troubleshooting problems with database mirroring, and also points us to Microsoft guidance on the topic.  It’s possible that my customer could have tweaked mirroring somehow to keep it going.
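To see why options 1 and 2 help, here is a deliberately simplified Python model of commit latency.  It assumes the primary writes its own log and sends the log block to the secondary at the same time, so a synchronous commit waits for whichever finishes last.  The function name and all timings are hypothetical; this is a sketch for intuition, not a description of SQL Server internals.

```python
def commit_latency_ms(local_log_write_ms: float,
                      network_round_trip_ms: float,
                      remote_log_write_ms: float,
                      synchronous: bool) -> float:
    """Estimate how long a commit waits on log hardening (simplified model)."""
    if not synchronous:
        # Asynchronous: the primary commits once its own log write is durable.
        return local_log_write_ms
    # Synchronous: the primary also waits for the secondary to harden the log
    # and acknowledge it. Assumption: the local write and the send overlap.
    return max(local_log_write_ms, network_round_trip_ms + remote_log_write_ms)


if __name__ == "__main__":
    # Hypothetical timings in milliseconds, not measurements from the customer.
    slow = dict(local_log_write_ms=1.0, network_round_trip_ms=0.5, remote_log_write_ms=4.0)
    fast = dict(local_log_write_ms=1.0, network_round_trip_ms=0.25, remote_log_write_ms=0.5)
    for name, timings in (("slow secondary", slow), ("fast secondary", fast)):
        sync = commit_latency_ms(synchronous=True, **timings)
        asyn = commit_latency_ms(synchronous=False, **timings)
        print(f"{name}: sync={sync} ms, async={asyn} ms")
```

Under this model, asynchronous mode removes the secondary from the commit path entirely, while faster storage and networking shrink the secondary’s contribution until the local log write is the limiting factor.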

In the end, after shutting off mirroring, we saw a significant performance improvement.  It wasn’t enough, and I still needed to modify some code, but this at least helped them through the immediate crisis.  They lost the benefit of having mirrored instances (knowing that if one instance goes down, another can come up very quickly to take over), but because the RTO/RPO requirements were fairly loose, we decided that we could sacrifice this level of protection in order to obtain sufficient performance.
