Thoughts On In-Memory OLTP

A Steve Jones editorial from a few weeks ago got me thinking about In-Memory OLTP.  We implemented this in our data warehouse about a month ago with mixed success.  Here are my notes from that forum discussion, along with a couple of areas where I wanted to expand upon my comments there.

We use memory-optimized objects in two places: one staging table to load data warehouse fact tables, and various memory-optimized table-valued parameters as part of warehouse loading (streaming certain dimension rows from .NET code into our warehouse).
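
For anyone who hasn’t built one yet, here is a minimal sketch of what those TVPs look like in SQL Server 2014.  All object names are made up for illustration; the important parts are the hash index (memory-optimized table types require at least one index), the MEMORY_OPTIMIZED = ON option, and the fact that the database needs a memory-optimized filegroup before any of this works.

```sql
-- Illustrative only: a memory-optimized table type usable as a TVP.
-- Assumes the database already has a filegroup with CONTAINS MEMORY_OPTIMIZED_DATA.
CREATE TYPE dbo.DimCustomerRow AS TABLE
(
    CustomerKey  INT           NOT NULL,
    CustomerName NVARCHAR(100) NOT NULL,
    PRIMARY KEY NONCLUSTERED HASH (CustomerKey) WITH (BUCKET_COUNT = 131072)
)
WITH (MEMORY_OPTIMIZED = ON);
GO

-- The .NET loader streams rows into a parameter of this type and calls
-- a procedure along these lines (procedure and target table are hypothetical):
CREATE PROCEDURE dbo.LoadDimCustomer
    @CustomerRows dbo.DimCustomerRow READONLY
AS
BEGIN
    INSERT INTO dbo.DimCustomer (CustomerKey, CustomerName)
    SELECT CustomerKey, CustomerName
    FROM @CustomerRows;
END;
```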

My original plan was to use memory-optimized queue tables in our OLTP system to feed memory-optimized TVPs into memory-optimized staging tables, and possibly convert a couple of our more heavily-read OLTP tables to become memory-optimized in the process. That fell apart for a few reasons.

The best (i.e., most nefarious) problem that we’ve had with In-Memory OLTP is XTP_CHECKPOINT issues.  This happened even before we had a single memory-optimized object; we just had the filegroup.  The first time this happened in production, we suffered a 10-hour outage on a warehouse node and had to re-queue a lot of data because the transaction log had filled up and we couldn’t do a thing due to this XTP_CHECKPOINT.  We ended up recovering to a point shortly before the database failure and didn’t lose any data, but needless to say, causing a database failure before a single memory-optimized object even existed was…problematic…  We opened a support case with Microsoft, and that case is still open.  As best as we can tell, if the transaction log fills up and autogrows at any point, the XTP checkpoint mechanism stops and never recovers, meaning that the In-Memory checkpoint never actually happens and the log continues to fill until you take the database offline and force it to reconcile the log.  Our operations team has been on the lookout for these situations since then and has prevented another disaster, but it’s one more thing the team needs to consider.
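
Since then, the main thing to watch has been the log reuse wait.  A check along these lines (standard DMV columns; the alerting around it is whatever your monitoring tool provides) is enough to spot the condition before the log fills:

```sql
-- Flag any database whose transaction log cannot truncate because the
-- In-Memory OLTP (XTP) checkpoint has stalled.
SELECT d.name,
       d.log_reuse_wait_desc
FROM sys.databases AS d
WHERE d.log_reuse_wait_desc = N'XTP_CHECKPOINT';
```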

Aside from that, the hard limit of roughly 8000 bytes per row and the lack of LOB support meant that some of the places where I might have wanted to use a memory-optimized TVP (like TVPs for some of the larger dimensions) wouldn’t work out.

We ended up not pushing the memory-optimized filegroup out to production on our OLTP instances because you cannot take a database snapshot if the database has a memory-optimized filegroup, and we use database snapshots for some system processes. On the OLAP side, we did implement memory-optimized objects and saw significant performance improvements. The TVPs were significantly faster than their on-disk counterparts, and our staging table, which previously had been one of the top sources of deadlocks, now zooms. Unfortunately, even with these performance benefits, the limitations have been too strict for adoption across our platform. Even if I got the go-ahead to move memory-optimized objects to our production OLTP system (and not just our OLAP system), there would have been another problem preventing general adoption: the inability, within an explicit transaction, to use memory-optimized tables alongside tables in other databases on the same instance. We use replication to push static data around in our environment, and all of that static data goes into one database on each instance. Many of the queries whose performance I wanted to improve join to this static data database, and I did not want to split the in-memory work into a separate transaction from the static data work.
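
To make that last limitation concrete, here is a rough sketch of the pattern that fails and the separate-transaction workaround I did not want to adopt.  All object names are hypothetical; StaticData stands in for the replicated static data database.

```sql
-- Fails in SQL Server 2014: a user transaction that touches a memory-optimized
-- table cannot also touch another user database on the instance.
BEGIN TRANSACTION;
INSERT INTO dbo.StagingQueue (OrderKey, StatusCode)     -- memory-optimized table
SELECT o.OrderKey, s.StatusCode
FROM dbo.Orders AS o
    INNER JOIN StaticData.dbo.StatusCode AS s           -- cross-database reference
        ON o.StatusID = s.StatusID;
COMMIT TRANSACTION;

-- The workaround: pull the static data in one step, then touch the
-- memory-optimized table in a separate transaction.
SELECT StatusID, StatusCode
INTO dbo.StatusCodeLocal                                -- local staging copy
FROM StaticData.dbo.StatusCode;

BEGIN TRANSACTION;
INSERT INTO dbo.StagingQueue (OrderKey, StatusCode)
SELECT o.OrderKey, s.StatusCode
FROM dbo.Orders AS o
    INNER JOIN dbo.StatusCodeLocal AS s
        ON o.StatusID = s.StatusID;
COMMIT TRANSACTION;
```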

I’m still hopeful that 2016 will solve at least some of these problems and make memory-optimized objects practicable.  To be honest, though, even coming in knowing most of the code-writing limitations of memory-optimized objects and natively compiled stored procedures (of which we have zero), I think In-Memory OLTP really burned me, and even if 2016’s version is a lot better, I’d have to think long and hard before building a project around it.

Upcoming Events

Over the next six weeks, I will be speaking four times.  All of these talks are free and open to the public, although SQL Saturdays do charge for lunch if you choose to purchase it.

TriNUG Data SIG

On Wednesday, June 17th, I will speak at the TriNUG Data SIG (link forthcoming).  My topic is Working Effectively with Legacy SQL.

SQL Saturday Chattanooga

I will speak at SQL Saturday Chattanooga on Saturday, June 27th.  My session begins at 4:15 PM, and the topic is Much Ado About Hadoop.

SQL Saturday Columbus

I will speak at SQL Saturday Columbus on Saturday, July 11th.  My session begins at 1 PM and I’m also giving my Hadoop talk there.

TriNUG F#/Analytics Group

On Tuesday, July 28th, I will give a modified version of my Hadoop talk to the F#/Analytics group.

Why We Can’t Have Good Things: In-Memory OLTP

I’ve just recently implemented the first memory-optimized tables at our company.  My original plan was to create a few memory-optimized staging tables on our transactional systems, one memory-optimized staging table on the warehouse side, and several memory-optimized TVPs to help with shuttling data around.

This plan quickly fell apart.  Here’s a quick summary of the problems I ran into during the sprint.  I knew about many of them, but there were a couple of fascinating new issues I had not yet seen.

  1. Memory-optimized tables are limited to 8000 bytes per row.  Some of my dimensions have LOB data (a memory-optimized no-no) and other things which are not allowed.  That meant that a couple of my TVPs were going to stay on-disk.
  2. I created my memory-optimized queue tables for staging, and we use triggers to load these queue tables.  This isn’t a great thing overall (I’m eventually going to replace the triggers with Service Broker messages), but it’s what I have to deal with.  Anyhow, I ran into a very interesting error message in stored procedures which eventually call triggers which insert into the memory-optimized tables:  “A user transaction that accesses memory optimized tables or natively compiled procedures cannot access more than one user database or databases model and msdb, and it cannot write to master.”  I got this message because our transactional procedures often join to a replicated database to get the static data we need.  The way around this problem is to use separate transactions, but I don’t really want to do that.  This ruled out using memory-optimized tables on the OLTP systems.
  3. On one of our instances with a very small, fixed-size transaction log, we found a very interesting problem:  the transaction log was not truncating due to XTP_CHECKPOINT, and because the log was full, we could not make any alterations to the database.  Oh, and “any alterations” includes expanding the log size…  There is an open Connect item which basically says that the message may or may not be a lie.  One person mentioned that taking a full backup fixed the problem, but in our case, we had to restart the database engine service and re-size the log.
  4. As a side note, you cannot take database snapshots if you have a memory-optimized filegroup.  You also cannot drop the memory-optimized filegroup unless you drop the entire database.  This means that, in practice, you cannot get rid of a memory-optimized filegroup once you create it, and creating one permanently limits your available DBA options; the DDL sketch after this list shows just how little it takes to get into that state.
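
For reference, the filegroup from item 4 takes only a couple of ALTER DATABASE statements to create (database, filegroup, and path names below are illustrative), and it is a one-way door:

```sql
-- Illustrative: adding an In-Memory OLTP filegroup and container to a database.
ALTER DATABASE WarehouseDB
    ADD FILEGROUP InMemoryData CONTAINS MEMORY_OPTIMIZED_DATA;

ALTER DATABASE WarehouseDB
    ADD FILE
    (
        NAME = N'InMemoryData_Container1',
        FILENAME = N'D:\Data\InMemoryData_Container1'   -- a directory, not a file
    )
    TO FILEGROUP InMemoryData;

-- From this point on, CREATE DATABASE ... AS SNAPSHOT OF WarehouseDB will fail,
-- and the filegroup cannot be removed without dropping the database.
```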

All of this has made In-Memory OLTP in SQL Server 2014 less than ideal for a real production environment.  I’m hoping that 2016 has significant enough improvements that we can expand our use of In-Memory OLTP, but for now, our initial scenario has collapsed down to a few memory-optimized TVPs and one staging table on our warehouse instances.

Time To Learn R

The time has come for data professionals to learn R.  Microsoft has purchased Revolution Analytics and is betting that they can make R faster, less memory-intensive, and better for enterprise applications.

SQL Server 2016 will also allow us to run R code against our SQL Server databases.  The upside is fantastic:  instead of pulling SQL data out via ODBC and manipulating it client-side, we can push the (much smaller) R code into SQL Server and process the data where it lives.
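
The hook for this in SQL Server 2016 is sp_execute_external_script.  Here is a minimal sketch of the shape of the call; the input query, table, and column names are made up:

```sql
-- Illustrative: run an R script inside SQL Server 2016 and return the result
-- as a normal rowset.  InputDataSet and OutputDataSet are the default data
-- frame names on the R side.
EXEC sp_execute_external_script
    @language = N'R',
    @script = N'
        OutputDataSet <- data.frame(AvgSale = mean(InputDataSet$SaleAmount));',
    @input_data_1 = N'SELECT SaleAmount FROM dbo.FactSales;'
WITH RESULT SETS ((AvgSale FLOAT));
```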

So why do you want to get in on this right now?  The reason is that having more complementary skills improves your chances of a successful career (where “successful” can mean more money, better benefits, the dress code or hours you want to work, etc.).  Let’s say that you’re a good database administrator who knows SQL Server pretty well.  That’s a nice skill to have and can land you a decent position.  But now let’s say that you’re also really good with statistics and can use tools like R to perform data analysis.  At this point, you’ve moved beyond “good database administrator” and into “really good data professional.”  Guess which one’s more likely to land you that fantastic job offer.

R won’t be everyone’s cup of tea; a lot of people have trouble with or simply don’t like statistical analysis, and in that case, go pick up a different complementary skill.  But if you’re already doing a lot of data analysis on the cheap (queries, Excel spreadsheets, and reports for management), you’re part of the way to learning a valuable skill, and with everything integrating together in 2016, this is a golden opportunity to get ahead of the game.

New Events Coming Up

This has been an exciting year so far.  In May, I spoke at two SQL Saturdays, and in June, I will speak at another two.

First, on June 13th, I will be in Halifax, Nova Scotia for SQL Saturday #404:  Event Not Found.  This is the first time I’ll give my Working Effectively with Legacy SQL talk at a SQL Saturday.

Then, on June 27th, I will be in Chattanooga, Tennessee for SQL Saturday #410.  At that event, I will give my Hadoop talk.