The SQL Server team continues to make my day. The latest reason for joy (but also consternation because who has time for all of this?) is that PolyBase is now available on Linux as of SQL Server 2019 CTP 2.5.
I decided to give this a shot by building a container. Walk with me here.
Step One: SQL on Linux
I have Docker for Windows installed in this case but it makes absolutely no difference whether you’re using Windows, Mac, or Linux. Let’s grab our SQL on Linux container.
Before we get started, you want to have at least 4 GB of RAM available to your container. Here’s me giving my container 8 GB of RAM because I am a generous soul:
[Image: container memory set to 8 GB]
If you already have Docker installed, here is a sample command to pull and start the container:
docker run -e "ACCEPT_EULA=Y" -e "SA_PASSWORD=PolyBaseAllDay!!!11" -p 51433:1433 -d mcr.microsoft.com/mssql/server:2019-CTP2.5-ubuntu
You probably want a better sa password than this, but whatever. I’m also using port 51433 on my host rather than 1433 because I already have SQL Server 2017 installed on this machine and it is listening on port 1433.
One other thing: I’m using double-quotes rather than single-quotes. You’ll want to use double-quotes for cmd.exe and PowerShell. You can get away with single quotes on Linux.
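A small sketch on the quoting point, for bash users. In an interactive bash session, the !! in this particular password is subject to history expansion even inside double quotes, so single quotes are the safer choice there. The variable name is my own and is only for illustration; the echo prints the command for inspection rather than running it.

```shell
# Single quotes keep the password literal. In an interactive bash session,
# "!!" inside double quotes would be subject to history expansion.
SA_PASSWORD='PolyBaseAllDay!!!11'

# Double quotes are safe here because history expansion is applied to the
# typed line, not to the result of expanding a variable.
# Remove the leading "echo" to actually start the container.
echo docker run -e "ACCEPT_EULA=Y" -e "SA_PASSWORD=${SA_PASSWORD}" \
  -p 51433:1433 -d mcr.microsoft.com/mssql/server:2019-CTP2.5-ubuntu
```

Scripts are not affected, since history expansion is off by default in non-interactive shells.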
By the way, if you are on Windows and get the following error, it means you’re on Windows containers and need to switch to Linux containers:
[Image: Docker error message]
It’ll take a little while for everything to download, so you might need to occupy yourself in some manner in the meantime. Maybe take up a hobby like creating a Markov chain generator for generating postmodern blog posts. Warning: if you get it wrong, one of the articles might actually be readable and that’s how they’ll know you’re a fake.
Let’s pop open Azure Data Studio just to make sure everything loaded properly:
[Image: Azure Data Studio connected to the new instance]
Now that we’re set up, I will create a Scratch database as is my wont.
CREATE DATABASE [Scratch]
GO
Step Two: Software on SQL on Linux
Now that we have SQL Server on Linux installed, we can begin to install PolyBase. There are some instructions here but because we started with the Docker image, we’ll need to do a little bit of prep work. Let’s get our shell on.
Run docker ps to figure out your container ID. Mine is 818623137e9f. Docker accepts any unique prefix of the ID, so a short form like 818 is enough. From there, run the following command, replacing the container ID with a reasonable facsimile of yours.
docker exec -it 818 /bin/bash
Do it correctly and you’ll land at a bash root prompt. Understand that with root power comes disastrous responsibility.
We need to set up a few things before installing. First, grab the Microsoft keys for apt:
wget -qO- https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
If you’re copying from the Microsoft install instructions, remember that you don’t need sudo where you’re going.
Next, we will need to install the 2019 repository. Before we can run add-apt-repository we need to install some tooling. Run the following commands in order:
apt-get update && apt-get upgrade
apt-get install -y software-properties-common
add-apt-repository "$(wget -qO- https://packages.microsoft.com/config/ubuntu/16.04/mssql-server-preview.list)"
apt-get update && apt-get upgrade
Yes, there is a second run of apt-get update && apt-get upgrade there. We added a new repository and wish to make known to Ubuntu that it should look there.
Step Three: PolyBase on SQL on Linux
We’re close now. Just a few dozen more steps. At this point, you should still be in your bash shell, Master and Commander of the container 818623137e9f (or whatever your container is named). Run the next command with great purpose, for it will smell cowardice and indecision and punish you accordingly:
apt-get install mssql-server-polybase -y
I put the -y at the end to ensure that the machine knows I mean business.
Once I am done here, I relinquish command of the vessel by entering exit and closing my shell, for I have hit the most Microsofty of output messages:
[Image: installer output message]
Step Four: the Great Sleep
Now that I am back at my DOS prompt, my container awaits its next command. It shall make the ultimate sacrifice for me: I will stop its existence. [I hereby claim my bonus points for making the appropriate distinction between “shall” versus “will” in this post.]
docker stop 818
But wait! Before it has a chance to dream of boxy sheep, I enlist its assistance once more and cause it to spring from its own ashes. Also, I guess I burned it in the in-between but no matter, for we have work to do!
docker restart 818
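SQL Server can take a few seconds after a restart before it accepts connections, so here is a minimal sketch of a retry loop for anyone scripting these steps. The wait_for helper is my own invention, not part of Docker or SQL Server, and the sqlcmd path in the usage comment is an assumption about what ships inside the image.

```shell
# Hypothetical helper: retry a command once per second until it succeeds
# or the attempt budget runs out. Returns 0 on success, 1 on giving up.
wait_for() {
  attempts=$1
  shift
  i=0
  until "$@" >/dev/null 2>&1; do
    i=$((i + 1))
    if [ "$i" -ge "$attempts" ]; then
      return 1
    fi
    sleep 1
  done
  return 0
}

# Usage (requires the running container; the sqlcmd path is an assumption):
#   wait_for 30 docker exec 818 /opt/mssql-tools/bin/sqlcmd \
#     -S localhost -U sa -P 'PolyBaseAllDay!!!11' -Q "SELECT 1" \
#     && echo "SQL Server is accepting connections"
```

The first argument is the number of tries, so wait_for 30 gives the instance roughly half a minute to come back.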
Now that it is awakened and has returned with a passion (and a new paint job to cover the char), we connect directly to the Scratch database and enable PolyBase:
EXEC sp_configure 'polybase enabled', 1
GO
RECONFIGURE
GO
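To confirm that the setting took, one option is to ask the instance directly. This is a sketch under a few assumptions: the container ID prefix and sa password are the ones used earlier in this post, the run_sql helper is a name I made up, and I am assuming sqlcmd lives at /opt/mssql-tools/bin/sqlcmd inside the image.

```shell
# T-SQL that reports whether PolyBase is installed and whether the
# 'polybase enabled' configuration option is currently in use.
CHECK_SQL="SELECT SERVERPROPERTY('IsPolyBaseInstalled') AS IsPolyBaseInstalled;
SELECT name, value_in_use FROM sys.configurations WHERE name = 'polybase enabled';"

# Hypothetical helper: run a T-SQL batch inside the container via sqlcmd.
# The sqlcmd path is an assumption; adjust it if your image differs.
run_sql() {
  docker exec 818 /opt/mssql-tools/bin/sqlcmd \
    -S localhost -U sa -P 'PolyBaseAllDay!!!11' -Q "$1"
}

# Usage (requires the running container):
#   run_sql "$CHECK_SQL"
```

Both queries should come back with 1 if the install and the sp_configure call worked. The same two statements can be pasted into Azure Data Studio instead.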
Step Five: the Wrapup
I won’t have any demonstrations of PolyBase because I’ve done this a few times already, but it does work for the PolyBase V2 sources: SQL Server, Oracle, MongoDB, Cosmos DB, Teradata, and ODBC connections.
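For anyone who has not seen one of those demonstrations, here is a rough sketch of what a SQL Server-to-SQL Server external table looks like. It is not taken from this setup: every object name, host, login, and password is a placeholder, and the syntax follows the SQL Server 2019 documentation as I understand it, so a CTP build may differ. The snippet only writes a script file; nothing is executed against the server.

```shell
# Write a hypothetical PolyBase example to a script file. Every name, host,
# and password below is a placeholder, not something from this post.
cat > polybase_example.sql <<'SQL'
USE [Scratch]
GO
-- A master key is required before creating a database scoped credential.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<a strong password>'
GO
-- Login used on the remote SQL Server instance.
CREATE DATABASE SCOPED CREDENTIAL RemoteSqlCredential
WITH IDENTITY = '<remote login>', SECRET = '<remote password>'
GO
-- The sqlserver:// prefix marks this as a SQL Server source.
CREATE EXTERNAL DATA SOURCE RemoteSql
WITH (LOCATION = 'sqlserver://<remote host>:1433', CREDENTIAL = RemoteSqlCredential)
GO
-- Columns need to line up with the remote table's definition.
CREATE EXTERNAL TABLE dbo.RemoteCustomers
(
    CustomerID INT NOT NULL,
    CustomerName NVARCHAR(100) NULL
)
WITH (LOCATION = '<RemoteDb>.dbo.Customers', DATA_SOURCE = RemoteSql)
GO
SELECT TOP (10) * FROM dbo.RemoteCustomers
GO
SQL
```

Fill in the placeholders, then open polybase_example.sql in Azure Data Studio or pass it to sqlcmd with the -i flag.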
Coda: the Limitations
There are some limitations around PolyBase for SQL Server on Linux. Here are a couple that I’ve seen in CTP 2.5:
- PolyBase on Linux does not support scale-out clusters at this time.
- PolyBase on Linux does not support connections to Hadoop or Azure Blob Storage at this time. If you do try to set up an external table, you will get the following error:
  [Image: error message]
I don’t know which (if any) are just because this is the first iteration and which are permanent limitations, but keep in mind that there are differences in functionality here and that some of these differences might disappear in future versions of PolyBase on SQL on Linux.