The SQL Server team continues to make my day. The latest reason for joy (but also consternation because who has time for all of this?) is that PolyBase is now available on Linux as of SQL Server 2019 CTP 2.5.

I decided to give this a shot by building a container. Walk with me here.

Step One: SQL on Linux

I have Docker for Windows installed in this case, but it makes absolutely no difference whether you’re using Windows, Mac, or Linux. Let’s grab our SQL on Linux container.

Before we get started, you want to have at least 4 GB of RAM available to your container. Here’s me giving my container 8 GB of RAM because I am a generous soul:

I kept telling my wife that putting 64 GB of RAM on that desktop would eventually pay off.
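
If you would rather check from the command line than squint at a settings dialog, something like the following sketch might do. It assumes the docker CLI is on your PATH and that docker info reports the engine's memory in bytes through its MemTotal field. The function name docker_has_memory_gb is my own invention, not part of Docker.

```shell
# Hypothetical helper: succeeds when the Docker engine reports at least
# the requested number of GiB of memory.
# Usage: docker_has_memory_gb 4
docker_has_memory_gb() {
    required_bytes=$(( $1 * 1024 * 1024 * 1024 ))
    # MemTotal is the total memory visible to the Docker engine, in bytes.
    available_bytes=$(docker info --format '{{.MemTotal}}')
    [ "$available_bytes" -ge "$required_bytes" ]
}
```

Call it as docker_has_memory_gb 4 and check the exit status. The reported total can come in a little under the number you typed into the settings screen, so treat a near miss as a hint rather than a verdict.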

If you already have Docker installed, here is a sample:

docker run -e "ACCEPT_EULA=Y" -e "SA_PASSWORD=PolyBaseAllDay!!!11" -p 51433:1433 -d

You probably want a better sa password than this, but whatever. I’m also using port 51433 on my host rather than 1433 because I already have SQL Server 2017 installed on this machine and it is listening on port 1433.

One other thing: I’m using double-quotes rather than single-quotes. You’ll want to use double-quotes for cmd.exe and PowerShell. You can get away with single quotes on Linux.
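
To see the quoting difference for yourself on Linux, here is a small sketch for a POSIX shell. DEMO_VAR is a made-up variable for illustration and has nothing to do with the container.

```shell
# DEMO_VAR is a hypothetical variable used only to show the quoting rules.
DEMO_VAR=expanded

# Double quotes: the shell substitutes $DEMO_VAR before the command sees it.
double_quoted="password-$DEMO_VAR"

# Single quotes: every character is passed through literally.
single_quoted='password-$DEMO_VAR'

printf '%s\n' "$double_quoted"   # password-expanded
printf '%s\n' "$single_quoted"   # password-$DEMO_VAR
```

In an interactive bash session, an exclamation point inside double quotes can also trigger history expansion, which is one more reason to reach for single quotes on Linux when your password is as shouty as mine.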

By the way, if you are on Windows and get the following error, it means you’re on Windows containers and need to switch to Linux containers:

A Windows container? What manner of ill is this man performing on his machine?

It’ll take a little while for everything to download, so you might need to occupy yourself in some manner in the meantime. Maybe take up a hobby like creating a Markov chain generator for generating postmodern blog posts. Warning: if you get it wrong, one of the articles might actually be readable and that’s how they’ll know you’re a fake.

Let’s pop open Azure Data Studio just to make sure everything loaded properly:

Can’t afford three commas; only in the one-comma club.

Now that we’re set up, I will create a Scratch database as is my wont.


Step Two: Software on SQL on Linux

Now that we have SQL Server on Linux installed, we can begin to install PolyBase. There are some instructions here, but because we started with the Docker image, we’ll need to do a little bit of prep work. Let’s get our shell on.

First, run docker ps to figure out your container ID. Mine is 818623137e9f. From there, run the following command, replacing the container ID with a reasonable facsimile of yours.

docker exec -it 818 /bin/bash
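
If you have several containers running and don't feel like reading docker ps output by eye, a sketch like this one could pick out the right ID by the host port we mapped earlier. It assumes the docker CLI is on your PATH and that the Ports column looks like 0.0.0.0:51433->1433/tcp. The function name container_for_host_port is hypothetical.

```shell
# Hypothetical helper: print the ID of the first running container whose
# port list maps the given host port.
# Usage: container_for_host_port 51433
container_for_host_port() {
    # Each output line is "<container id> <port mappings>".
    docker ps --format '{{.ID}} {{.Ports}}' |
        awk -v port="$1" '$0 ~ (":" port "->") { print $1; exit }'
}
```

You could then run docker exec -it "$(container_for_host_port 51433)" /bin/bash from a bash prompt on the host. Typing the first few characters of the ID, as I did above, works just as well.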

Do it correctly and you’ll land at a bash root prompt. Understand that with root power comes disastrous responsibility.

Me at a root prompt. Bonus point: Cities:Skylines has totally ruined my appreciation of SimCity 2000.

We are going to need to set up some stuff first. First, grab the Microsoft keys for apt:

wget -qO- | apt-key add -

If you’re copying from the Microsoft install instructions, remember that you don’t need sudo where you’re going.

Next, we will need to install the 2019 repository. Before we can run add-apt-repository we need to install some tooling. Run the following commands in order:

apt-get update && apt-get upgrade
apt-get install -y software-properties-common
add-apt-repository "$(wget -qO-"
apt-get update && apt-get upgrade

Yes, there is a second run of apt-get update && apt-get upgrade there. We added a new repository and wish to make known to Ubuntu that it should look there.

Step Three: PolyBase on SQL on Linux

We’re close now. Just a few dozen more steps. At this point, you should still be in your bash shell, Master and Commander of the container 818623137e9f (or whatever your container is named). Run the next command with great purpose, for it will smell cowardice and indecision and punish you accordingly:

apt-get install mssql-server-polybase -y

I put the -y at the end to ensure that the machine knows I mean business.

Once I am done here, I relinquish command of the vessel by entering exit and closing my shell, for I have hit the most Microsofty of output messages:

At least I don’t need to reboot, amirite?

Step Four: the Great Sleep

Now that I am back at my DOS prompt, my container awaits its next command. It shall make the ultimate sacrifice for me: I will stop its existence. [I hereby claim my bonus points for making the appropriate distinction between “shall” versus “will” in this post.]

docker stop 818

But wait! Before it has a chance to dream of boxy sheep, I enlist its assistance once more and cause it to spring from its own ashes. Also, I guess I burned it in the in-between but no matter, for we have work to do!

docker restart 818

Now that it is awakened and has returned with a passion (and a new paint job to cover the char), we connect directly to the Scratch database and enable PolyBase:

EXEC sp_configure 'polybase enabled', 1;
RECONFIGURE;
And the people rejoice.
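
If you want some evidence before rejoicing, here is a sketch that asks the server whether PolyBase is installed. It assumes sqlcmd is available on the host, that the container is listening on host port 51433 as above, and that the sa password lives in an environment variable named SA_PASSWORD. The function name polybase_installed is hypothetical.

```shell
# Hypothetical helper: succeeds when SERVERPROPERTY('IsPolyBaseInstalled')
# comes back as 1. The host port defaults to 51433.
# Assumes SA_PASSWORD holds the sa password.
polybase_installed() {
    # -h -1 suppresses column headers; -W trims trailing spaces.
    result=$(sqlcmd -S "localhost,${1:-51433}" -U sa -P "$SA_PASSWORD" \
        -h -1 -W \
        -Q "SET NOCOUNT ON; SELECT SERVERPROPERTY('IsPolyBaseInstalled');" |
        tr -d '[:space:]')
    [ "$result" = "1" ]
}
```

Run polybase_installed && echo "PolyBase is installed" after the restart. This only tells you the feature is installed. Whether it is enabled is still down to the sp_configure call above.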

Step Five: the Wrapup

I won’t have any demonstrations of PolyBase because I’ve done this a few times already, but it does work for the PolyBase V2 sources: SQL Server, Oracle, MongoDB, Cosmos DB, Teradata, and ODBC connections.

Coda: the Limitations

There are some limitations around PolyBase for SQL Server on Linux. Here are a couple that I’ve seen in CTP 2.5:

  • PolyBase on Linux does not support scale-out clusters at this time.
  • PolyBase on Linux does not support connections to Hadoop or Azure Blob Storage at this time. If you do try to set up an external table, you will get the following error:
Operation PolyBase Garden: a Remote Java Bridge too far.

I don’t know which (if any) are just because this is the first iteration and which are permanent limitations, but keep in mind that there are differences in functionality here and that some of these differences might disappear in future versions of PolyBase on SQL on Linux.

3 thoughts on “PolyBase on Linux with CTP 2.5”

  1. Thanks for this great post!

    I did have some issue with the installation of PolyBase, though.

    To connect to the docker I used:
    docker exec -it --user root sql1 /bin/bash

    gnupg was not installed, so from within the bash:
    apt-get update && apt-get install -y gnupg2

    The repository points to a preview version of PolyBase, then you’ll get a message that the evaluation period has expired, so I switched to:
    add-apt-repository "$(wget -qO-"

    And then it works… (at least till the last statement of the manual above)

    1. Thanks for reaching out and for the updates. This post was based on CTP 2.5 of SQL Server 2019, so it was a pre-release distro. Your change looks like a good one, and it’s probably worth me revising this post now that 2019 is out and stable.
