I make fairly heavy use of SQL Server Machine Learning Services in my work environment, and this includes using TensorFlow and Keras (in R and Python) to train neural networks and generate predictions against them.
With the upgrade to SQL Server 2019, the security model around Machine Learning Services has changed, restricting what you are able to do with the product. On the whole, that’s a good thing—in 2016 and 2017, you have some pretty good leeway to execute arbitrary code on a server, but they really limit that with 2019.
The purpose of today’s post is to show you how you can install R-Keras and TensorFlow against a server running SQL Server 2019 Machine Learning Services. If you want to install this for SQL Server R Services in 2016 or SQL Server 2017 Machine Learning Services, the process is a little bit easier.
A Major Caveat
There might be a better way to do this than what I’m going to show you. If so, it wouldn’t be the first time that I brute-forced a solution with Machine Learning Services that had a more elegant solution. But I can confirm that it works for me.
Step One: Prep Work
I’m going to assume you know how to install SQL Server ML Services. If you don’t, check out my presentation on ML Services in Production.
Tear Down This Firewall
You might need to disable a firewall rule which reads something like “Block network access for R local user accounts on SQL Server instance MSSQLSERVER.” This rule prohibits outbound web connections, making exfiltration of data harder. Unfortunately, it also makes package updating more difficult and my recollection was that it prevented installation of Keras, though I didn’t test this in writing the blog post, so I could be making that last part up. If your security team gets paranoid about disabling this firewall rule, turn it back on after installation and configuration is finished, though that does mean you won’t be able to do things like, say, hit an API endpoint with R or Python.
Next, we will want to install the Anaconda distribution of Python. This is true even if you have Python Services installed. I was never able to install TensorFlow + Keras on Python Services and have R Services configured to point over to Python Services to get R-Keras working correctly. Maybe I was doing something wrong, but installing another version of Anaconda does the trick. Plus, you’re going to have five or six versions of Anaconda installed on the machine regardless, as it seems like every other app these days wants its own version installed.
By the way, when installing Anaconda, there’s a check box about adding Anaconda to your PATH. Check that box even though they recommend you not do so. I needed to have the box checked to get R Services to identify Anaconda properly.
Double-Check that PATH
I installed Anaconda to C:\Anaconda, so keep that in mind. Make sure that you have the following in your PATH system environment variable:
Then, add a new system environment variable:
Turning Off All of the Security
This next part may be a little scandalous, but my recollection is that I needed to do this to get everything working. I don’t remember if it was to get around our company’s permissions setup or if ML Services needed it set, but I had to set the Anaconda folder’s permissions recursively to grant Full Control to Users. You can try doing this without making that change and hopefully it works, but again, trial and error told me I needed to do this on my end. I also needed to grant All Application Packages and All Restricted Packages full control over the Anaconda folder. If you run
icacls on the folder you’ll see something like the following:
After you’ve set up the environment variables and rights, restart the SQL Server instance, both the database engine and the Launchpad service. This way they’ll pick up the new path variables.
Step Two: Install Keras
Now that we have ML Services installed and a good Python installation for Keras to live, we need to run the R shell. For SQL Server 2019, that’s
%PROGRAMFILES%\Microsoft SQL Server\MSSQL15.MSSQLSERVER\R_SERVICES\bin\R.exe by default, but you may need to change the path based on your installation location, version of SQL Server, and whether this is a named instance.
Inside R, run the following:
install.packages("reticulate", repos = "http://archive.linux.duke.edu/cran/") install.packages("tensorflow", repos = "http://archive.linux.duke.edu/cran/") install.packages("keras", repos = "http://archive.linux.duke.edu/cran/")
Note that I specified a repo here. If you don’t specify a repo, you’ll get Microsoft’s CRAN repo for SQL Server 2019, which is way out of date. We need current versions of
keras for this to work. I use Duke University’s archive because it’s close, and Case Western University’s when Duke gives me troubles, but you can also find your own favorite mirror.
What we’ve done so far is install Python support (
reticulate), as well as the shells for TensorFlow and Keras. We need to hook them up next. If you read the R-Keras installation instructions, you’ll see that there is a way to configure from within R. I was able to get that working for 2016 and 2017, but 2019 gave me problems, so we’re trying a different route.
Step Three: a Pivot to Python
Open up an Anaconda prompt (though I don’t recall needing to do so as Administrator, probably because I already gave away the store in Step One) and run the following command:
conda update -n base -c defaults conda
What we’re doing here is updating the conda package itself on our base environment using the default channel. It may not be strictly necessary, but usually conda will be a little out of date when you install Anaconda on Windows, so it’s worth getting the latest version.
Then close and re-open your Anaconda prompt. After that, run the following:
conda create --name r-reticulate conda activate r-reticulate pip install tensorflow-probability pip install tensorflow pip install keras conda activate base pip install tensorflow-probability pip install tensorflow pip install keras
What I’m doing is building a new virtual environment named
r-reticulate, which is what the
reticulate package in R desires. Inside that virtual environment, I’m installing the latest versions of
tensorflow , and
keras. I had DLL loading problems with TensorFlow 2.1 on Windows, so if you run into those, the proper solution is to ensure that you have the appropriate Visual C++ redistributables installed on your server.
Then, I switched back to the base virtual environment and installed the same packages. My thinking here is that I’ll probably need them for other stuff as well (and don’t tell anybody, but I’m not very good with Python environments).
Step Four: Test Things Out
Bounce back to R.exe and run the following commands:
If those worked, congratulations! You have TensorFlow for CPU and Keras installed. If you need GPU support, you’ll need to configure that on your own; I got it working on a desktop in the past but between CUDA, TensorFlow, Keras, and R, it’s a lot of pain.
From here, you should be able to open up your SQL Server client of choice, connect to the SQL Server instance, and run the following:
EXEC sys.sp_execute_external_script @language = N'R', @script = N'reticulate::import("h5py") reticulate::import("keras")'
This will help ensure that ML Services is working as expected.
Bonus: Installing TensorFlow and Keras for Python on Python Services
If you just want to use Python to execute TensorFlow and Keras code and don’t want to tie it into R at all, the steps are a lot easier. That’s where my talk on Developing a Solution with SQL Server Machine Learning Services comes into play. We’re going to use
sqlmlutils to install Python.
sqlmlutils on your local machine. This does not need to be the same as the server where you’ve installed SQL Server Machine Learning Services. To do this, open up command prompt, run Python (oh, you need Python, and I’d probably recommend installing Anaconda locally as well), and run the following:
pip install sqlmlutils
Once you have
sqlmlutils installed, we can continue on. For each database in which you need to access Keras and TensorFlow, run the following:
# Connect to the Database # NOTE: need to do this for *each* database! #conn = sqlmlutils.ConnectionInfo(server="MyGreatServer", database="SomeDatabase") sqlpy = sqlmlutils.SQLPythonExecutor(conn) pkgmanager = sqlmlutils.SQLPackageManager(conn) # Install a package as dbo, allowing anybody to use it. # Upgrade if there is a newer version than what is already installed. pkgmanager.install("tensorflow", scope = Scope.public_scope()) pkgmanager.install("keras", scope = Scope.public_scope())
Yeah, that was a lot easier. On the plus side, the method we followed to install Keras and TensorFlow for R allows us to use it in every database by default, rather than the per-database installation method for
Bonus Bonus: TensorFlow and Keras on R with sqlmlutils
I want to take a quick moment at the end of the post to explain why, although
sqlmlutils works for installing TensorFlow and Keras against Python Services, I needed to run R.exe on the server for R Services.
I tried doing this all from
sqlmlutils, but I ran into two problems. The first was in installing the
keras packages. They would start running but would never actually finish. I let this run overnight just to see if it’d complete, and it never did. By contrast, installing them from the server worked every time.
Second, I was unable to run
install_keras(method = "conda", conda = "C:\\Anaconda\\Scripts\\conda.exe) successfully, either via
sqlmlutils or R.exe. That’s why I ended up creating the virtual environment and installing TensorFlow-related packages separately.