I wanted to cover something which has bitten me in two separate ways regarding SQL Server Machine Learning Services and Resource Governor.
Resource Governor and Default Memory
If you install a brand new copy of SQL Server and enable SQL Server Machine Learning Services, you’ll want to look at
By default, SQL Server caps R and Python scripts at 20% of available memory. The purpose of this limit is to prevent you from hurting server performance with expensive external scripts (like, say, training large neural networks on a SQL Server).
Here’s the kicker: this affects you even if you don’t have Resource Governor enabled. If you see out-of-memory exceptions in Python or error messages about memory allocation in R, I’d recommend bumping this max memory percent up above 20, and I have scripts to help you with the job. Of course, making this change assumes that your server isn’t stressed to the breaking point; if it is, you might simply want to offload that work somewhere else.
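As a rough sketch of what such a script can look like (this is an illustration, not my exact script): the catalog view sys.resource_governor_external_resource_pools shows the current limits for external pools, and ALTER EXTERNAL RESOURCE POOL changes them. The value of 50 below is an arbitrary example, and the sketch assumes your scripts run in the default external pool.

```sql
-- Inspect the current limits for external (R/Python) resource pools.
SELECT
    name,
    max_cpu_percent,
    max_memory_percent,
    max_processes
FROM sys.resource_governor_external_resource_pools;

-- Raise the memory cap for the default external pool.
-- 50 is an illustrative value only; size it for your own server.
ALTER EXTERNAL RESOURCE POOL "default"
    WITH (max_memory_percent = 50);

-- Apply the pending Resource Governor configuration change.
ALTER RESOURCE GOVERNOR RECONFIGURE;
```

Rerunning the SELECT afterward should confirm whether the new value took effect.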
Resource Governor and CPU
Notice that by default, the max CPU percent for external pools is 100, meaning that we get to push the server to its limits with respect to CPU.
Well, what happens if you accidentally change that? I found out the answer the hard way!
In my case, our servers were accidentally scaled down to 1% max CPU utilization. The end result was that even something as simple as print("Hello") in either R or Python would fail after 30 seconds. I thought it had to do with the Launchpad service causing problems, but after investigation, this was the culprit.
The trickiest part about diagnosing this was that the error messages gave no indication of what the problem was: the error returned was a vague “could not connect to Launchpad” message, and the Launchpad error logs didn’t have any entries about the failed queries. So that’s one more thing to keep in mind when troubleshooting Machine Learning Services failures.
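If you suspect the same problem, a check along these lines may help. It is a minimal sketch, assuming the default external pool is the one affected and that external scripts are already enabled on the instance. It looks at the CPU cap, restores it to the default of 100, and then runs the same trivial script as a smoke test.

```sql
-- Check whether the CPU cap on external pools has been lowered.
SELECT name, max_cpu_percent
FROM sys.resource_governor_external_resource_pools;

-- Restore the default external pool to the default CPU cap of 100.
ALTER EXTERNAL RESOURCE POOL "default"
    WITH (max_cpu_percent = 100);

-- Apply the pending Resource Governor configuration change.
ALTER RESOURCE GOVERNOR RECONFIGURE;

-- Smoke test: a trivial Python script run through Machine Learning Services.
EXEC sp_execute_external_script
    @language = N'Python',
    @script = N'print("Hello")';
```

If the smoke test still times out after the cap is restored, the cause is likely somewhere else, such as the Launchpad service itself.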