This is part three in a series on getting beyond the basics with Azure ML.
The Python SDK
Over the past two posts, we have started using the Azure Machine Learning SDK for Python, but I’ve only touched on the topic. In this post, we are going to dig into it in more detail.
Installing the Python SDK
If you spin up a compute instance in Azure ML, you will have the Python SDK automatically. You can also install it using pip:
Running pip install azureml-core will install the core libraries. My general recommendation is to start with the Anaconda distribution of Python. This comes with a large number of packages pre-installed, including some of the packages that Azure ML requires.
That said, there are other Azure ML packages which you might wish to install as well. Most of the time you won’t need them, but they are available if you do.
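If you are not sure which Azure ML packages a given environment already has, a quick check from Python can save some guessing. What follows is a minimal sketch using only the standard library (Python 3.8 or later). The helper name installed_version and the PACKAGES list are my own illustration, and azureml-mlflow is there only as an example of one of the additional packages.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Example package names only; adjust the list to whatever your project uses.
PACKAGES = ["azureml-core", "azureml-mlflow"]

for name in PACKAGES:
    version = installed_version(name)
    print(f"{name}: {version if version else 'not installed'}")
```

This reads the package metadata rather than importing the SDK, so it runs the same way on a compute instance and on a local machine where nothing has been installed yet.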
Getting Started with Documentation
A common point of frustration I should note is that the documentation ranges from excellent to auto-generated garbage. The high-level concepts and most straightforward paths tend to be well-documented. As you get further into the weeds, the documentation becomes much less useful.
Getting Started with the SDK
For the most part, I’d recommend starting with the Azure ML tutorials. These will show you how to use components in Azure ML and tend to be really good.
You might also look at the Azure ML examples for the Python SDK. This GitHub repo includes a variety of project examples, though there are a couple of reasons you might have difficulty running some of them. First, some of the examples have rather specific requirements about what is installed, the operating system you are using, and so forth. The code might be useful as a reference, but you’ll definitely want to test, read, and refer back to the documentation.
The other problem that I’ve seen a few times is that new versions of the Azure ML SDK might have breaking changes or bugs. For example, when working with MLOps (a topic for later in this series!), I ran into problems with certain versions of the Azure ML SDK and needed to go back to a specific version because certain jobs would not work otherwise. Just like any piece of software, there will be bugs. If you do run into a bug, you might want to try specifying a different version of the Azure ML SDK. If you have an older version, it’s also worth upgrading to the latest version before going too far, as sometimes you’ll run into an issue that has already been fixed. In short, keep those packages up to date, something I admittedly am not the greatest at.
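To make the "pin a version or upgrade" advice concrete, here is a small sketch that builds the two pip commands from Python. The function names pip_install_command and run_pip are hypothetical helpers of my own, and the version number 1.2.3 is purely a placeholder, not a version I am recommending. The sketch only prints the commands; nothing is installed unless you call run_pip yourself.

```python
import subprocess
import sys


def pip_install_command(package, version=None, upgrade=False):
    """Build a pip install command as an argument list without running it."""
    requirement = f"{package}=={version}" if version else package
    command = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        command.append("--upgrade")
    command.append(requirement)
    return command


def run_pip(command):
    """Run the pip command and raise CalledProcessError if pip fails."""
    subprocess.run(command, check=True)


# "1.2.3" is a placeholder version, not a known-good release.
pinned = pip_install_command("azureml-core", version="1.2.3")
latest = pip_install_command("azureml-core", upgrade=True)

# Show the pip arguments (skipping the interpreter path) without executing.
print(" ".join(pinned[1:]))
print(" ".join(latest[1:]))
```

Using sys.executable with -m pip targets the same interpreter that is running the script, which helps avoid installing into a different Anaconda environment by accident.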
Are There SDKs in Other Languages?
No. And that hurts.
There used to be an Azure ML SDK for R. It’s dead. Instead, we are meant to use the Azure ML 2.0 CLI. This CLI is still in preview and documentation is pretty slim at this point. There are some CLI-based examples that you can use to get started and a bit of documentation on the topic. That said, it’s a different world, and I think it adds a fair amount of mental burden on data scientists compared to using an existing SDK. In the long run, I think the CLI version will end up being more powerful than the old R SDK or the existing Python SDK, but it will have a steeper learning curve given that we need to incorporate Docker, YAML, and Python/R.
In this post, I went a little bit further into the Azure ML SDK. My hope here is to give you enough familiarity to get going with the library, as we will use it throughout the rest of this series. I’ll make sure that we have enough detail about each component as we use it, but if you do want to dive more deeply into it, check out the links and resources in this post for more information.