This is part four in a series on low-code machine learning with Azure ML.
The Machine Learning Process
Machine learning is a lot like an action film from the 1980s: we see early on that there’s a problem, we train in a cool montage with upbeat rock music, and then we come back to the problem and defeat it with car chases and bazookas and quippy one-liners. Well, maybe that simile got away from me a little bit, but I think I’ll stick with it.
What we’ll do in this post is cover the process of training a simple model using the Azure ML designer. I won’t deviate too far from the “classic” Azure ML script, which involves using the Designer to train a model and then deploy an endpoint for consumption. And away we go!
Build a Training Pipeline
All of this happens in the Designer view. Navigate to the Designer page, which you can find in the Author menu. Then, select the giant plus sign to create a new pipeline.
For training, we need to select a compute instance, a compute cluster, or attached compute. I'm going to choose my compute instance rather than a compute cluster. In order to do this, my compute instance must be running, so if you're following along and yours isn't, turn it on in the Compute menu before moving forward. We'll write our outputs to the default of workspaceblobstore and give the training process a jaunty name.
From there, we’re going to create a training pipeline. The standard training pipeline looks a bit like this:
You may see a few minor alterations, but most Azure ML pipelines will start with a frame that looks like this. Now let’s take each piece of it in order.
The first component is our dataset. You'll see every dataset you've created in the Datasets menu. As a quick tip, use the search bar above the Datasets menu to search for items, especially as the number of datasets gets large. This search also works for the different ML algorithms and other components.
Clean Missing Data
The next step in the process is to add a Clean Missing Data transformation, which you can find in the Data Transformation menu.
Select "Edit column" to choose the columns you want to clean. You can also set the minimum and maximum missing value ratios. I'd leave the minimum at 0.0, though if you want to cap the maximum percentage of missing values that you clean up, you can set the maximum value under 1. Just like with AutoML, we'll replace missing values with the mean.
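The Designer handles all of this with drag-and-drop, but if you're curious what "replace with mean" looks like in code, here's a rough scikit-learn sketch. This is my own stand-in for illustration, not what the Designer actually runs under the covers:

```python
import numpy as np
from sklearn.impute import SimpleImputer

# A tiny feature matrix with one missing value (NaN)
X = np.array([[1.0, 2.0],
              [np.nan, 4.0],
              [7.0, 6.0]])

# Replace each missing value with its column's mean, mirroring the
# "replace with mean" cleaning mode in the Designer component
imputer = SimpleImputer(strategy="mean")
X_clean = imputer.fit_transform(X)
# Column 0's observed mean is (1.0 + 7.0) / 2 = 4.0, so the NaN becomes 4.0
```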
Note that Clean Missing Data takes one input called Dataset, of type DataFrameDirectory, and has two outputs: a DataFrameDirectory called Cleaned dataset and a TransformationDirectory called Cleaning transformation. I don't think I've used a TransformationDirectory before, and searching for it by that name doesn't turn up much information.
The next step will be a Normalize Data component, also in the Data Transformation menu. What we typically mean by normalization in this context is to have each input feature range between 0 and 1. That way, numerically large features will not swamp numerically small ones: with many algorithms, a feature ranging from 100,000 to 25,000,000 can carry far more weight than a feature ranging from 0 to 3, even if the 0-3 feature is actually more important.
We’ll transform each of the input columns along these lines.
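As a sketch of what min-max normalization does under the hood (again using scikit-learn as a stand-in, not the Designer's actual implementation):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# One numerically large feature and one small one
X = np.array([[100_000.0, 0.0],
              [12_000_000.0, 1.5],
              [25_000_000.0, 3.0]])

# Min-max scaling maps each column onto [0, 1]:
#   x' = (x - min) / (max - min)
scaler = MinMaxScaler()
X_norm = scaler.fit_transform(X)
# Both columns now range from 0 to 1, so neither swamps the other
```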
Now we’re going to split our data. When we ran through the AutoML problem, it automatically used cross-validation because it was going to try a variety of hyperparameters. We’re going to stick to a simpler Train-Test split, meaning that, in our case, 70% of data will go to training and 30% to test. You can find the Split Data component in the Data Transformation menu.
There are two outputs for the Split Data component, representing the two DataFrameDirectory outputs we will create: one for training and one for testing. Note that you cannot have a three-way split directly, though you can effectively create one by stacking these components. If we wanted a 70-20-10 Train-Validate-Test split, we would create a 70-30 split like this one and then chain another Split Data component to the right-most output node. That second Split Data would use a 67-33 split, since we're breaking the 30% down into 20% and 10%.
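The chained 70-20-10 split can be sketched in code, here with scikit-learn's train_test_split standing in for the Designer component:

```python
from sklearn.model_selection import train_test_split

rows = list(range(100))

# First split: 70% train, 30% held out
train, holdout = train_test_split(rows, test_size=0.30, random_state=42)

# Second split: break the 30% holdout into 20% validate / 10% test,
# which is a 67-33 split of the holdout itself
validate, test = train_test_split(holdout, test_size=1 / 3, random_state=42)

print(len(train), len(validate), len(test))  # 70 20 10
```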
Choosing an Algorithm
The next step is to use one of the built-in algorithms in Azure ML. You can find all relevant classification algorithms in the Machine Learning Algorithms menu. There are several from which we can choose, but I'll pull Multiclass Boosted Decision Tree, not just because it's the first one on the list, but also because boosted decision trees tend to work pretty well. Most algorithms have a number of hyperparameters, that is, tunable parameters on the algorithm itself. If you're familiar with the literature on a particular algorithm, you can tweak these and end up with considerably better results than the defaults. If you aren't familiar, the defaults often provide a reasonable starting point.
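If you'd like to see what those hyperparameters look like in code, here's a rough boosted decision tree using scikit-learn. This is a stand-in, not the Designer's actual implementation, and the parameter values are just illustrative:

```python
from sklearn.ensemble import GradientBoostingClassifier

# A scikit-learn stand-in for a boosted decision tree, with a few of the
# usual hyperparameters spelled out; the Designer component exposes
# similar knobs (number of trees, learning rate, tree size)
model = GradientBoostingClassifier(
    n_estimators=100,   # number of boosting rounds (trees)
    learning_rate=0.1,  # shrinkage applied to each tree's contribution
    max_depth=3,        # depth of each individual tree
)
# At this point the model is still untrained, much like the
# UntrainedModelDirectory output of the Designer component
```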
Note that the algorithm doesn’t take any inputs. It has just one output: an UntrainedModelDirectory. This means that we need another component to perform the training.
Training a Model
The next component is the Train Model component, which you can find in the Model Training menu. This component takes in an UntrainedModelDirectory and a DataFrameDirectory and generates a trained model of type ModelDirectory. For our model training, we want to choose the label, or the thing we are trying to predict. In this case, we’re attempting to predict the species, and so we’ll choose that. We also have the ability to turn on model explanations, so that we can understand which features were most important in deciding what kind of penguin this is.
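In code terms, training pairs the untrained algorithm with labeled data and a chosen label column. A sketch with scikit-learn, using the built-in iris dataset as a stand-in for our penguins:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier

# Load a small multiclass dataset; the target y plays the role of the
# "label" column we pick in the Train Model component
X, y = load_iris(return_X_y=True)

# Combine the untrained model with labeled data to produce a trained model
model = GradientBoostingClassifier(random_state=42)
model.fit(X, y)
```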
Checking Model Quality
The next component we need is the Score Model component, which we can find in the Model Scoring & Evaluation menu. This takes in two things: a ModelDirectory and a DataFrameDirectory. Its purpose is to take the trained model and expose it to new data that the model has not seen. Then, we will determine how well that model did.
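Scoring, sketched with the same stand-ins (scikit-learn, iris in place of penguins):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42)

model = GradientBoostingClassifier(random_state=42).fit(X_train, y_train)

# Scoring: expose the trained model to data it has not seen, recording
# both the predicted class and the per-class likelihood estimates
preds = model.predict(X_test)
probs = model.predict_proba(X_test)  # one probability per class, per row
```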
Evaluating the Model
The final step in our pipeline is an evaluation step. Specifically, the Evaluate Model component, which you can also find in the Model Scoring & Evaluation menu. This component takes in two inputs: a DataFrameDirectory with scored output data, and an optional second DataFrameDirectory, which you can use if you’d like to compare two models and choose the better one.
The evaluation component also outputs a DataFrameDirectory, which includes evaluation metrics.
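To sketch the evaluation step with the same stand-ins, boiling the scored output down to summary metrics like the ones the Designer reports:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, precision_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42)

model = GradientBoostingClassifier(random_state=42).fit(X_train, y_train)
preds = model.predict(X_test)

# Compare predictions against the true labels from the held-out test set
accuracy = accuracy_score(y_test, preds)
precision = precision_score(y_test, preds, average="macro")
```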
Now we have a training plan in place, so let's select Submit and get this training going! We have the opportunity to select an existing experiment or create a new one. I'm going to select the existing penguin-classifier experiment, so that we can compare how we do here with the AutoML version from the prior post.
This training may take a while. As each component starts, you’ll see a Running indicator. When components complete, you’ll see the status indicator change to read Completed, assuming you did everything right.
Note that each component tends to run individually—this is not really a streaming solution, so we perform one operation at a time and move the data to the next step. That means that long pipelines with lots of data may take longer than alternative solutions based in Python, R, or some other programming language. On the plus side, however, we will get a good deal of log information out of it and it’s not like you’re the one doing all of the work here…
Anyhow, once everything is complete, we can right-click on Score Model and choose Scored_dataset from the Preview data menu.
From there, you'll get a fly-out panel with each of the inputs, the actual species, the model's estimation of likelihood for each possible class, and the model's final judgment on class. In many cases, the model is supremely confident in the answer; in other cases, like the one I've highlighted, you can see it's not quite as clear-cut.
This micro-level analysis is useful, but if you right-click on Evaluate Model and preview the evaluation results, you’ll see how our model did.
It’s not as good as what AutoML came up with, but considering that we did zero hyperparameter tuning and just dragged and dropped some stuff, getting 96% accuracy and 97% precision indicates that it’s pretty easy to separate these penguins.
In today's post, we built a training pipeline. In the next post, we are going to see how we can productionize this model and see what the deployment options look like.