23.5 Designer: Selecting Algorithm and Features
Algorithm Selection
Regression Model Overview
The AMLS Designer offers several of the supervised regression algorithms listed in the prior section. Below is a summary of those available:
Linear Regression: fast to train; assumes a linear relationship between the features and the label
Decision Forest Regression: a strong combination of accuracy and fast training
Boosted Decision Tree Regression: still fast and accurate, but requires a large memory footprint
Neural Network Regression: often more accurate than the others, but slow to train; setting the hyperparameters correctly is important to get good results
Poisson Regression: predicts event counts, so the label must be a non-negative integer
Follow along with the video below to learn how to use these algorithms in the AMLS Designer:
Classification Algorithm Overview
Microsoft has implemented a nice variety of two-class and multi-class algorithms for us to use in the AMLS Designer. Follow along with the videos below to learn how to use them. We will use the bikebuyers.csv dataset for all classification algorithm examples.
Feature Selection
The Permutation Feature Importance pill, which computes permutation feature importance scores for feature variables given a trained model and a test dataset, is also meant to help you select the best features. However, its advantages and disadvantages are the opposite of those of the filter-based feature selection pill. Follow along with the video below and see if you can identify its high-level advantages and disadvantages compared to the filter-based feature selection pill, and consider in what situations it may be useful.
Once again, let's review the important points from the video:
PFI description
Calculated after model training, PFI is a measure of how much each feature is contributing to the overall model fit. The process includes:
1. Train the model on the original dataset with all features included
2. Calculate the baseline model fit (e.g. accuracy, R2, MAE, RMSE, etc.)
3. Permute the first feature by randomly shuffling its values within the validation set, breaking that feature's relationship with the label within the overall model
4. Evaluate the overall model fit again after the permutation
5. Calculate the importance of that feature as the difference in model fit before and after the permutation
6. Repeat steps 3-5 for each feature in the model
7. Rank-sort the features in order of importance
Positive PFI scores indicate that the feature is contributing to model performance
Negative PFI scores indicate that the model actually performed better after the feature was shuffled; such a feature is adding noise (often a symptom of overfitting) and is a candidate for removal
Positive scores that are close to zero relative to the higher positive scores may also be removed; this is a judgement call based on domain knowledge/theory.
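The process described above can be sketched in plain Python. This is an illustrative sketch only, not anything the AMLS Designer exposes; the function names, the toy model, and the sample data are all invented for the example:

```python
import random

def r2_score(y_true, y_pred):
    """Coefficient of Determination (R2): 1 - SSE/SST."""
    mean_y = sum(y_true) / len(y_true)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - sse / sst

def permutation_importance(predict, X, y, score_fn, n_repeats=5, seed=0):
    """PFI: average score drop after shuffling each feature column."""
    rng = random.Random(seed)
    baseline = score_fn(y, predict(X))          # step 2: baseline model fit
    importances = []
    for j in range(len(X[0])):                  # step 6: repeat for each feature
        drops = []
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)                    # step 3: permute feature j
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            # steps 4-5: re-evaluate fit, importance = baseline - permuted fit
            drops.append(baseline - score_fn(y, predict(X_perm)))
        importances.append(sum(drops) / n_repeats)
    return importances

# Toy validation set: the label depends only on the first feature.
X_val = [[i, (i * 7) % 5] for i in range(20)]
y_val = [2 * i for i in range(20)]
predict = lambda rows: [2 * r[0] for r in rows]  # a stand-in "trained" model
scores = permutation_importance(predict, X_val, y_val, r2_score)
# scores[0] is clearly positive; scores[1] is exactly 0.0, because the
# model ignores the second feature entirely.
```

Shuffling the relevant feature breaks its relationship with the label and the R2 drops, yielding a positive importance; shuffling the irrelevant feature changes nothing, yielding an importance of zero.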
Advantages
Based on a trained model; therefore, it takes into consideration the intercorrelation among features for a more accurate indication of feature importance
Disadvantages
Based on a trained model; therefore, it takes much longer to calculate
As implemented in the AMLS Designer, it cannot be used to dynamically select features before model fitting, making it unsuitable for scenarios where the top n features change significantly over time
Metrics for measuring performance: the metric you choose determines the scale on which the PFI score is evaluated
Classification model metrics
Accuracy
Precision
Recall
Regression model metrics
Mean Absolute Error (MAE)
Root Mean Squared Error (RMSE)
Relative Absolute Error
Relative Squared Error
Coefficient of Determination (R2)
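To make these metric definitions concrete, here is a minimal plain-Python sketch of how each one is calculated; the function names and sample values are invented for illustration:

```python
def regression_metrics(y_true, y_pred):
    """MAE, RMSE, Relative Absolute/Squared Error, and R2 on a validation set."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_y = sum(y_true) / n
    sse = sum(e ** 2 for e in errors)
    sst = sum((t - mean_y) ** 2 for t in y_true)
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": (sse / n) ** 0.5,
        "RAE": sum(abs(e) for e in errors) / sum(abs(t - mean_y) for t in y_true),
        "RSE": sse / sst,
        "R2": 1.0 - sse / sst,  # note that R2 = 1 - RSE
    }

def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, and recall for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return {
        "Accuracy": correct / len(y_true),
        "Precision": tp / (tp + fp),
        "Recall": tp / (tp + fn),
    }

reg = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.5, 5.0, 7.5, 9.0])
# MAE = 0.25, RMSE ~ 0.354, RAE = 0.125, RSE = 0.025, R2 = 0.975
cls = classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
# Accuracy = 0.6, Precision = 2/3, Recall = 2/3
```

Because the PFI score is the difference in one of these metrics before and after permutation, the scale of the scores depends directly on which metric you pick (e.g. a drop in R2 versus a rise in RMSE).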
Here's a question: will you ever use the permutation feature importance pill at the same time as the filter-based feature selection pill? Technically you can, and that may be useful in some cases, but typically you won't. The main case where it helps is when you want to narrow a large set of features down to a smaller set (using permutation feature importance) but don't want to eliminate ALL of the features with lesser value to your model, because you think the relative importance of the remaining features will change over time.
Assessment
Complete the assessment below: