Exploring search results¶
After finishing a pipeline search, we can inspect the results. First, let’s build a search of 5 different pipelines to explore.
[1]:
import evalml
from evalml import AutoMLSearch
X, y = evalml.demos.load_breast_cancer()
automl = AutoMLSearch(problem_type='binary',
                      objective="f1",
                      max_pipelines=5)
automl.search(X, y)
Generating pipelines to search over...
*****************************
* Beginning pipeline search *
*****************************
Optimizing for F1.
Greater score is better.
Searching up to 5 pipelines.
Allowed model families: xgboost, catboost, random_forest, linear_model
✔ Mode Baseline Binary Classification... 0%| | Elapsed:00:00
✔ CatBoost Classifier w/ Simple Imput... 20%|██ | Elapsed:00:22
✔ Logistic Regression Classifier w/ S... 40%|████ | Elapsed:00:23
✔ Random Forest Classifier w/ Simple ... 60%|██████ | Elapsed:00:25
▹ XGBoost Classifier w/ Simple Imputer: 80%|████████ | Elapsed:00:25
[21:04:10] WARNING: ../src/learner.cc:1061: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'binary:logistic' was changed from 'error' to 'logloss'. Explicitly set eval_metric if you'd like to restore the old behavior.
/home/docs/checkouts/readthedocs.org/user_builds/feature-labs-inc-evalml/envs/v0.11.0/lib/python3.7/site-packages/xgboost/sklearn.py:888: UserWarning:
The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].
✔ XGBoost Classifier w/ Simple Imputer: 80%|████████ | Elapsed:00:25
✔ Optimization finished 80%|████████ | Elapsed:00:25
View Rankings¶
A summary of all the pipelines built can be returned as a pandas DataFrame, sorted by score from best to worst. EvalML determines from the objective function whether a higher or lower score is better.
[2]:
automl.rankings
[2]:
 | id | pipeline_name | score | high_variance_cv | parameters |
---|---|---|---|---|---|
0 | 2 | Logistic Regression Classifier w/ Simple Imput... | 0.980447 | False | {'Simple Imputer': {'impute_strategy': 'most_f... |
1 | 1 | CatBoost Classifier w/ Simple Imputer | 0.976333 | False | {'Simple Imputer': {'impute_strategy': 'most_f... |
2 | 4 | XGBoost Classifier w/ Simple Imputer | 0.970577 | False | {'Simple Imputer': {'impute_strategy': 'most_f... |
3 | 3 | Random Forest Classifier w/ Simple Imputer | 0.966629 | False | {'Simple Imputer': {'impute_strategy': 'most_f... |
4 | 0 | Mode Baseline Binary Classification Pipeline | 0.771060 | False | {'Baseline Classifier': {'strategy': 'random_w... |
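The ordering above can be reproduced by hand. The following is a minimal sketch, not EvalML's implementation: it uses plain tuples as a stand-in for the rankings DataFrame, with ids, names, and scores copied from the table. The `rank` helper and its `greater_is_better` argument are hypothetical names introduced for illustration.

```python
# Hypothetical stand-in for automl.rankings: (id, pipeline_name, score) tuples
# copied from the table above, listed in the order the pipelines were searched.
results = [
    (0, "Mode Baseline Binary Classification Pipeline", 0.771060),
    (1, "CatBoost Classifier w/ Simple Imputer", 0.976333),
    (2, "Logistic Regression Classifier w/ Simple Imputer + Standard Scaler", 0.980447),
    (3, "Random Forest Classifier w/ Simple Imputer", 0.966629),
    (4, "XGBoost Classifier w/ Simple Imputer", 0.970577),
]


def rank(results, greater_is_better):
    """Sort pipelines best-first for the given objective direction (illustrative helper)."""
    return sorted(results, key=lambda row: row[2], reverse=greater_is_better)


# F1 is a "greater score is better" objective, so the highest score ranks first.
ranked_ids = [pipeline_id for pipeline_id, _, _ in rank(results, greater_is_better=True)]
print(ranked_ids)
```

For an objective where lower is better, such as log loss, the same helper would be called with `greater_is_better=False`.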
Describe Pipeline¶
Each pipeline is given an id. We can get more information about any particular pipeline using that id. Here, we will get more information about the pipeline with id = 1.
[3]:
automl.describe_pipeline(1)
*****************************************
* CatBoost Classifier w/ Simple Imputer *
*****************************************
Problem Type: Binary Classification
Model Family: CatBoost
Pipeline Steps
==============
1. Simple Imputer
* impute_strategy : most_frequent
* fill_value : None
2. CatBoost Classifier
* n_estimators : 1000
* eta : 0.03
* max_depth : 6
* bootstrap_type : None
Training
========
Training for Binary Classification problems.
Total training time (including CV): 22.9 seconds
Cross Validation
----------------
F1 Accuracy Binary Balanced Accuracy Binary Precision AUC Log Loss Binary MCC Binary # Training # Testing
0 0.967 0.958 0.949 0.951 0.995 0.106 0.910 379.0 190.0
1 0.983 0.979 0.975 0.975 0.994 0.082 0.955 379.0 190.0
2 0.979 0.974 0.976 0.991 0.990 0.093 0.944 380.0 189.0
mean 0.976 0.970 0.967 0.973 0.993 0.094 0.936 - -
std 0.008 0.011 0.015 0.020 0.003 0.012 0.024 - -
coef of var 0.009 0.011 0.016 0.021 0.003 0.128 0.025 - -
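The summary rows of the cross-validation table can be checked by hand. The sketch below recomputes the `mean`, `std`, and `coef of var` entries of the F1 column from the three per-fold F1 scores, using the unrounded values reported for pipeline 1 in the raw results further down this page. It assumes the sample standard deviation (n - 1 in the denominator), which is what reproduces the rounded figures shown here; how EvalML computes these rows internally is not confirmed by this page.

```python
import statistics

# Per-fold F1 scores for pipeline 1 (CatBoost), taken from the raw results.
f1_folds = [0.9669421487603305, 0.9833333333333334, 0.9787234042553192]

mean = statistics.mean(f1_folds)
std = statistics.stdev(f1_folds)  # sample standard deviation (n - 1), an assumption
coef_of_var = std / mean          # standard deviation relative to the mean

print(round(mean, 3), round(std, 3), round(coef_of_var, 3))
# prints: 0.976 0.008 0.009
```

A small coefficient of variation means the folds agree closely with each other.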
Get Pipeline¶
We can also retrieve the pipeline object for any pipeline via its id:
[4]:
automl.get_pipeline(1)
[4]:
<evalml.pipelines.utils.make_pipeline.<locals>.GeneratedPipeline at 0x7f26597c0850>
Get best pipeline¶
If we specifically want the best pipeline, there is a convenient accessor:
[5]:
automl.best_pipeline
[5]:
<evalml.pipelines.utils.make_pipeline.<locals>.GeneratedPipeline at 0x7f26d8cfc390>
Feature Importance¶
After fitting a pipeline, we can get the importance associated with each of its features:
[6]:
pipeline = automl.get_pipeline(1)
pipeline.fit(X, y)
pipeline.feature_importance
[6]:
 | feature | importance |
---|---|---|
0 | worst texture | 11.284484 |
1 | worst concave points | 8.105820 |
2 | worst perimeter | 7.989750 |
3 | worst radius | 7.872959 |
4 | worst area | 7.851859 |
5 | mean concave points | 7.098117 |
6 | mean texture | 5.919744 |
7 | worst smoothness | 4.831183 |
8 | worst concavity | 4.665609 |
9 | area error | 4.204315 |
10 | compactness error | 2.616379 |
11 | worst symmetry | 2.088257 |
12 | mean concavity | 2.015011 |
13 | radius error | 1.948857 |
14 | concave points error | 1.875384 |
15 | mean compactness | 1.824212 |
16 | perimeter error | 1.705293 |
17 | mean smoothness | 1.697390 |
18 | worst fractal dimension | 1.629284 |
19 | mean radius | 1.585845 |
20 | mean area | 1.479429 |
21 | smoothness error | 1.460370 |
22 | fractal dimension error | 1.370231 |
23 | mean perimeter | 1.306400 |
24 | texture error | 1.065599 |
25 | mean fractal dimension | 1.000343 |
26 | worst compactness | 0.967447 |
27 | mean symmetry | 0.960967 |
28 | symmetry error | 0.912462 |
29 | concavity error | 0.666999 |
We can also create a bar plot of the feature importances
[7]:
pipeline.graph_feature_importance()
Precision-Recall Curve¶
For binary classification, you can view the precision-recall curve of a classifier
[8]:
# get the predicted probabilities associated with the "true" label
y_pred_proba = pipeline.predict_proba(X)[1]
evalml.pipelines.graph_utils.graph_precision_recall_curve(y, y_pred_proba)
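To make clear what each point on a precision-recall curve represents, here is a minimal pure-Python sketch that computes precision and recall at a single decision threshold. It is not how EvalML builds the plot; the labels and probabilities are made-up toy data, and `precision_recall_at` is a hypothetical helper.

```python
# Toy data (made up for illustration): true labels and predicted
# probabilities of the positive class.
y_true = [0, 0, 1, 1, 1, 0, 1, 1]
y_proba = [0.1, 0.4, 0.35, 0.8, 0.65, 0.7, 0.9, 0.55]


def precision_recall_at(threshold):
    """Return (precision, recall) when predicting positive for proba >= threshold."""
    predicted = [p >= threshold for p in y_proba]
    tp = sum(1 for t, pred in zip(y_true, predicted) if t == 1 and pred)
    fp = sum(1 for t, pred in zip(y_true, predicted) if t == 0 and pred)
    fn = sum(1 for t, pred in zip(y_true, predicted) if t == 1 and not pred)
    # Convention used here: precision is 1.0 when nothing is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


for threshold in (0.5, 0.75):
    print(threshold, precision_recall_at(threshold))
```

Raising the threshold from 0.5 to 0.75 trades recall for precision on this toy data; the curve traces that trade-off across all thresholds.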
ROC Curve¶
For binary and multiclass classification, you can view the ROC curve of a classifier
[9]:
# get the predicted probabilities associated with the "true" label
y_pred_proba = pipeline.predict_proba(X)[1]
evalml.pipelines.graph_utils.graph_roc_curve(y, y_pred_proba)
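Each point on a ROC curve is a (false positive rate, true positive rate) pair at one threshold, and the AUC reported in the cross-validation tables summarizes the whole curve. The sketch below is a pure-Python illustration on made-up toy data for the binary case, not EvalML's implementation. `roc_point` and `auc_by_pairs` are hypothetical helpers; the latter uses the standard interpretation of AUC as the fraction of (positive, negative) pairs that the scores order correctly.

```python
# Toy data (made up for illustration): true labels and predicted
# probabilities of the positive class.
y_true = [0, 0, 1, 1, 1, 0, 1, 1]
y_proba = [0.1, 0.4, 0.35, 0.8, 0.65, 0.7, 0.9, 0.55]

positives = [p for t, p in zip(y_true, y_proba) if t == 1]
negatives = [p for t, p in zip(y_true, y_proba) if t == 0]


def roc_point(threshold):
    """Return (false positive rate, true positive rate) at one threshold."""
    tpr = sum(p >= threshold for p in positives) / len(positives)
    fpr = sum(p >= threshold for p in negatives) / len(negatives)
    return fpr, tpr


def auc_by_pairs():
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    total = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                total += 1.0
            elif pos == neg:
                total += 0.5
    return total / (len(positives) * len(negatives))


print(roc_point(0.5), roc_point(0.75))
print(auc_by_pairs())
```

An AUC of 0.5 corresponds to random ordering, which is what the baseline pipeline scores in the raw results on this page.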
Confusion Matrix¶
For binary or multiclass classification, you can view a confusion matrix of the classifier’s predictions
[10]:
y_pred = pipeline.predict(X)
evalml.pipelines.graph_utils.graph_confusion_matrix(y, y_pred)
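For reference, a confusion matrix is a table of counts indexed by true label and predicted label. The following minimal sketch builds one with the standard library on made-up toy labels. It is an illustration only, the `confusion_matrix` helper is hypothetical, and the row and column orientation chosen here (rows are true labels, columns are predictions) is an assumption rather than a statement about EvalML's plot.

```python
from collections import Counter

# Toy data (made up for illustration).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]


def confusion_matrix(y_true, y_pred, labels=(0, 1)):
    """Return a nested list where entry [i][j] counts true label i predicted as j."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(true, pred)] for pred in labels] for true in labels]


matrix = confusion_matrix(y_true, y_pred)
print(matrix)
# prints: [[3, 1], [1, 3]]
```

The diagonal holds the correct predictions; off-diagonal entries are the false positives and false negatives.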
Access raw results¶
You can also get access to all the underlying data, like this:
[11]:
automl.results
[11]:
{'pipeline_results': {0: {'id': 0,
'pipeline_name': 'Mode Baseline Binary Classification Pipeline',
'pipeline_class': evalml.pipelines.classification.baseline_binary.ModeBaselineBinaryPipeline,
'pipeline_summary': 'Baseline Classifier',
'parameters': {'Baseline Classifier': {'strategy': 'random_weighted'}},
'score': 0.7710601157203097,
'high_variance_cv': False,
'training_time': 0.03474116325378418,
'cv_data': [{'all_objective_scores': OrderedDict([('F1',
0.7702265372168284),
('Accuracy Binary', 0.6263157894736842),
('Balanced Accuracy Binary', 0.5),
('Precision', 0.6263157894736842),
('AUC', 0.5),
('Log Loss Binary', 0.6608932451679239),
('MCC Binary', 0.0),
('# Training', 379),
('# Testing', 190)]),
'score': 0.7702265372168284,
'binary_classification_threshold': 0.5},
{'all_objective_scores': OrderedDict([('F1', 0.7702265372168284),
('Accuracy Binary', 0.6263157894736842),
('Balanced Accuracy Binary', 0.5),
('Precision', 0.6263157894736842),
('AUC', 0.5),
('Log Loss Binary', 0.6608932451679239),
('MCC Binary', 0.0),
('# Training', 379),
('# Testing', 190)]),
'score': 0.7702265372168284,
'binary_classification_threshold': 0.5},
{'all_objective_scores': OrderedDict([('F1', 0.7727272727272727),
('Accuracy Binary', 0.6296296296296297),
('Balanced Accuracy Binary', 0.5),
('Precision', 0.6296296296296297),
('AUC', 0.5),
('Log Loss Binary', 0.6591759924082952),
('MCC Binary', 0.0),
('# Training', 380),
('# Testing', 189)]),
'score': 0.7727272727272727,
'binary_classification_threshold': 0.5}]},
1: {'id': 1,
'pipeline_name': 'CatBoost Classifier w/ Simple Imputer',
'pipeline_class': evalml.pipelines.utils.make_pipeline.<locals>.GeneratedPipeline,
'pipeline_summary': 'CatBoost Classifier w/ Simple Imputer',
'parameters': {'Simple Imputer': {'impute_strategy': 'most_frequent',
'fill_value': None},
'CatBoost Classifier': {'n_estimators': 1000,
'eta': 0.03,
'max_depth': 6,
'bootstrap_type': None}},
'score': 0.9763329621163277,
'high_variance_cv': False,
'training_time': 22.883776903152466,
'cv_data': [{'all_objective_scores': OrderedDict([('F1',
0.9669421487603305),
('Accuracy Binary', 0.9578947368421052),
('Balanced Accuracy Binary', 0.9493431175287016),
('Precision', 0.9512195121951219),
('AUC', 0.9945555687063559),
('Log Loss Binary', 0.10583268649418161),
('MCC Binary', 0.909956827190137),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9669421487603305,
'binary_classification_threshold': 0.5},
{'all_objective_scores': OrderedDict([('F1', 0.9833333333333334),
('Accuracy Binary', 0.9789473684210527),
('Balanced Accuracy Binary', 0.9746715587643509),
('Precision', 0.9752066115702479),
('AUC', 0.9943188543022844),
('Log Loss Binary', 0.08186397218927995),
('MCC Binary', 0.955011564828661),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9833333333333334,
'binary_classification_threshold': 0.5},
{'all_objective_scores': OrderedDict([('F1', 0.9787234042553192),
('Accuracy Binary', 0.9735449735449735),
('Balanced Accuracy Binary', 0.9760504201680673),
('Precision', 0.9913793103448276),
('AUC', 0.9899159663865547),
('Log Loss Binary', 0.09296190874649334),
('MCC Binary', 0.9443109474170326),
('# Training', 380),
('# Testing', 189)]),
'score': 0.9787234042553192,
'binary_classification_threshold': 0.5}]},
2: {'id': 2,
'pipeline_name': 'Logistic Regression Classifier w/ Simple Imputer + Standard Scaler',
'pipeline_class': evalml.pipelines.utils.make_pipeline.<locals>.GeneratedPipeline,
'pipeline_summary': 'Logistic Regression Classifier w/ Simple Imputer + Standard Scaler',
'parameters': {'Simple Imputer': {'impute_strategy': 'most_frequent',
'fill_value': None},
'Logistic Regression Classifier': {'penalty': 'l2',
'C': 1.0,
'n_jobs': -1}},
'score': 0.9804468499596668,
'high_variance_cv': False,
'training_time': 1.0504083633422852,
'cv_data': [{'all_objective_scores': OrderedDict([('F1',
0.9831932773109243),
('Accuracy Binary', 0.9789473684210527),
('Balanced Accuracy Binary', 0.9775121316132087),
('Precision', 0.9831932773109243),
('AUC', 0.9936087110900698),
('Log Loss Binary', 0.09347817517438463),
('MCC Binary', 0.9550242632264173),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9831932773109243,
'binary_classification_threshold': 0.5},
{'all_objective_scores': OrderedDict([('F1', 0.9794238683127572),
('Accuracy Binary', 0.9736842105263158),
('Balanced Accuracy Binary', 0.9647887323943662),
('Precision', 0.9596774193548387),
('AUC', 0.9975144987572493),
('Log Loss Binary', 0.08320464479579018),
('MCC Binary', 0.9445075449666159),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9794238683127572,
'binary_classification_threshold': 0.5},
{'all_objective_scores': OrderedDict([('F1', 0.9787234042553192),
('Accuracy Binary', 0.9735449735449735),
('Balanced Accuracy Binary', 0.9760504201680673),
('Precision', 0.9913793103448276),
('AUC', 0.9906362545018007),
('Log Loss Binary', 0.09680859555948443),
('MCC Binary', 0.9443109474170326),
('# Training', 380),
('# Testing', 189)]),
'score': 0.9787234042553192,
'binary_classification_threshold': 0.5}]},
3: {'id': 3,
'pipeline_name': 'Random Forest Classifier w/ Simple Imputer',
'pipeline_class': evalml.pipelines.utils.make_pipeline.<locals>.GeneratedPipeline,
'pipeline_summary': 'Random Forest Classifier w/ Simple Imputer',
'parameters': {'Simple Imputer': {'impute_strategy': 'most_frequent',
'fill_value': None},
'Random Forest Classifier': {'n_estimators': 100,
'max_depth': 6,
'n_jobs': -1}},
'score': 0.9666291581686476,
'high_variance_cv': False,
'training_time': 1.4155998229980469,
'cv_data': [{'all_objective_scores': OrderedDict([('F1',
0.9543568464730291),
('Accuracy Binary', 0.9421052631578948),
('Balanced Accuracy Binary', 0.9338975026630371),
('Precision', 0.9426229508196722),
('AUC', 0.9893478518167831),
('Log Loss Binary', 0.13984688783161608),
('MCC Binary', 0.8757606542930872),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9543568464730291,
'binary_classification_threshold': 0.5},
{'all_objective_scores': OrderedDict([('F1', 0.9709543568464729),
('Accuracy Binary', 0.9631578947368421),
('Balanced Accuracy Binary', 0.9563853710498283),
('Precision', 0.9590163934426229),
('AUC', 0.989347851816783),
('Log Loss Binary', 0.12010721015394274),
('MCC Binary', 0.9211492315750531),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9709543568464729,
'binary_classification_threshold': 0.5},
{'all_objective_scores': OrderedDict([('F1', 0.9745762711864406),
('Accuracy Binary', 0.9682539682539683),
('Balanced Accuracy Binary', 0.9689075630252101),
('Precision', 0.9829059829059829),
('AUC', 0.9927971188475391),
('Log Loss Binary', 0.10765634363120971),
('MCC Binary', 0.9325680982740896),
('# Training', 380),
('# Testing', 189)]),
'score': 0.9745762711864406,
'binary_classification_threshold': 0.5}]},
4: {'id': 4,
'pipeline_name': 'XGBoost Classifier w/ Simple Imputer',
'pipeline_class': evalml.pipelines.utils.make_pipeline.<locals>.GeneratedPipeline,
'pipeline_summary': 'XGBoost Classifier w/ Simple Imputer',
'parameters': {'Simple Imputer': {'impute_strategy': 'most_frequent',
'fill_value': None},
'XGBoost Classifier': {'eta': 0.1,
'max_depth': 6,
'min_child_weight': 1,
'n_estimators': 100}},
'score': 0.9705772481706093,
'high_variance_cv': False,
'training_time': 0.4369237422943115,
'cv_data': [{'all_objective_scores': OrderedDict([('F1',
0.9666666666666667),
('Accuracy Binary', 0.9578947368421052),
('Balanced Accuracy Binary', 0.9521836903775595),
('Precision', 0.9586776859504132),
('AUC', 0.9915966386554622),
('Log Loss Binary', 0.11449876085695762),
('MCC Binary', 0.9097672817424011),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9666666666666667,
'binary_classification_threshold': 0.5},
{'all_objective_scores': OrderedDict([('F1', 0.979253112033195),
('Accuracy Binary', 0.9736842105263158),
('Balanced Accuracy Binary', 0.9676293052432241),
('Precision', 0.9672131147540983),
('AUC', 0.9959758551307847),
('Log Loss Binary', 0.07421583775339011),
('MCC Binary', 0.943843520216036),
('# Training', 379),
('# Testing', 190)]),
'score': 0.979253112033195,
'binary_classification_threshold': 0.5},
{'all_objective_scores': OrderedDict([('F1', 0.9658119658119659),
('Accuracy Binary', 0.9576719576719577),
('Balanced Accuracy Binary', 0.9605042016806722),
('Precision', 0.9826086956521739),
('AUC', 0.9885954381752701),
('Log Loss Binary', 0.11418110851220609),
('MCC Binary', 0.9112159507396058),
('# Training', 380),
('# Testing', 189)]),
'score': 0.9658119658119659,
'binary_classification_threshold': 0.5}]}},
'search_order': [0, 1, 2, 3, 4]}
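The nested structure above can be walked with ordinary dictionary access. The sketch below works on a trimmed excerpt of the output (two pipelines, with only the `score` field kept for each fold, and `search_order` shortened to match) so that it runs without repeating the search. In this output the top-level `score` of each pipeline equals the mean of its per-fold scores; treating that as a general rule is an assumption.

```python
import math

# Trimmed excerpt of the automl.results output shown above.
results = {
    "pipeline_results": {
        0: {
            "id": 0,
            "pipeline_name": "Mode Baseline Binary Classification Pipeline",
            "score": 0.7710601157203097,
            "cv_data": [
                {"score": 0.7702265372168284},
                {"score": 0.7702265372168284},
                {"score": 0.7727272727272727},
            ],
        },
        2: {
            "id": 2,
            "pipeline_name": "Logistic Regression Classifier w/ Simple Imputer + Standard Scaler",
            "score": 0.9804468499596668,
            "cv_data": [
                {"score": 0.9831932773109243},
                {"score": 0.9794238683127572},
                {"score": 0.9787234042553192},
            ],
        },
    },
    "search_order": [0, 2],  # shortened to the pipelines kept in this excerpt
}

fold_means = {}
for pipeline_id in results["search_order"]:
    entry = results["pipeline_results"][pipeline_id]
    fold_scores = [fold["score"] for fold in entry["cv_data"]]
    fold_means[pipeline_id] = sum(fold_scores) / len(fold_scores)
    matches = math.isclose(fold_means[pipeline_id], entry["score"])
    print(pipeline_id, entry["pipeline_name"], fold_means[pipeline_id], matches)
```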