Exploring search results¶

After finishing a pipeline search, we can inspect the results. First, let's build a search of 5 different pipelines to explore.

[1]:

import evalml
from evalml import AutoMLSearch

X, y = evalml.demos.load_breast_cancer()

automl = AutoMLSearch(problem_type='binary',
                      objective="f1",
                      max_pipelines=5)

automl.search(X, y)

Generating pipelines to search over...
*****************************
* Beginning pipeline search *
*****************************

Optimizing for F1.
Greater score is better.

Searching up to 5 pipelines.
Allowed model families: xgboost, catboost, random_forest, linear_model

✔ Mode Baseline Binary Classification...     0%|          | Elapsed:00:00
✔ CatBoost Classifier w/ Simple Imput...    20%|██        | Elapsed:00:22
✔ Logistic Regression Classifier w/ S...    40%|████      | Elapsed:00:23
✔ Random Forest Classifier w/ Simple ...    60%|██████    | Elapsed:00:25
▹ XGBoost Classifier w/ Simple Imputer:     80%|████████  | Elapsed:00:25
[21:04:10] WARNING: ../src/learner.cc:1061: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'binary:logistic' was changed from 'error' to 'logloss'. Explicitly set eval_metric if you'd like to restore the old behavior.

/home/docs/checkouts/readthedocs.org/user_builds/feature-labs-inc-evalml/envs/v0.11.0/lib/python3.7/site-packages/xgboost/sklearn.py:888: UserWarning:

The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].

✔ XGBoost Classifier w/ Simple Imputer:     80%|████████  | Elapsed:00:25
✔ Optimization finished                     80%|████████  | Elapsed:00:25

View Rankings¶

A summary of all the pipelines built can be returned as a pandas DataFrame, sorted by score from best to worst. EvalML determines from the objective function whether a higher or lower score is better.

[2]:

automl.rankings

[2]:

	id	pipeline_name	score	high_variance_cv	parameters
0	2	Logistic Regression Classifier w/ Simple Imput...	0.980447	False	{'Simple Imputer': {'impute_strategy': 'most_f...
1	1	CatBoost Classifier w/ Simple Imputer	0.976333	False	{'Simple Imputer': {'impute_strategy': 'most_f...
2	4	XGBoost Classifier w/ Simple Imputer	0.970577	False	{'Simple Imputer': {'impute_strategy': 'most_f...
3	3	Random Forest Classifier w/ Simple Imputer	0.966629	False	{'Simple Imputer': {'impute_strategy': 'most_f...
4	0	Mode Baseline Binary Classification Pipeline	0.771060	False	{'Baseline Classifier': {'strategy': 'random_w...
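
The ordering rule can be sketched in plain Python. The scores are copied from the table above; the `rank` helper and its `greater_is_better` flag are illustrative names, not part of the EvalML API.

```python
# Scores copied from the rankings table above, keyed by pipeline id.
scores = {
    0: 0.771060,  # Mode Baseline Binary Classification Pipeline
    1: 0.976333,  # CatBoost Classifier w/ Simple Imputer
    2: 0.980447,  # Logistic Regression Classifier w/ Simple Imputer + Standard Scaler
    3: 0.966629,  # Random Forest Classifier w/ Simple Imputer
    4: 0.970577,  # XGBoost Classifier w/ Simple Imputer
}


def rank(scores, greater_is_better):
    """Return pipeline ids ordered from best to worst (hypothetical helper)."""
    return sorted(scores, key=scores.get, reverse=greater_is_better)


# F1 is a "greater score is better" objective, so the highest score comes first.
ranked_ids = rank(scores, greater_is_better=True)
print(ranked_ids)  # [2, 1, 4, 3, 0]
```

For an objective where lower is better, such as log loss, the same helper would be called with `greater_is_better=False`.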

Describe Pipeline¶

Each pipeline is given an id. We can get more information about any particular pipeline using that id. Here, we will get more information about the pipeline with id = 1.

[3]:

automl.describe_pipeline(1)

*****************************************
* CatBoost Classifier w/ Simple Imputer *
*****************************************

Problem Type: Binary Classification
Model Family: CatBoost

Pipeline Steps
==============
1. Simple Imputer
         * impute_strategy : most_frequent
         * fill_value : None
2. CatBoost Classifier
         * n_estimators : 1000
         * eta : 0.03
         * max_depth : 6
         * bootstrap_type : None

Training
========
Training for Binary Classification problems.
Total training time (including CV): 22.9 seconds

Cross Validation
----------------
               F1  Accuracy Binary  Balanced Accuracy Binary  Precision   AUC  Log Loss Binary  MCC Binary # Training # Testing
0           0.967            0.958                     0.949      0.951 0.995            0.106       0.910      379.0     190.0
1           0.983            0.979                     0.975      0.975 0.994            0.082       0.955      379.0     190.0
2           0.979            0.974                     0.976      0.991 0.990            0.093       0.944      380.0     189.0
mean        0.976            0.970                     0.967      0.973 0.993            0.094       0.936          -         -
std         0.008            0.011                     0.015      0.020 0.003            0.012       0.024          -         -
coef of var 0.009            0.011                     0.016      0.021 0.003            0.128       0.025          -         -
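
The summary rows of the cross-validation table can be reproduced from the per-fold values. The sketch below uses the three unrounded F1 fold scores for this pipeline, as listed in the raw results on this page, and assumes the sample standard deviation (dividing by n - 1), which matches the rounded figures in the table.

```python
import statistics

# Unrounded per-fold F1 scores for pipeline 1 (CatBoost), from the raw results.
f1_folds = [0.9669421487603305, 0.9833333333333334, 0.9787234042553192]

mean = statistics.mean(f1_folds)
std = statistics.stdev(f1_folds)  # sample standard deviation (n - 1); an assumption
coef_of_var = std / mean          # standard deviation relative to the mean

print(f"mean={mean:.3f} std={std:.3f} coef of var={coef_of_var:.3f}")
# mean=0.976 std=0.008 coef of var=0.009
```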

Get Pipeline¶

We can also get the object for any pipeline by its id:

[4]:

automl.get_pipeline(1)

[4]:

<evalml.pipelines.utils.make_pipeline.<locals>.GeneratedPipeline at 0x7f26597c0850>

Get best pipeline¶

If we specifically want the best pipeline, there is a convenient accessor:

[5]:

automl.best_pipeline

[5]:

<evalml.pipelines.utils.make_pipeline.<locals>.GeneratedPipeline at 0x7f26d8cfc390>

Feature Importance¶

We can get the importance associated with each feature of the resulting pipeline:

[6]:

pipeline = automl.get_pipeline(1)
pipeline.fit(X, y)
pipeline.feature_importance

[6]:

	feature	importance
0	worst texture	11.284484
1	worst concave points	8.105820
2	worst perimeter	7.989750
3	worst radius	7.872959
4	worst area	7.851859
5	mean concave points	7.098117
6	mean texture	5.919744
7	worst smoothness	4.831183
8	worst concavity	4.665609
9	area error	4.204315
10	compactness error	2.616379
11	worst symmetry	2.088257
12	mean concavity	2.015011
13	radius error	1.948857
14	concave points error	1.875384
15	mean compactness	1.824212
16	perimeter error	1.705293
17	mean smoothness	1.697390
18	worst fractal dimension	1.629284
19	mean radius	1.585845
20	mean area	1.479429
21	smoothness error	1.460370
22	fractal dimension error	1.370231
23	mean perimeter	1.306400
24	texture error	1.065599
25	mean fractal dimension	1.000343
26	worst compactness	0.967447
27	mean symmetry	0.960967
28	symmetry error	0.912462
29	concavity error	0.666999
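
When only the most influential features matter, the table can be trimmed to its top rows. The sketch below works on plain `(feature, importance)` pairs copied from a subset of the rows above; `top_k` is a hypothetical helper, not an EvalML function.

```python
# A subset of rows from the table above, deliberately out of order.
importances = [
    ("mean texture", 5.919744),
    ("worst radius", 7.872959),
    ("worst texture", 11.284484),
    ("concavity error", 0.666999),
    ("worst perimeter", 7.989750),
    ("worst concave points", 8.105820),
]


def top_k(importances, k):
    """Return the k pairs with the largest importance, largest first."""
    return sorted(importances, key=lambda pair: pair[1], reverse=True)[:k]


for feature, importance in top_k(importances, 3):
    print(f"{feature}: {importance:.2f}")
```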

We can also create a bar plot of the feature importances:

[7]:

pipeline.graph_feature_importance()

Precision-Recall Curve¶

For binary classification, you can view the precision-recall curve of a classifier:

[8]:

# get the predicted probabilities associated with the "true" label
y_pred_proba = pipeline.predict_proba(X)[1]
evalml.pipelines.graph_utils.graph_precision_recall_curve(y, y_pred_proba)
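
To make clear what the curve plots, here is a minimal sketch of how its points can be computed: each distinct predicted probability is used as a decision threshold, and precision and recall are measured at that threshold. The labels and scores are made-up toy data, and `precision_recall_at` is a hypothetical helper; EvalML's own plotting code may compute the curve differently.

```python
def precision_recall_at(y_true, y_score, threshold):
    """Precision and recall when scores >= threshold are predicted positive."""
    pairs = list(zip(y_true, y_score))
    tp = sum(1 for t, s in pairs if s >= threshold and t == 1)
    fp = sum(1 for t, s in pairs if s >= threshold and t == 0)
    fn = sum(1 for t, s in pairs if s < threshold and t == 1)
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data (hypothetical): true labels and predicted positive-class probabilities.
y_true = [0, 0, 1, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]

# One (precision, recall) point per distinct threshold, lowest threshold first.
curve = [precision_recall_at(y_true, y_score, t) for t in sorted(set(y_score))]
for precision, recall in curve:
    print(f"precision={precision:.2f} recall={recall:.2f}")
```

Raising the threshold generally trades recall for precision, which is the trade-off the plotted curve visualizes.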

ROC Curve¶

For binary and multiclass classification, you can view the ROC curve of a classifier:

[9]:

# get the predicted probabilities associated with the "true" label
y_pred_proba = pipeline.predict_proba(X)[1]
evalml.pipelines.graph_utils.graph_roc_curve(y, y_pred_proba)
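
As a rough illustration of what an ROC curve contains, the sketch below computes the (false positive rate, true positive rate) points for the binary case and integrates them with the trapezoidal rule to get the AUC. The data is made up, and `roc_points` and `auc` are hypothetical helpers rather than EvalML functions.

```python
def roc_points(y_true, y_score):
    """Return (FPR, TPR) points, starting at (0, 0), for decreasing thresholds."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(y_score), reverse=True):
        tp = sum(1 for t, s in zip(y_true, y_score) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, y_score) if s >= threshold and t == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve using the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy data (hypothetical): true labels and predicted positive-class probabilities.
y_true = [0, 0, 1, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]

points = roc_points(y_true, y_score)
print(points)
print(f"AUC = {auc(points):.3f}")
```

An AUC of 0.5 corresponds to a classifier that ranks positives and negatives no better than chance, which is the value reported for the baseline pipeline in the raw results.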

Confusion Matrix¶

For binary or multiclass classification, you can view a confusion matrix of the classifier's predictions:

[10]:

y_pred = pipeline.predict(X)
evalml.pipelines.graph_utils.graph_confusion_matrix(y, y_pred)
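
The matrix itself is a table of counts of (actual, predicted) label pairs. A minimal sketch with made-up binary labels follows; the `confusion_matrix` helper here is hypothetical and simply illustrates the idea, along with how the F1 objective used in this search follows from the counts.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels=(0, 1)):
    """Rows are actual labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(actual, predicted)] for predicted in labels]
            for actual in labels]


# Toy data (hypothetical).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

matrix = confusion_matrix(y_true, y_pred)
(tn, fp), (fn, tp) = matrix

# F1 is the harmonic mean of precision and recall.
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(matrix)  # [[3, 1], [1, 3]]
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```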

Access raw results¶

You can also get access to all the underlying data, like this:

[11]:

automl.results

[11]:

{'pipeline_results': {0: {'id': 0,
   'pipeline_name': 'Mode Baseline Binary Classification Pipeline',
   'pipeline_class': evalml.pipelines.classification.baseline_binary.ModeBaselineBinaryPipeline,
   'pipeline_summary': 'Baseline Classifier',
   'parameters': {'Baseline Classifier': {'strategy': 'random_weighted'}},
   'score': 0.7710601157203097,
   'high_variance_cv': False,
   'training_time': 0.03474116325378418,
   'cv_data': [{'all_objective_scores': OrderedDict([('F1',
                   0.7702265372168284),
                  ('Accuracy Binary', 0.6263157894736842),
                  ('Balanced Accuracy Binary', 0.5),
                  ('Precision', 0.6263157894736842),
                  ('AUC', 0.5),
                  ('Log Loss Binary', 0.6608932451679239),
                  ('MCC Binary', 0.0),
                  ('# Training', 379),
                  ('# Testing', 190)]),
     'score': 0.7702265372168284,
     'binary_classification_threshold': 0.5},
    {'all_objective_scores': OrderedDict([('F1', 0.7702265372168284),
                  ('Accuracy Binary', 0.6263157894736842),
                  ('Balanced Accuracy Binary', 0.5),
                  ('Precision', 0.6263157894736842),
                  ('AUC', 0.5),
                  ('Log Loss Binary', 0.6608932451679239),
                  ('MCC Binary', 0.0),
                  ('# Training', 379),
                  ('# Testing', 190)]),
     'score': 0.7702265372168284,
     'binary_classification_threshold': 0.5},
    {'all_objective_scores': OrderedDict([('F1', 0.7727272727272727),
                  ('Accuracy Binary', 0.6296296296296297),
                  ('Balanced Accuracy Binary', 0.5),
                  ('Precision', 0.6296296296296297),
                  ('AUC', 0.5),
                  ('Log Loss Binary', 0.6591759924082952),
                  ('MCC Binary', 0.0),
                  ('# Training', 380),
                  ('# Testing', 189)]),
     'score': 0.7727272727272727,
     'binary_classification_threshold': 0.5}]},
  1: {'id': 1,
   'pipeline_name': 'CatBoost Classifier w/ Simple Imputer',
   'pipeline_class': evalml.pipelines.utils.make_pipeline.<locals>.GeneratedPipeline,
   'pipeline_summary': 'CatBoost Classifier w/ Simple Imputer',
   'parameters': {'Simple Imputer': {'impute_strategy': 'most_frequent',
     'fill_value': None},
    'CatBoost Classifier': {'n_estimators': 1000,
     'eta': 0.03,
     'max_depth': 6,
     'bootstrap_type': None}},
   'score': 0.9763329621163277,
   'high_variance_cv': False,
   'training_time': 22.883776903152466,
   'cv_data': [{'all_objective_scores': OrderedDict([('F1',
                   0.9669421487603305),
                  ('Accuracy Binary', 0.9578947368421052),
                  ('Balanced Accuracy Binary', 0.9493431175287016),
                  ('Precision', 0.9512195121951219),
                  ('AUC', 0.9945555687063559),
                  ('Log Loss Binary', 0.10583268649418161),
                  ('MCC Binary', 0.909956827190137),
                  ('# Training', 379),
                  ('# Testing', 190)]),
     'score': 0.9669421487603305,
     'binary_classification_threshold': 0.5},
    {'all_objective_scores': OrderedDict([('F1', 0.9833333333333334),
                  ('Accuracy Binary', 0.9789473684210527),
                  ('Balanced Accuracy Binary', 0.9746715587643509),
                  ('Precision', 0.9752066115702479),
                  ('AUC', 0.9943188543022844),
                  ('Log Loss Binary', 0.08186397218927995),
                  ('MCC Binary', 0.955011564828661),
                  ('# Training', 379),
                  ('# Testing', 190)]),
     'score': 0.9833333333333334,
     'binary_classification_threshold': 0.5},
    {'all_objective_scores': OrderedDict([('F1', 0.9787234042553192),
                  ('Accuracy Binary', 0.9735449735449735),
                  ('Balanced Accuracy Binary', 0.9760504201680673),
                  ('Precision', 0.9913793103448276),
                  ('AUC', 0.9899159663865547),
                  ('Log Loss Binary', 0.09296190874649334),
                  ('MCC Binary', 0.9443109474170326),
                  ('# Training', 380),
                  ('# Testing', 189)]),
     'score': 0.9787234042553192,
     'binary_classification_threshold': 0.5}]},
  2: {'id': 2,
   'pipeline_name': 'Logistic Regression Classifier w/ Simple Imputer + Standard Scaler',
   'pipeline_class': evalml.pipelines.utils.make_pipeline.<locals>.GeneratedPipeline,
   'pipeline_summary': 'Logistic Regression Classifier w/ Simple Imputer + Standard Scaler',
   'parameters': {'Simple Imputer': {'impute_strategy': 'most_frequent',
     'fill_value': None},
    'Logistic Regression Classifier': {'penalty': 'l2',
     'C': 1.0,
     'n_jobs': -1}},
   'score': 0.9804468499596668,
   'high_variance_cv': False,
   'training_time': 1.0504083633422852,
   'cv_data': [{'all_objective_scores': OrderedDict([('F1',
                   0.9831932773109243),
                  ('Accuracy Binary', 0.9789473684210527),
                  ('Balanced Accuracy Binary', 0.9775121316132087),
                  ('Precision', 0.9831932773109243),
                  ('AUC', 0.9936087110900698),
                  ('Log Loss Binary', 0.09347817517438463),
                  ('MCC Binary', 0.9550242632264173),
                  ('# Training', 379),
                  ('# Testing', 190)]),
     'score': 0.9831932773109243,
     'binary_classification_threshold': 0.5},
    {'all_objective_scores': OrderedDict([('F1', 0.9794238683127572),
                  ('Accuracy Binary', 0.9736842105263158),
                  ('Balanced Accuracy Binary', 0.9647887323943662),
                  ('Precision', 0.9596774193548387),
                  ('AUC', 0.9975144987572493),
                  ('Log Loss Binary', 0.08320464479579018),
                  ('MCC Binary', 0.9445075449666159),
                  ('# Training', 379),
                  ('# Testing', 190)]),
     'score': 0.9794238683127572,
     'binary_classification_threshold': 0.5},
    {'all_objective_scores': OrderedDict([('F1', 0.9787234042553192),
                  ('Accuracy Binary', 0.9735449735449735),
                  ('Balanced Accuracy Binary', 0.9760504201680673),
                  ('Precision', 0.9913793103448276),
                  ('AUC', 0.9906362545018007),
                  ('Log Loss Binary', 0.09680859555948443),
                  ('MCC Binary', 0.9443109474170326),
                  ('# Training', 380),
                  ('# Testing', 189)]),
     'score': 0.9787234042553192,
     'binary_classification_threshold': 0.5}]},
  3: {'id': 3,
   'pipeline_name': 'Random Forest Classifier w/ Simple Imputer',
   'pipeline_class': evalml.pipelines.utils.make_pipeline.<locals>.GeneratedPipeline,
   'pipeline_summary': 'Random Forest Classifier w/ Simple Imputer',
   'parameters': {'Simple Imputer': {'impute_strategy': 'most_frequent',
     'fill_value': None},
    'Random Forest Classifier': {'n_estimators': 100,
     'max_depth': 6,
     'n_jobs': -1}},
   'score': 0.9666291581686476,
   'high_variance_cv': False,
   'training_time': 1.4155998229980469,
   'cv_data': [{'all_objective_scores': OrderedDict([('F1',
                   0.9543568464730291),
                  ('Accuracy Binary', 0.9421052631578948),
                  ('Balanced Accuracy Binary', 0.9338975026630371),
                  ('Precision', 0.9426229508196722),
                  ('AUC', 0.9893478518167831),
                  ('Log Loss Binary', 0.13984688783161608),
                  ('MCC Binary', 0.8757606542930872),
                  ('# Training', 379),
                  ('# Testing', 190)]),
     'score': 0.9543568464730291,
     'binary_classification_threshold': 0.5},
    {'all_objective_scores': OrderedDict([('F1', 0.9709543568464729),
                  ('Accuracy Binary', 0.9631578947368421),
                  ('Balanced Accuracy Binary', 0.9563853710498283),
                  ('Precision', 0.9590163934426229),
                  ('AUC', 0.989347851816783),
                  ('Log Loss Binary', 0.12010721015394274),
                  ('MCC Binary', 0.9211492315750531),
                  ('# Training', 379),
                  ('# Testing', 190)]),
     'score': 0.9709543568464729,
     'binary_classification_threshold': 0.5},
    {'all_objective_scores': OrderedDict([('F1', 0.9745762711864406),
                  ('Accuracy Binary', 0.9682539682539683),
                  ('Balanced Accuracy Binary', 0.9689075630252101),
                  ('Precision', 0.9829059829059829),
                  ('AUC', 0.9927971188475391),
                  ('Log Loss Binary', 0.10765634363120971),
                  ('MCC Binary', 0.9325680982740896),
                  ('# Training', 380),
                  ('# Testing', 189)]),
     'score': 0.9745762711864406,
     'binary_classification_threshold': 0.5}]},
  4: {'id': 4,
   'pipeline_name': 'XGBoost Classifier w/ Simple Imputer',
   'pipeline_class': evalml.pipelines.utils.make_pipeline.<locals>.GeneratedPipeline,
   'pipeline_summary': 'XGBoost Classifier w/ Simple Imputer',
   'parameters': {'Simple Imputer': {'impute_strategy': 'most_frequent',
     'fill_value': None},
    'XGBoost Classifier': {'eta': 0.1,
     'max_depth': 6,
     'min_child_weight': 1,
     'n_estimators': 100}},
   'score': 0.9705772481706093,
   'high_variance_cv': False,
   'training_time': 0.4369237422943115,
   'cv_data': [{'all_objective_scores': OrderedDict([('F1',
                   0.9666666666666667),
                  ('Accuracy Binary', 0.9578947368421052),
                  ('Balanced Accuracy Binary', 0.9521836903775595),
                  ('Precision', 0.9586776859504132),
                  ('AUC', 0.9915966386554622),
                  ('Log Loss Binary', 0.11449876085695762),
                  ('MCC Binary', 0.9097672817424011),
                  ('# Training', 379),
                  ('# Testing', 190)]),
     'score': 0.9666666666666667,
     'binary_classification_threshold': 0.5},
    {'all_objective_scores': OrderedDict([('F1', 0.979253112033195),
                  ('Accuracy Binary', 0.9736842105263158),
                  ('Balanced Accuracy Binary', 0.9676293052432241),
                  ('Precision', 0.9672131147540983),
                  ('AUC', 0.9959758551307847),
                  ('Log Loss Binary', 0.07421583775339011),
                  ('MCC Binary', 0.943843520216036),
                  ('# Training', 379),
                  ('# Testing', 190)]),
     'score': 0.979253112033195,
     'binary_classification_threshold': 0.5},
    {'all_objective_scores': OrderedDict([('F1', 0.9658119658119659),
                  ('Accuracy Binary', 0.9576719576719577),
                  ('Balanced Accuracy Binary', 0.9605042016806722),
                  ('Precision', 0.9826086956521739),
                  ('AUC', 0.9885954381752701),
                  ('Log Loss Binary', 0.11418110851220609),
                  ('MCC Binary', 0.9112159507396058),
                  ('# Training', 380),
                  ('# Testing', 189)]),
     'score': 0.9658119658119659,
     'binary_classification_threshold': 0.5}]}},
 'search_order': [0, 1, 2, 3, 4]}
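
The nested structure can be walked with ordinary dictionary access. The sketch below uses a trimmed excerpt of the output above (two pipelines, with only a few keys kept) so that it runs on its own; with a real search you would start from `automl.results` instead. In this output, each pipeline's top-level `score` matches the mean of its per-fold scores, which the loop checks.

```python
import math

# Trimmed excerpt of the results shown above; most keys are omitted.
results = {
    "pipeline_results": {
        0: {"id": 0,
            "pipeline_name": "Mode Baseline Binary Classification Pipeline",
            "score": 0.7710601157203097,
            "cv_data": [{"score": 0.7702265372168284},
                        {"score": 0.7702265372168284},
                        {"score": 0.7727272727272727}]},
        2: {"id": 2,
            "pipeline_name": "Logistic Regression Classifier w/ Simple Imputer + Standard Scaler",
            "score": 0.9804468499596668,
            "cv_data": [{"score": 0.9831932773109243},
                        {"score": 0.9794238683127572},
                        {"score": 0.9787234042553192}]},
    },
    "search_order": [0, 2],  # the full output lists [0, 1, 2, 3, 4]
}

fold_means = {}
for pipeline_id in results["search_order"]:
    entry = results["pipeline_results"][pipeline_id]
    fold_scores = [fold["score"] for fold in entry["cv_data"]]
    fold_means[pipeline_id] = sum(fold_scores) / len(fold_scores)
    matches = math.isclose(fold_means[pipeline_id], entry["score"], rel_tol=1e-9)
    print(f"{entry['pipeline_name']}: score={entry['score']:.4f} "
          f"(mean of folds matches: {matches})")

# Highest score wins here because F1 is a "greater is better" objective.
best_id = max(results["pipeline_results"],
              key=lambda pid: results["pipeline_results"][pid]["score"])
print("best id in excerpt:", best_id)
```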