Exploring search results¶
After finishing a pipeline search, we can inspect the results. First, let's run a search over 5 different pipelines to explore.
[1]:
import evalml
from evalml import AutoClassificationSearch
X, y = evalml.demos.load_breast_cancer()
automl = AutoClassificationSearch(objective="f1",
max_pipelines=5)
automl.search(X, y)
*****************************
* Beginning pipeline search *
*****************************
Optimizing for F1.
Greater score is better.
Searching up to 5 pipelines.
Allowed model families: catboost, xgboost, random_forest, linear_model
✔ Mode Baseline Binary Classification... 0%| | Elapsed:00:00
✔ Cat Boost Binary Classification Pip... 20%|██ | Elapsed:00:20
✔ Logistic Regression Binary Pipeline: 40%|████ | Elapsed:00:22
✔ Random Forest Binary Classification... 60%|██████ | Elapsed:00:23
✔ XGBoost Binary Classification Pipel... 80%|████████ | Elapsed:00:24
✔ Optimization finished 80%|████████ | Elapsed:00:24
View Rankings¶
A summary of all the pipelines built can be returned as a pandas DataFrame. It is sorted by score. EvalML knows based on our objective function whether higher or lower is better.
[2]:
automl.rankings
[2]:
| | id | pipeline_name | score | high_variance_cv | parameters |
|---|---|---|---|---|---|
| 0 | 2 | Logistic Regression Binary Pipeline | 0.982019 | False | {'One Hot Encoder': {'top_n': 10}, 'Simple Imp... |
| 1 | 1 | Cat Boost Binary Classification Pipeline | 0.976169 | False | {'Simple Imputer': {'impute_strategy': 'most_f... |
| 2 | 4 | XGBoost Binary Classification Pipeline | 0.970716 | False | {'One Hot Encoder': {'top_n': 10}, 'Simple Imp... |
| 3 | 3 | Random Forest Binary Classification Pipeline | 0.968074 | False | {'One Hot Encoder': {'top_n': 10}, 'Simple Imp... |
| 4 | 0 | Mode Baseline Binary Classification Pipeline | 0.771060 | False | {'strategy': 'random_weighted'} |
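Since the rankings object is a standard pandas DataFrame, it can be filtered and sorted like any other. A minimal sketch, using a small mock DataFrame with the same columns and illustrative values (the real one comes from `automl.rankings`):

```python
import pandas as pd

# Mock rankings with the same columns as automl.rankings (illustrative values)
rankings = pd.DataFrame({
    "id": [2, 1, 4],
    "pipeline_name": [
        "Logistic Regression Binary Pipeline",
        "Cat Boost Binary Classification Pipeline",
        "XGBoost Binary Classification Pipeline",
    ],
    "score": [0.982019, 0.976169, 0.970716],
    "high_variance_cv": [False, False, True],
})

# Keep only pipelines whose CV scores were stable across folds
stable = rankings[~rankings["high_variance_cv"]]

# The rankings are already sorted by score, so the first stable row is the best one
best_id = int(stable.iloc[0]["id"])
print(best_id)  # → 2
```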
Describe Pipeline¶
Each pipeline is given an id. We can get more information about any particular pipeline using that id. Here, we will get more information about the pipeline with id = 1.
[3]:
automl.describe_pipeline(1)
********************************************
* Cat Boost Binary Classification Pipeline *
********************************************
Problem Type: Binary Classification
Model Family: CatBoost
Number of features: 30
Pipeline Steps
==============
1. Simple Imputer
* impute_strategy : most_frequent
* fill_value : None
2. CatBoost Classifier
* n_estimators : 1000
* eta : 0.03
* max_depth : 6
Training
========
Training for Binary Classification problems.
Total training time (including CV): 20.8 seconds
Cross Validation
----------------
F1 Accuracy Binary Balanced Accuracy Binary Precision AUC Log Loss Binary MCC Binary # Training # Testing
0 0.962 0.953 0.954 0.974 0.987 0.148 0.900 379.000 190.000
1 0.983 0.979 0.972 0.967 0.995 0.085 0.955 379.000 190.000
2 0.983 0.979 0.974 0.975 0.997 0.067 0.955 380.000 189.000
mean 0.976 0.970 0.967 0.972 0.993 0.100 0.937 - -
std 0.013 0.015 0.011 0.004 0.005 0.043 0.032 - -
coef of var 0.013 0.016 0.012 0.004 0.005 0.427 0.034 - -
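The mean, std, and coef of var rows in the cross-validation table are simple aggregates of the per-fold scores. For instance, the F1 summary for this pipeline can be reproduced from the three fold scores with the standard library:

```python
import statistics

# Per-fold F1 scores for the CatBoost pipeline, taken from the CV table above
f1_folds = [0.9617021276595743, 0.9834710743801653, 0.9833333333333334]

mean = statistics.mean(f1_folds)
std = statistics.stdev(f1_folds)   # sample standard deviation across folds
coef_of_var = std / mean           # relative spread, used to flag high-variance CV

print(round(mean, 3), round(std, 3), round(coef_of_var, 3))  # → 0.976 0.013 0.013
```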
Get Pipeline¶
We can also fetch any pipeline object by its id:
[4]:
automl.get_pipeline(1)
[4]:
<evalml.pipelines.classification.catboost_binary.CatBoostBinaryClassificationPipeline at 0x7f4664ec53c8>
Get best pipeline¶
If we specifically want the best pipeline, there is a convenient accessor:
[5]:
automl.best_pipeline
[5]:
<evalml.pipelines.classification.logistic_regression_binary.LogisticRegressionBinaryPipeline at 0x7f4664f18710>
Feature Importances¶
We can get the feature importances of the resulting pipeline.
[6]:
pipeline = automl.get_pipeline(1)
pipeline.feature_importances
[6]:
| | feature | importance |
|---|---|---|
| 0 | worst texture | 11.023433 |
| 1 | worst area | 9.133809 |
| 2 | worst radius | 8.412493 |
| 3 | mean concave points | 8.321510 |
| 4 | worst concave points | 7.129320 |
| 5 | mean texture | 6.039252 |
| 6 | worst perimeter | 5.919564 |
| 7 | worst concavity | 5.786680 |
| 8 | worst smoothness | 3.957557 |
| 9 | area error | 3.534828 |
| 10 | worst symmetry | 3.071672 |
| 11 | radius error | 2.783052 |
| 12 | mean concavity | 2.629071 |
| 13 | compactness error | 2.393736 |
| 14 | perimeter error | 1.716863 |
| 15 | mean compactness | 1.635428 |
| 16 | worst compactness | 1.599189 |
| 17 | smoothness error | 1.535518 |
| 18 | concave points error | 1.481802 |
| 19 | mean smoothness | 1.470453 |
| 20 | mean radius | 1.321665 |
| 21 | texture error | 1.298512 |
| 22 | mean symmetry | 1.240927 |
| 23 | mean area | 1.228987 |
| 24 | mean perimeter | 1.076955 |
| 25 | concavity error | 0.982101 |
| 26 | worst fractal dimension | 0.967438 |
| 27 | mean fractal dimension | 0.826398 |
| 28 | fractal dimension error | 0.823615 |
| 29 | symmetry error | 0.658173 |
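To pull out, say, the three most important features programmatically, the (feature, importance) pairs can be sorted directly. A sketch using a handful of values from the table above; the real pairs come from `pipeline.feature_importances`:

```python
# A few (feature, importance) pairs from the table above
importances = [
    ("mean texture", 6.039252),
    ("worst texture", 11.023433),
    ("worst area", 9.133809),
    ("worst radius", 8.412493),
    ("mean concave points", 8.321510),
    ("worst concave points", 7.129320),
]

# Top 3 features by importance, largest first
top3 = sorted(importances, key=lambda kv: kv[1], reverse=True)[:3]
print([name for name, _ in top3])  # → ['worst texture', 'worst area', 'worst radius']
```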
We can also create a bar plot of the feature importances.
[7]:
pipeline.graph_feature_importance()
Precision-Recall Curve¶
For binary classification, you can view the precision-recall curve of a classifier.
[8]:
# get the predicted probabilities associated with the "true" label
y_pred_proba = pipeline.predict_proba(X)[:, 1]
evalml.pipelines.graph_utils.graph_precision_recall_curve(y, y_pred_proba)
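Under the hood, a precision-recall curve sweeps a decision threshold over the predicted probabilities and computes precision and recall at each one. A minimal sketch of a single point on that curve, using mock labels and probabilities (the real inputs are `y` and `y_pred_proba` above):

```python
# Mock ground truth and predicted probabilities for the positive class
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_proba = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

# Binarize at one threshold; the curve repeats this for every distinct probability
threshold = 0.5
y_pred = [1 if p >= threshold else 0 for p in y_proba]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

precision = tp / (tp + fp)  # of predicted positives, how many were correct
recall = tp / (tp + fn)     # of actual positives, how many were found
print(precision, recall)  # → 0.75 0.75
```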
ROC Curve¶
For binary classification, you can view the ROC curve of a classifier.
[9]:
# get the predicted probabilities associated with the "true" label
y_pred_proba = pipeline.predict_proba(X)[:, 1]
evalml.pipelines.graph_utils.graph_roc_curve(y, y_pred_proba)
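The AUC values reported in the tables above have an intuitive reading: AUC is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. A minimal sketch of that pairwise computation on mock data (the real inputs are `y` and `y_pred_proba` above):

```python
# Mock labels and predicted probabilities for the positive class
y_true = [1, 0, 1, 0]
y_proba = [0.9, 0.7, 0.65, 0.5]

# AUC as the fraction of (positive, negative) pairs ranked correctly,
# counting ties as half-correct
pos = [s for t, s in zip(y_true, y_proba) if t == 1]
neg = [s for t, s in zip(y_true, y_proba) if t == 0]
pairs = [(p, n) for p in pos for n in neg]
auc = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in pairs) / len(pairs)
print(auc)  # → 0.75
```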
Confusion Matrix¶
For binary or multiclass classification, you can view a confusion matrix of the classifier's predictions.
[10]:
y_pred = pipeline.predict(X)
evalml.pipelines.graph_utils.graph_confusion_matrix(y, y_pred)
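What the confusion matrix plot displays is simply a count of each (actual, predicted) label pair. A minimal sketch with mock binary labels (the real inputs are `y` and `y_pred` above):

```python
from collections import Counter

# Mock actual and predicted binary labels
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0]

# Count each (actual, predicted) pair
counts = Counter(zip(y_true, y_pred))

# 2x2 confusion matrix: rows = actual class, columns = predicted class
matrix = [[counts[(actual, predicted)] for predicted in (0, 1)] for actual in (0, 1)]
print(matrix)  # → [[3, 1], [1, 3]]
```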
Access raw results¶
You can also get access to all the underlying data, like this:
[11]:
automl.results
[11]:
{'pipeline_results': {0: {'id': 0,
'pipeline_name': 'Mode Baseline Binary Classification Pipeline',
'pipeline_class': evalml.pipelines.classification.baseline_binary.ModeBaselineBinaryPipeline,
'pipeline_summary': 'Baseline Classifier',
'parameters': {'strategy': 'random_weighted'},
'score': 0.7710601157203097,
'high_variance_cv': False,
'training_time': 0.03633546829223633,
'cv_data': [{'all_objective_scores': OrderedDict([('F1',
0.7702265372168284),
('Accuracy Binary', 0.6263157894736842),
('Balanced Accuracy Binary', 0.5),
('Precision', 0.6263157894736842),
('AUC', 0.5),
('Log Loss Binary', 0.6608932451679239),
('MCC Binary', 0.0),
('# Training', 379),
('# Testing', 190)]),
'score': 0.7702265372168284},
{'all_objective_scores': OrderedDict([('F1', 0.7702265372168284),
('Accuracy Binary', 0.6263157894736842),
('Balanced Accuracy Binary', 0.5),
('Precision', 0.6263157894736842),
('AUC', 0.5),
('Log Loss Binary', 0.6608932451679239),
('MCC Binary', 0.0),
('# Training', 379),
('# Testing', 190)]),
'score': 0.7702265372168284},
{'all_objective_scores': OrderedDict([('F1', 0.7727272727272727),
('Accuracy Binary', 0.6296296296296297),
('Balanced Accuracy Binary', 0.5),
('Precision', 0.6296296296296297),
('AUC', 0.5),
('Log Loss Binary', 0.6591759924082954),
('MCC Binary', 0.0),
('# Training', 380),
('# Testing', 189)]),
'score': 0.7727272727272727}]},
1: {'id': 1,
'pipeline_name': 'Cat Boost Binary Classification Pipeline',
'pipeline_class': evalml.pipelines.classification.catboost_binary.CatBoostBinaryClassificationPipeline,
'pipeline_summary': 'CatBoost Classifier w/ Simple Imputer',
'parameters': {'Simple Imputer': {'impute_strategy': 'most_frequent',
'fill_value': None},
'CatBoost Classifier': {'n_estimators': 1000,
'eta': 0.03,
'max_depth': 6}},
'score': 0.9761688451243576,
'high_variance_cv': False,
'training_time': 20.784214735031128,
'cv_data': [{'all_objective_scores': OrderedDict([('F1',
0.9617021276595743),
('Accuracy Binary', 0.9526315789473684),
('Balanced Accuracy Binary', 0.9536631554030062),
('Precision', 0.9741379310344828),
('AUC', 0.9874541365842111),
('Log Loss Binary', 0.14774257954380435),
('MCC Binary', 0.9001633057441626),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9617021276595743},
{'all_objective_scores': OrderedDict([('F1', 0.9834710743801653),
('Accuracy Binary', 0.9789473684210527),
('Balanced Accuracy Binary', 0.971830985915493),
('Precision', 0.967479674796748),
('AUC', 0.9946739259083914),
('Log Loss Binary', 0.08460273019201768),
('MCC Binary', 0.9554966130892879),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9834710743801653},
{'all_objective_scores': OrderedDict([('F1', 0.9833333333333334),
('Accuracy Binary', 0.9788359788359788),
('Balanced Accuracy Binary', 0.9743697478991598),
('Precision', 0.9752066115702479),
('AUC', 0.9973589435774309),
('Log Loss Binary', 0.06679970055788743),
('MCC Binary', 0.9546019995535027),
('# Training', 380),
('# Testing', 189)]),
'score': 0.9833333333333334}]},
2: {'id': 2,
'pipeline_name': 'Logistic Regression Binary Pipeline',
'pipeline_class': evalml.pipelines.classification.logistic_regression_binary.LogisticRegressionBinaryPipeline,
'pipeline_summary': 'Logistic Regression Classifier w/ One Hot Encoder + Simple Imputer + Standard Scaler',
'parameters': {'One Hot Encoder': {'top_n': 10},
'Simple Imputer': {'impute_strategy': 'most_frequent', 'fill_value': None},
'Logistic Regression Classifier': {'penalty': 'l2', 'C': 1.0}},
'score': 0.982018787419635,
'high_variance_cv': False,
'training_time': 1.3472530841827393,
'cv_data': [{'all_objective_scores': OrderedDict([('F1', 0.979253112033195),
('Accuracy Binary', 0.9736842105263158),
('Balanced Accuracy Binary', 0.9676293052432241),
('Precision', 0.9672131147540983),
('AUC', 0.9906497810391762),
('Log Loss Binary', 0.09825657399977614),
('MCC Binary', 0.943843520216036),
('# Training', 379),
('# Testing', 190)]),
'score': 0.979253112033195},
{'all_objective_scores': OrderedDict([('F1', 0.9752066115702479),
('Accuracy Binary', 0.968421052631579),
('Balanced Accuracy Binary', 0.9605870517220974),
('Precision', 0.959349593495935),
('AUC', 0.9988164279796425),
('Log Loss Binary', 0.05792932780492265),
('MCC Binary', 0.9327267201397125),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9752066115702479},
{'all_objective_scores': OrderedDict([('F1', 0.9915966386554622),
('Accuracy Binary', 0.9894179894179894),
('Balanced Accuracy Binary', 0.988655462184874),
('Precision', 0.9915966386554622),
('AUC', 0.9968787515006002),
('Log Loss Binary', 0.06446799374034665),
('MCC Binary', 0.9773109243697479),
('# Training', 380),
('# Testing', 189)]),
'score': 0.9915966386554622}]},
3: {'id': 3,
'pipeline_name': 'Random Forest Binary Classification Pipeline',
'pipeline_class': evalml.pipelines.classification.random_forest_binary.RFBinaryClassificationPipeline,
'pipeline_summary': 'Random Forest Classifier w/ One Hot Encoder + Simple Imputer',
'parameters': {'One Hot Encoder': {'top_n': 10},
'Simple Imputer': {'impute_strategy': 'most_frequent', 'fill_value': None},
'Random Forest Classifier': {'n_estimators': 100, 'max_depth': 6}},
'score': 0.9680735152717395,
'high_variance_cv': False,
'training_time': 1.7742538452148438,
'cv_data': [{'all_objective_scores': OrderedDict([('F1',
0.9617021276595743),
('Accuracy Binary', 0.9526315789473684),
('Balanced Accuracy Binary', 0.9536631554030062),
('Precision', 0.9741379310344828),
('AUC', 0.9844952065333176),
('Log Loss Binary', 0.15369056167628),
('MCC Binary', 0.9001633057441626),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9617021276595743},
{'all_objective_scores': OrderedDict([('F1', 0.963265306122449),
('Accuracy Binary', 0.9526315789473684),
('Balanced Accuracy Binary', 0.9394602911587171),
('Precision', 0.9365079365079365),
('AUC', 0.9908273168422297),
('Log Loss Binary', 0.12245669921123793),
('MCC Binary', 0.8996571384709533),
('# Training', 379),
('# Testing', 190)]),
'score': 0.963265306122449},
{'all_objective_scores': OrderedDict([('F1', 0.979253112033195),
('Accuracy Binary', 0.9735449735449735),
('Balanced Accuracy Binary', 0.9672268907563025),
('Precision', 0.9672131147540983),
('AUC', 0.9975990396158464),
('Log Loss Binary', 0.11890545454349591),
('MCC Binary', 0.9433286178446474),
('# Training', 380),
('# Testing', 189)]),
'score': 0.979253112033195}]},
4: {'id': 4,
'pipeline_name': 'XGBoost Binary Classification Pipeline',
'pipeline_class': evalml.pipelines.classification.xgboost_binary.XGBoostBinaryPipeline,
'pipeline_summary': 'XGBoost Classifier w/ One Hot Encoder + Simple Imputer',
'parameters': {'One Hot Encoder': {'top_n': 10},
'Simple Imputer': {'impute_strategy': 'most_frequent', 'fill_value': None},
'XGBoost Classifier': {'eta': 0.1,
'max_depth': 6,
'min_child_weight': 1,
'n_estimators': 100}},
'score': 0.9707162184435432,
'high_variance_cv': False,
'training_time': 0.7069911956787109,
'cv_data': [{'all_objective_scores': OrderedDict([('F1',
0.9617021276595743),
('Accuracy Binary', 0.9526315789473684),
('Balanced Accuracy Binary', 0.9536631554030062),
('Precision', 0.9741379310344828),
('AUC', 0.9863889217658894),
('Log Loss Binary', 0.16201562031423428),
('MCC Binary', 0.9001633057441626),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9617021276595743},
{'all_objective_scores': OrderedDict([('F1', 0.9711934156378601),
('Accuracy Binary', 0.9631578947368421),
('Balanced Accuracy Binary', 0.9535447982009706),
('Precision', 0.9516129032258065),
('AUC', 0.9945555687063559),
('Log Loss Binary', 0.080714067422454),
('MCC Binary', 0.9216584956231404),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9711934156378601},
{'all_objective_scores': OrderedDict([('F1', 0.979253112033195),
('Accuracy Binary', 0.9735449735449735),
('Balanced Accuracy Binary', 0.9672268907563025),
('Precision', 0.9672131147540983),
('AUC', 0.9971188475390156),
('Log Loss Binary', 0.07802530307330131),
('MCC Binary', 0.9433286178446474),
('# Training', 380),
('# Testing', 189)]),
'score': 0.979253112033195}]}},
'search_order': [0, 1, 2, 3, 4]}
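Because `automl.results` is a plain nested dict, it can be traversed directly, for example to collect each pipeline's per-fold scores. A sketch over a small mock dict with the same nesting (and truncated values); the real one comes from `automl.results`:

```python
# Mock results dict mirroring the structure of automl.results (truncated values)
results = {
    "pipeline_results": {
        0: {"pipeline_name": "Mode Baseline Binary Classification Pipeline",
            "cv_data": [{"score": 0.770}, {"score": 0.770}, {"score": 0.773}]},
        1: {"pipeline_name": "Cat Boost Binary Classification Pipeline",
            "cv_data": [{"score": 0.962}, {"score": 0.983}, {"score": 0.983}]},
    },
    "search_order": [0, 1],
}

# Map each pipeline name to its per-fold scores, in search order
fold_scores = {
    results["pipeline_results"][pid]["pipeline_name"]:
        [fold["score"] for fold in results["pipeline_results"][pid]["cv_data"]]
    for pid in results["search_order"]
}
print(fold_scores["Cat Boost Binary Classification Pipeline"])  # → [0.962, 0.983, 0.983]
```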