Exploring search results

After a pipeline search has finished, we can inspect the results. First, let's run a search over 10 different pipelines to explore.

[1]:
import evalml

X, y = evalml.demos.load_breast_cancer()

clf = evalml.AutoClassifier(objective="f1",
                            max_pipelines=10)

clf.fit(X, y)
*****************************
* Beginning pipeline search *
*****************************

Optimizing for F1. Greater score is better.

Searching up to 10 pipelines.
Possible model types: xgboost, random_forest, linear_model

✔ XGBoost Classifier w/ One Hot Encod...     0%|          | Elapsed:00:00
✔ XGBoost Classifier w/ One Hot Encod...    10%|█         | Elapsed:00:00
✔ Random Forest Classifier w/ One Hot...    20%|██        | Elapsed:00:06
✔ XGBoost Classifier w/ One Hot Encod...    30%|███       | Elapsed:00:06
✔ Logistic Regression Classifier w/ O...    40%|████      | Elapsed:00:14
✔ XGBoost Classifier w/ One Hot Encod...    50%|█████     | Elapsed:00:14
✔ Logistic Regression Classifier w/ O...    60%|██████    | Elapsed:00:21
✔ XGBoost Classifier w/ One Hot Encod...    70%|███████   | Elapsed:00:22
✔ Logistic Regression Classifier w/ O...    80%|████████  | Elapsed:00:29
✔ Logistic Regression Classifier w/ O...    90%|█████████ | Elapsed:00:37
✔ Logistic Regression Classifier w/ O...   100%|██████████| Elapsed:00:37

✔ Optimization finished

View Rankings

A summary of all the pipelines built can be returned as a dataframe, sorted by score. Based on your objective function, EvalML knows whether a higher or lower score is better.

[2]:
clf.rankings
[2]:
id pipeline_name score high_variance_cv parameters
0 8 LogisticRegressionPipeline 0.980527 False {'penalty': 'l2', 'C': 0.5765626434012575, 'im...
1 6 LogisticRegressionPipeline 0.974853 False {'penalty': 'l2', 'C': 6.239401330891865, 'imp...
2 9 LogisticRegressionPipeline 0.974853 False {'penalty': 'l2', 'C': 8.123565600467177, 'imp...
3 4 LogisticRegressionPipeline 0.973411 False {'penalty': 'l2', 'C': 8.444214828324364, 'imp...
4 1 XGBoostPipeline 0.970626 False {'eta': 0.38438170729269994, 'min_child_weight...
5 2 RFClassificationPipeline 0.966846 False {'n_estimators': 569, 'max_depth': 22, 'impute...
6 5 XGBoostPipeline 0.966592 False {'eta': 0.6481718720511973, 'min_child_weight'...
7 0 XGBoostPipeline 0.965192 False {'eta': 0.5928446182250184, 'min_child_weight'...
8 7 XGBoostPipeline 0.963913 False {'eta': 0.9786183422327642, 'min_child_weight'...
9 3 XGBoostPipeline 0.952237 False {'eta': 0.5288949197529046, 'min_child_weight'...
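The rankings are returned as a pandas DataFrame, so the usual DataFrame operations apply. As a sketch, using a small hand-built stand-in for the table above (not the live `clf.rankings` object), you could filter the rankings down to one model family:

```python
import pandas as pd

# Stand-in for clf.rankings, built from a few rows of the table above
rankings = pd.DataFrame({
    "id": [8, 6, 1, 2],
    "pipeline_name": ["LogisticRegressionPipeline", "LogisticRegressionPipeline",
                      "XGBoostPipeline", "RFClassificationPipeline"],
    "score": [0.980527, 0.974853, 0.970626, 0.966846],
})

# Keep only the XGBoost pipelines, already sorted best-first
xgb = rankings[rankings["pipeline_name"] == "XGBoostPipeline"]
print(xgb[["id", "score"]])
```

With the real `clf.rankings` you would filter the returned DataFrame the same way.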

Describe Pipeline

Each pipeline is given an id. We can get more information about any particular pipeline using that id.

[3]:
clf.describe_pipeline(0)
********************************************************************************************
* XGBoost Classifier w/ One Hot Encoder + Simple Imputer + RF Classifier Select From Model *
********************************************************************************************

Problem Types: Binary Classification, Multiclass Classification
Model Type: XGBoost Classifier
Objective to Optimize: F1 (greater is better)
Number of features: 18

Pipeline Steps
==============
1. One Hot Encoder
2. Simple Imputer
         * impute_strategy : most_frequent
3. RF Classifier Select From Model
         * percent_features : 0.6273280598181127
         * threshold : -inf
4. XGBoost Classifier
         * eta : 0.5928446182250184
         * max_depth : 4
         * min_child_weight : 8.598391737229157

Training
========
Training for Binary Classification problems.
Total training time (including CV): 0.2 seconds

Cross Validation
----------------
               F1  Precision  Recall   AUC  Log Loss   MCC # Training # Testing
0           0.950      0.935   0.950 0.985     0.154 0.864    379.000   190.000
1           0.975      0.959   0.975 0.996     0.102 0.933    379.000   190.000
2           0.970      0.991   0.970 0.983     0.137 0.923    380.000   189.000
mean        0.965      0.962   0.965 0.988     0.131 0.907          -         -
std         0.013      0.028   0.013 0.007     0.026 0.037          -         -
coef of var 0.014      0.029   0.014 0.007     0.202 0.041          -         -
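The "coef of var" row in the cross-validation table is the standard deviation divided by the mean. A quick check against the F1 column, using the three fold scores shown above:

```python
import statistics

f1_scores = [0.950, 0.975, 0.970]  # per-fold F1 from the table above

mean = statistics.mean(f1_scores)
std = statistics.stdev(f1_scores)  # sample standard deviation
coef_of_var = std / mean

# Matches the mean / std / coef of var rows for F1 above
print(round(mean, 3), round(std, 3), round(coef_of_var, 3))
```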

Get Pipeline

You can get the object for any pipeline as well.

[4]:
clf.get_pipeline(0)
[4]:
<evalml.pipelines.classification.xgboost.XGBoostPipeline at 0x135081990>

Get Best Pipeline

If you specifically want the best pipeline, there is a convenient accessor.

[5]:
clf.best_pipeline
[5]:
<evalml.pipelines.classification.logistic_regression.LogisticRegressionPipeline at 0x1372054d0>

Feature Importances

We can get the feature importances of the resulting pipeline.

[6]:
pipeline = clf.get_pipeline(0)
pipeline.feature_importances
[6]:
feature importance
0 22 0.407441
1 7 0.239457
2 27 0.120609
3 20 0.072031
4 23 0.052818
5 6 0.038344
6 1 0.033962
7 21 0.028949
8 4 0.003987
9 25 0.002403
10 0 0.000000
11 2 0.000000
12 3 0.000000
13 12 0.000000
14 13 0.000000
15 18 0.000000
16 19 0.000000
17 29 0.000000
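`feature_importances` is also returned as a DataFrame, so you can, for example, drop the features the model assigned zero importance. A sketch using a hand-built stand-in for a few rows of the output above:

```python
import pandas as pd

# Stand-in for pipeline.feature_importances, from a few rows above
importances = pd.DataFrame({
    "feature": [22, 7, 27, 0, 2],
    "importance": [0.407441, 0.239457, 0.120609, 0.0, 0.0],
})

# Keep only the features the model actually used
used = importances[importances["importance"] > 0]
print(used)
```

Note that the feature selection step (RF Classifier Select From Model) is why many features end up with zero importance: they were dropped before the final estimator was trained.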

Access raw results

You can also access all of the underlying data like this:

[7]:
clf.results
[7]:
{0: {'id': 0,
  'pipeline_name': 'XGBoostPipeline',
  'parameters': {'eta': 0.5928446182250184,
   'min_child_weight': 8.598391737229157,
   'max_depth': 4,
   'impute_strategy': 'most_frequent',
   'percent_features': 0.6273280598181127},
  'score': 0.9651923054186028,
  'high_variance_cv': False,
  'scores': [0.9504132231404958, 0.9752066115702479, 0.9699570815450643],
  'all_objective_scores': [OrderedDict([('F1', 0.9504132231404958),
                ('Precision', 0.9349593495934959),
                ('Recall', 0.9504132231404958),
                ('AUC', 0.984731920937389),
                ('Log Loss', 0.1536501646237938),
                ('MCC', 0.8644170412909863),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9752066115702479),
                ('Precision', 0.959349593495935),
                ('Recall', 0.9752066115702479),
                ('AUC', 0.9960350337318026),
                ('Log Loss', 0.10194972519713798),
                ('MCC', 0.9327267201397125),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9699570815450643),
                ('Precision', 0.9912280701754386),
                ('Recall', 0.9699570815450643),
                ('AUC', 0.983313325330132),
                ('Log Loss', 0.13664108953345075),
                ('MCC', 0.9231826763268304),
                ('# Training', 380),
                ('# Testing', 189)])],
  'training_time': 0.248244047164917},
 1: {'id': 1,
  'pipeline_name': 'XGBoostPipeline',
  'parameters': {'eta': 0.38438170729269994,
   'min_child_weight': 3.677811458900251,
   'max_depth': 13,
   'impute_strategy': 'median',
   'percent_features': 0.793807787701838},
  'score': 0.9706261399583499,
  'high_variance_cv': False,
  'scores': [0.9707112970711297, 0.9709543568464729, 0.9702127659574468],
  'all_objective_scores': [OrderedDict([('F1', 0.9707112970711297),
                ('Precision', 0.9666666666666667),
                ('Recall', 0.9707112970711297),
                ('AUC', 0.9917149958574978),
                ('Log Loss', 0.11573912222489813),
                ('MCC', 0.9211268105467613),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9709543568464729),
                ('Precision', 0.9590163934426229),
                ('Recall', 0.9709543568464729),
                ('AUC', 0.9969227127470707),
                ('Log Loss', 0.07704140599817037),
                ('MCC', 0.9211492315750531),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9702127659574468),
                ('Precision', 0.9827586206896551),
                ('Recall', 0.9702127659574468),
                ('AUC', 0.9857142857142858),
                ('Log Loss', 0.12628072744331484),
                ('MCC', 0.9218075091290715),
                ('# Training', 380),
                ('# Testing', 189)])],
  'training_time': 0.29195380210876465},
 2: {'id': 2,
  'pipeline_name': 'RFClassificationPipeline',
  'parameters': {'n_estimators': 569,
   'max_depth': 22,
   'impute_strategy': 'most_frequent',
   'percent_features': 0.8593661614465293},
  'score': 0.9668456397284798,
  'high_variance_cv': False,
  'scores': [0.9508196721311476, 0.979253112033195, 0.970464135021097],
  'all_objective_scores': [OrderedDict([('F1', 0.9508196721311476),
                ('Precision', 0.928),
                ('Recall', 0.9508196721311476),
                ('AUC', 0.9889336016096579),
                ('Log Loss', 0.1388421748025717),
                ('MCC', 0.8647724688764672),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.979253112033195),
                ('Precision', 0.9672131147540983),
                ('Recall', 0.979253112033195),
                ('AUC', 0.9898804592259438),
                ('Log Loss', 0.11232987225229708),
                ('MCC', 0.943843520216036),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.970464135021097),
                ('Precision', 0.9745762711864406),
                ('Recall', 0.970464135021097),
                ('AUC', 0.9906362545018007),
                ('Log Loss', 0.11575295379524118),
                ('MCC', 0.9208800271662652),
                ('# Training', 380),
                ('# Testing', 189)])],
  'training_time': 6.06977105140686},
 3: {'id': 3,
  'pipeline_name': 'XGBoostPipeline',
  'parameters': {'eta': 0.5288949197529046,
   'min_child_weight': 6.112401049845392,
   'max_depth': 6,
   'impute_strategy': 'most_frequent',
   'percent_features': 0.34402219881309576},
  'score': 0.9522372250281359,
  'high_variance_cv': False,
  'scores': [0.9367088607594938, 0.9672131147540983, 0.9527896995708156],
  'all_objective_scores': [OrderedDict([('F1', 0.9367088607594938),
                ('Precision', 0.940677966101695),
                ('Recall', 0.9367088607594938),
                ('AUC', 0.9821872410936205),
                ('Log Loss', 0.16857726289155453),
                ('MCC', 0.8318710075349047),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9672131147540983),
                ('Precision', 0.944),
                ('Recall', 0.9672131147540983),
                ('AUC', 0.9937270682921056),
                ('Log Loss', 0.10433676971098114),
                ('MCC', 0.9106361866954563),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9527896995708156),
                ('Precision', 0.9736842105263158),
                ('Recall', 0.9527896995708156),
                ('AUC', 0.9845138055222089),
                ('Log Loss', 0.14270813120701523),
                ('MCC', 0.8783921421654207),
                ('# Training', 380),
                ('# Testing', 189)])],
  'training_time': 0.20792675018310547},
 4: {'id': 4,
  'pipeline_name': 'LogisticRegressionPipeline',
  'parameters': {'penalty': 'l2',
   'C': 8.444214828324364,
   'impute_strategy': 'most_frequent'},
  'score': 0.9734109818152151,
  'high_variance_cv': False,
  'scores': [0.970464135021097, 0.9754098360655737, 0.9743589743589743],
  'all_objective_scores': [OrderedDict([('F1', 0.970464135021097),
                ('Precision', 0.9745762711864406),
                ('Recall', 0.970464135021097),
                ('AUC', 0.9885193514025328),
                ('Log Loss', 0.1943294590819038),
                ('MCC', 0.9215733295732883),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9754098360655737),
                ('Precision', 0.952),
                ('Recall', 0.9754098360655737),
                ('AUC', 0.9849686353414605),
                ('Log Loss', 0.1533799764176819),
                ('MCC', 0.933568045604951),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9743589743589743),
                ('Precision', 0.991304347826087),
                ('Recall', 0.9743589743589743),
                ('AUC', 0.990516206482593),
                ('Log Loss', 0.1164316714613053),
                ('MCC', 0.9336637889421326),
                ('# Training', 380),
                ('# Testing', 189)])],
  'training_time': 7.461816072463989},
 5: {'id': 5,
  'pipeline_name': 'XGBoostPipeline',
  'parameters': {'eta': 0.6481718720511973,
   'min_child_weight': 4.314173858564932,
   'max_depth': 6,
   'impute_strategy': 'most_frequent',
   'percent_features': 0.871312026764351},
  'score': 0.966592074666908,
  'high_variance_cv': False,
  'scores': [0.9543568464730291, 0.9752066115702479, 0.9702127659574468],
  'all_objective_scores': [OrderedDict([('F1', 0.9543568464730291),
                ('Precision', 0.9426229508196722),
                ('Recall', 0.9543568464730291),
                ('AUC', 0.9899396378269618),
                ('Log Loss', 0.12702225128151967),
                ('MCC', 0.8757606542930872),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9752066115702479),
                ('Precision', 0.959349593495935),
                ('Recall', 0.9752066115702479),
                ('AUC', 0.9965676411409634),
                ('Log Loss', 0.0801103590350402),
                ('MCC', 0.9327267201397125),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9702127659574468),
                ('Precision', 0.9827586206896551),
                ('Recall', 0.9702127659574468),
                ('AUC', 0.9858343337334934),
                ('Log Loss', 0.1270006743029361),
                ('MCC', 0.9218075091290715),
                ('# Training', 380),
                ('# Testing', 189)])],
  'training_time': 0.33750486373901367},
 6: {'id': 6,
  'pipeline_name': 'LogisticRegressionPipeline',
  'parameters': {'penalty': 'l2',
   'C': 6.239401330891865,
   'impute_strategy': 'median'},
  'score': 0.9748529087969783,
  'high_variance_cv': False,
  'scores': [0.9747899159663865, 0.9754098360655737, 0.9743589743589743],
  'all_objective_scores': [OrderedDict([('F1', 0.9747899159663865),
                ('Precision', 0.9747899159663865),
                ('Recall', 0.9747899159663865),
                ('AUC', 0.9889927802106758),
                ('Log Loss', 0.17491241567239438),
                ('MCC', 0.932536394839626),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9754098360655737),
                ('Precision', 0.952),
                ('Recall', 0.9754098360655737),
                ('AUC', 0.9870990649781038),
                ('Log Loss', 0.13982009938625542),
                ('MCC', 0.933568045604951),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9743589743589743),
                ('Precision', 0.991304347826087),
                ('Recall', 0.9743589743589743),
                ('AUC', 0.990516206482593),
                ('Log Loss', 0.1109645583402926),
                ('MCC', 0.9336637889421326),
                ('# Training', 380),
                ('# Testing', 189)])],
  'training_time': 7.343135118484497},
 7: {'id': 7,
  'pipeline_name': 'XGBoostPipeline',
  'parameters': {'eta': 0.9786183422327642,
   'min_child_weight': 8.192427077950514,
   'max_depth': 20,
   'impute_strategy': 'median',
   'percent_features': 0.6820907348177707},
  'score': 0.9639126305792973,
  'high_variance_cv': False,
  'scores': [0.9547325102880658, 0.9711934156378601, 0.9658119658119659],
  'all_objective_scores': [OrderedDict([('F1', 0.9547325102880658),
                ('Precision', 0.9354838709677419),
                ('Recall', 0.9547325102880658),
                ('AUC', 0.9853237069475678),
                ('Log Loss', 0.15021697619047605),
                ('MCC', 0.8759603969361893),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9711934156378601),
                ('Precision', 0.9516129032258065),
                ('Recall', 0.9711934156378601),
                ('AUC', 0.9950289975144987),
                ('Log Loss', 0.10607622409680564),
                ('MCC', 0.9216584956231404),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9658119658119659),
                ('Precision', 0.9826086956521739),
                ('Recall', 0.9658119658119659),
                ('AUC', 0.9834333733493397),
                ('Log Loss', 0.13131227825704234),
                ('MCC', 0.9112159507396058),
                ('# Training', 380),
                ('# Testing', 189)])],
  'training_time': 0.26775383949279785},
 8: {'id': 8,
  'pipeline_name': 'LogisticRegressionPipeline',
  'parameters': {'penalty': 'l2',
   'C': 0.5765626434012575,
   'impute_strategy': 'mean'},
  'score': 0.9805269796885542,
  'high_variance_cv': False,
  'scores': [0.9874476987447698, 0.9754098360655737, 0.9787234042553192],
  'all_objective_scores': [OrderedDict([('F1', 0.9874476987447698),
                ('Precision', 0.9833333333333333),
                ('Recall', 0.9874476987447698),
                ('AUC', 0.994910640312463),
                ('Log Loss', 0.08726565374201126),
                ('MCC', 0.9662335358054943),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9754098360655737),
                ('Precision', 0.952),
                ('Recall', 0.9754098360655737),
                ('AUC', 0.9979879275653923),
                ('Log Loss', 0.0764559127800754),
                ('MCC', 0.933568045604951),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9787234042553192),
                ('Precision', 0.9913793103448276),
                ('Recall', 0.9787234042553192),
                ('AUC', 0.9903961584633854),
                ('Log Loss', 0.09774553003325108),
                ('MCC', 0.9443109474170326),
                ('# Training', 380),
                ('# Testing', 189)])],
  'training_time': 7.57702112197876},
 9: {'id': 9,
  'pipeline_name': 'LogisticRegressionPipeline',
  'parameters': {'penalty': 'l2',
   'C': 8.123565600467177,
   'impute_strategy': 'median'},
  'score': 0.9748529087969783,
  'high_variance_cv': False,
  'scores': [0.9747899159663865, 0.9754098360655737, 0.9743589743589743],
  'all_objective_scores': [OrderedDict([('F1', 0.9747899159663865),
                ('Precision', 0.9747899159663865),
                ('Recall', 0.9747899159663865),
                ('AUC', 0.9886377086045686),
                ('Log Loss', 0.19170510282820305),
                ('MCC', 0.932536394839626),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9754098360655737),
                ('Precision', 0.952),
                ('Recall', 0.9754098360655737),
                ('AUC', 0.9850869925434962),
                ('Log Loss', 0.15159254810085362),
                ('MCC', 0.933568045604951),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9743589743589743),
                ('Precision', 0.991304347826087),
                ('Recall', 0.9743589743589743),
                ('AUC', 0.990516206482593),
                ('Log Loss', 0.11566930634571038),
                ('MCC', 0.9336637889421326),
                ('# Training', 380),
                ('# Testing', 189)])],
  'training_time': 7.280526161193848}}
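Since `clf.results` is a plain dict keyed by pipeline id, you can pull out whatever you need with ordinary Python. For example, to find the highest-scoring pipeline yourself (a sketch over a trimmed stand-in for the structure above, keeping only two entries and a few fields):

```python
# Trimmed stand-in for clf.results
results = {
    0: {"pipeline_name": "XGBoostPipeline",
        "score": 0.9651923054186028,
        "training_time": 0.248244047164917},
    8: {"pipeline_name": "LogisticRegressionPipeline",
        "score": 0.9805269796885542,
        "training_time": 7.57702112197876},
}

# Id of the highest-scoring pipeline (F1 here, so greater is better)
best_id = max(results, key=lambda i: results[i]["score"])
print(best_id, results[best_id]["pipeline_name"])
```

This reproduces what `clf.best_pipeline` surfaces for you, but the raw dict also exposes per-fold scores (`scores`, `all_objective_scores`) and training times for deeper analysis.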