Exploring search results¶
After finishing a pipeline search, we can inspect the results. First, let’s run a search of five different pipelines to explore.
[1]:
import evalml
from evalml import AutoClassificationSearch
X, y = evalml.demos.load_breast_cancer()
automl = AutoClassificationSearch(objective="f1",
max_pipelines=5)
automl.search(X, y)
*****************************
* Beginning pipeline search *
*****************************
Optimizing for F1. Greater score is better.
Searching up to 5 pipelines.
✔ XGBoost Binary Classification Pipel... 20%|██ | Elapsed:00:05
✔ Random Forest Binary Classification... 40%|████ | Elapsed:00:18
✔ Logistic Regression Binary Pipeline: 60%|██████ | Elapsed:00:19
✔ XGBoost Binary Classification Pipel... 80%|████████ | Elapsed:00:26
✔ XGBoost Binary Classification Pipel... 100%|██████████| Elapsed:00:32
✔ Optimization finished 100%|██████████| Elapsed:00:32
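The search above optimizes for F1, where a greater score is better. For reference, F1 is the harmonic mean of precision and recall; a minimal pure-Python sketch of the computation (independent of EvalML, with illustrative counts):

```python
def f1_score(tp, fp, fn):
    """F1 from raw counts: harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: 90 true positives, 5 false positives, 10 false negatives
print(round(f1_score(tp=90, fp=5, fn=10), 3))  # → 0.923
```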
View Rankings¶
A summary of all the pipelines built can be returned as a pandas DataFrame, sorted by score. EvalML knows, based on our objective function, whether a higher or lower score is better.
[2]:
automl.rankings
[2]:
| | id | pipeline_name | score | high_variance_cv | parameters |
|---|---|---|---|---|---|
| 0 | 2 | Logistic Regression Binary Pipeline | 0.982042 | False | {'impute_strategy': 'mean', 'penalty': 'l2', '... |
| 1 | 0 | XGBoost Binary Classification Pipeline | 0.976191 | False | {'impute_strategy': 'most_frequent', 'percent_... |
| 2 | 1 | Random Forest Binary Classification Pipeline | 0.958032 | False | {'impute_strategy': 'median', 'percent_feature... |
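Since the rankings are an ordinary pandas DataFrame, standard filtering and sorting apply. A sketch using illustrative data that mirrors the table above (values copied from the output, not fetched from a live search):

```python
import pandas as pd

# Illustrative data mirroring the rankings output above
rankings = pd.DataFrame({
    "id": [2, 0, 1],
    "pipeline_name": [
        "Logistic Regression Binary Pipeline",
        "XGBoost Binary Classification Pipeline",
        "Random Forest Binary Classification Pipeline",
    ],
    "score": [0.982042, 0.976191, 0.958032],
    "high_variance_cv": [False, False, False],
})

# Keep only pipelines scoring above 0.97 on the objective
strong = rankings[rankings["score"] > 0.97]
print(strong["pipeline_name"].tolist())
```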
Describe Pipeline¶
Each pipeline is given an id. We can get more information about any particular pipeline using that id. Here, we will get more information about the pipeline with id = 0.
[3]:
automl.describe_pipeline(0)
******************************************
* XGBoost Binary Classification Pipeline *
******************************************
Problem Type: Binary Classification
Model Family: XGBoost
Number of features: 25
Pipeline Steps
==============
1. One Hot Encoder
* top_n : 10
2. Simple Imputer
* impute_strategy : most_frequent
* fill_value : None
3. RF Classifier Select From Model
* percent_features : 0.8487792213962843
* threshold : -inf
4. XGBoost Classifier
* eta : 0.38438170729269994
* max_depth : 7
* min_child_weight : 1.5104167958569887
* n_estimators : 397
Training
========
Training for Binary Classification problems.
Total training time (including CV): 5.4 seconds
Cross Validation
----------------
F1 Accuracy Binary Balanced Accuracy Binary Precision Recall AUC Log Loss Binary MCC Binary # Training # Testing
0 0.962 0.953 0.954 0.974 0.950 0.988 0.138 0.900 379.000 190.000
1 0.979 0.974 0.965 0.960 1.000 0.997 0.071 0.945 379.000 190.000
2 0.987 0.984 0.982 0.983 0.992 0.997 0.075 0.966 380.000 189.000
mean 0.976 0.970 0.967 0.972 0.980 0.994 0.095 0.937 - -
std 0.013 0.016 0.014 0.012 0.027 0.005 0.037 0.034 - -
coef of var 0.013 0.017 0.015 0.012 0.028 0.006 0.395 0.036 - -
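The mean, std, and coef of var rows follow directly from the per-fold scores. For example, for the F1 column (a pure-Python sketch using the fold values from the table above):

```python
from statistics import mean, stdev

# Per-fold F1 scores from the cross-validation table above
f1_folds = [0.962, 0.979, 0.987]

f1_mean = mean(f1_folds)
f1_std = stdev(f1_folds)          # sample standard deviation
coef_of_var = f1_std / f1_mean    # std relative to the mean

print(round(f1_mean, 3), round(f1_std, 3), round(coef_of_var, 3))
# → 0.976 0.013 0.013
```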
Get Pipeline¶
We can also get the object of any pipeline via its id:
[4]:
automl.get_pipeline(0)
[4]:
<evalml.pipelines.classification.xgboost_binary.XGBoostBinaryPipeline at 0x7fb4afd23b00>
Get best pipeline¶
If we specifically want the best pipeline, there is a convenient accessor:
[5]:
automl.best_pipeline
[5]:
<evalml.pipelines.classification.logistic_regression_binary.LogisticRegressionBinaryPipeline at 0x7fb4af46ef28>
Feature Importances¶
We can get the feature importances of the resulting pipeline:
[6]:
pipeline = automl.get_pipeline(0)
pipeline.feature_importances
[6]:
| | feature | importance |
|---|---|---|
| 0 | mean concave points | 0.465049 |
| 1 | worst concave points | 0.246494 |
| 2 | worst radius | 0.089427 |
| 3 | worst area | 0.045472 |
| 4 | mean texture | 0.029848 |
| 5 | worst concavity | 0.020971 |
| 6 | area error | 0.020298 |
| 7 | radius error | 0.018571 |
| 8 | worst texture | 0.014910 |
| 9 | worst smoothness | 0.010209 |
| 10 | mean area | 0.006383 |
| 11 | mean concavity | 0.004976 |
| 12 | mean smoothness | 0.004681 |
| 13 | worst perimeter | 0.004660 |
| 14 | worst symmetry | 0.004073 |
| 15 | concavity error | 0.003436 |
| 16 | mean compactness | 0.003422 |
| 17 | worst fractal dimension | 0.002782 |
| 18 | smoothness error | 0.001911 |
| 19 | fractal dimension error | 0.001905 |
| 20 | symmetry error | 0.000420 |
| 21 | perimeter error | 0.000101 |
| 22 | mean radius | 0.000000 |
| 23 | mean perimeter | 0.000000 |
| 24 | worst compactness | 0.000000 |
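Because the feature importances come back as a DataFrame, standard pandas filtering applies; for instance, dropping the zero-importance features. A sketch with a few illustrative rows drawn from the table above:

```python
import pandas as pd

# A few rows mirroring the feature importance output above
fi = pd.DataFrame({
    "feature": ["mean concave points", "worst concave points",
                "mean radius", "worst compactness"],
    "importance": [0.465049, 0.246494, 0.0, 0.0],
})

# Features the model actually relied on
used = fi[fi["importance"] > 0]
print(used["feature"].tolist())
```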
We can also create a bar plot of the feature importances:
[7]:
pipeline.graph_feature_importance()
Access raw results¶
You can also get access to all the underlying data, like this:
[8]:
automl.results
[8]:
{'pipeline_results': {0: {'id': 0,
'pipeline_name': 'XGBoost Binary Classification Pipeline',
'pipeline_summary': 'XGBoost Classifier w/ One Hot Encoder + Simple Imputer + RF Classifier Select From Model',
'parameters': {'impute_strategy': 'most_frequent',
'percent_features': 0.8487792213962843,
'threshold': -inf,
'eta': 0.38438170729269994,
'max_depth': 7,
'min_child_weight': 1.5104167958569887,
'n_estimators': 397},
'score': 0.9761912315723671,
'high_variance_cv': False,
'training_time': 5.410717964172363,
'cv_data': [{'all_objective_scores': OrderedDict([('F1',
0.9617021276595743),
('Accuracy Binary', 0.9526315789473684),
('Balanced Accuracy Binary', 0.9536631554030062),
('Precision', 0.9741379310344828),
('Recall', 0.9495798319327731),
('AUC', 0.9876908509882827),
('Log Loss Binary', 0.13808748615334288),
('MCC Binary', 0.9001633057441626),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9617021276595743},
{'all_objective_scores': OrderedDict([('F1', 0.9794238683127572),
('Accuracy Binary', 0.9736842105263158),
('Balanced Accuracy Binary', 0.9647887323943662),
('Precision', 0.9596774193548387),
('Recall', 1.0),
('AUC', 0.9973961415552136),
('Log Loss Binary', 0.07131786501827025),
('MCC Binary', 0.9445075449666159),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9794238683127572},
{'all_objective_scores': OrderedDict([('F1', 0.9874476987447698),
('Accuracy Binary', 0.9841269841269841),
('Balanced Accuracy Binary', 0.9815126050420169),
('Precision', 0.9833333333333333),
('Recall', 0.9915966386554622),
('AUC', 0.996998799519808),
('Log Loss Binary', 0.07531116866342562),
('MCC Binary', 0.9659285184801715),
('# Training', 380),
('# Testing', 189)]),
'score': 0.9874476987447698}]},
1: {'id': 1,
'pipeline_name': 'Random Forest Binary Classification Pipeline',
'pipeline_summary': 'Random Forest Classifier w/ One Hot Encoder + Simple Imputer + RF Classifier Select From Model',
'parameters': {'impute_strategy': 'median',
'percent_features': 0.8140470414877383,
'threshold': 'mean',
'n_estimators': 859,
'max_depth': 6},
'score': 0.9580315415303952,
'high_variance_cv': False,
'training_time': 12.62360143661499,
'cv_data': [{'all_objective_scores': OrderedDict([('F1',
0.9361702127659575),
('Accuracy Binary', 0.9210526315789473),
('Balanced Accuracy Binary', 0.9199313528228192),
('Precision', 0.9482758620689655),
('Recall', 0.9243697478991597),
('AUC', 0.9766836311989585),
('Log Loss Binary', 0.20455160484518806),
('MCC Binary', 0.833232300751445),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9361702127659575},
{'all_objective_scores': OrderedDict([('F1', 0.9672131147540983),
('Accuracy Binary', 0.9578947368421052),
('Balanced Accuracy Binary', 0.9465025446798438),
('Precision', 0.944),
('Recall', 0.9915966386554622),
('AUC', 0.9838442419221209),
('Log Loss Binary', 0.14826817405619716),
('MCC Binary', 0.9106361866954563),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9672131147540983},
{'all_objective_scores': OrderedDict([('F1', 0.9707112970711297),
('Accuracy Binary', 0.9629629629629629),
('Balanced Accuracy Binary', 0.9588235294117646),
('Precision', 0.9666666666666667),
('Recall', 0.9747899159663865),
('AUC', 0.9942376950780312),
('Log Loss Binary', 0.10344817959803934),
('MCC Binary', 0.9204135621119959),
('# Training', 380),
('# Testing', 189)]),
'score': 0.9707112970711297}]},
2: {'id': 2,
'pipeline_name': 'Logistic Regression Binary Pipeline',
'pipeline_summary': 'Logistic Regression Classifier w/ One Hot Encoder + Simple Imputer + Standard Scaler',
'parameters': {'impute_strategy': 'mean',
'penalty': 'l2',
'C': 0.21198179042885398},
'score': 0.9820415596969072,
'high_variance_cv': False,
'training_time': 1.3488342761993408,
'cv_data': [{'all_objective_scores': OrderedDict([('F1', 0.979253112033195),
('Accuracy Binary', 0.9736842105263158),
('Balanced Accuracy Binary', 0.9676293052432241),
('Precision', 0.9672131147540983),
('Recall', 0.9915966386554622),
('AUC', 0.9904130666351048),
('Log Loss Binary', 0.10058063355386729),
('MCC Binary', 0.943843520216036),
('# Training', 379),
('# Testing', 190)]),
'score': 0.979253112033195},
{'all_objective_scores': OrderedDict([('F1', 0.9794238683127572),
('Accuracy Binary', 0.9736842105263158),
('Balanced Accuracy Binary', 0.9647887323943662),
('Precision', 0.9596774193548387),
('Recall', 1.0),
('AUC', 0.9989347851816782),
('Log Loss Binary', 0.07682029301742287),
('MCC Binary', 0.9445075449666159),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9794238683127572},
{'all_objective_scores': OrderedDict([('F1', 0.9874476987447698),
('Accuracy Binary', 0.9841269841269841),
('Balanced Accuracy Binary', 0.9815126050420169),
('Precision', 0.9833333333333333),
('Recall', 0.9915966386554622),
('AUC', 0.997358943577431),
('Log Loss Binary', 0.08090403408994591),
('MCC Binary', 0.9659285184801715),
('# Training', 380),
('# Testing', 189)]),
'score': 0.9874476987447698}]},
3: {'id': 3,
'pipeline_name': 'XGBoost Binary Classification Pipeline',
'pipeline_summary': 'XGBoost Classifier w/ One Hot Encoder + Simple Imputer + RF Classifier Select From Model',
'parameters': {'impute_strategy': 'most_frequent',
'percent_features': 0.14894727260851873,
'threshold': -inf,
'eta': 0.4736080452737106,
'max_depth': 18,
'min_child_weight': 5.153314260276387,
'n_estimators': 660},
'score': 0.941255546698183,
'high_variance_cv': False,
'training_time': 6.667321681976318,
'cv_data': [{'all_objective_scores': OrderedDict([('F1',
0.9264069264069265),
('Accuracy Binary', 0.9105263157894737),
('Balanced Accuracy Binary', 0.9143685643271393),
('Precision', 0.9553571428571429),
('Recall', 0.8991596638655462),
('AUC', 0.9715942715114214),
('Log Loss Binary', 0.2351054900534157),
('MCC Binary', 0.8150103776135726),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9264069264069265},
{'all_objective_scores': OrderedDict([('F1', 0.9482071713147411),
('Accuracy Binary', 0.9315789473684211),
('Balanced Accuracy Binary', 0.9084507042253521),
('Precision', 0.9015151515151515),
('Recall', 1.0),
('AUC', 0.9784589892294946),
('Log Loss Binary', 0.18131056035061574),
('MCC Binary', 0.858166066103978),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9482071713147411},
{'all_objective_scores': OrderedDict([('F1', 0.9491525423728814),
('Accuracy Binary', 0.9365079365079365),
('Balanced Accuracy Binary', 0.9348739495798319),
('Precision', 0.9572649572649573),
('Recall', 0.9411764705882353),
('AUC', 0.9841536614645858),
('Log Loss Binary', 0.16492396169563844),
('MCC Binary', 0.8648817040445186),
('# Training', 380),
('# Testing', 189)]),
'score': 0.9491525423728814}]},
4: {'id': 4,
'pipeline_name': 'XGBoost Binary Classification Pipeline',
'pipeline_summary': 'XGBoost Classifier w/ One Hot Encoder + Simple Imputer + RF Classifier Select From Model',
'parameters': {'impute_strategy': 'mean',
'percent_features': 0.6435218111142487,
'threshold': 'mean',
'eta': 0.9446689170495841,
'max_depth': 11,
'min_child_weight': 4.731957459914713,
'n_estimators': 676},
'score': 0.9486606279409701,
'high_variance_cv': False,
'training_time': 6.609421014785767,
'cv_data': [{'all_objective_scores': OrderedDict([('F1',
0.9210526315789473),
('Accuracy Binary', 0.9052631578947369),
('Balanced Accuracy Binary', 0.9130074565037283),
('Precision', 0.963302752293578),
('Recall', 0.8823529411764706),
('AUC', 0.975085808971476),
('Log Loss Binary', 0.2385086150043016),
('MCC Binary', 0.8080435814236837),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9210526315789473},
{'all_objective_scores': OrderedDict([('F1', 0.9709543568464729),
('Accuracy Binary', 0.9631578947368421),
('Balanced Accuracy Binary', 0.9563853710498283),
('Precision', 0.9590163934426229),
('Recall', 0.9831932773109243),
('AUC', 0.9697597348798673),
('Log Loss Binary', 0.13901819948468505),
('MCC Binary', 0.9211492315750531),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9709543568464729},
{'all_objective_scores': OrderedDict([('F1', 0.9539748953974896),
('Accuracy Binary', 0.9417989417989417),
('Balanced Accuracy Binary', 0.9361344537815126),
('Precision', 0.95),
('Recall', 0.957983193277311),
('AUC', 0.9845738295318127),
('Log Loss Binary', 0.13538144654258666),
('MCC Binary', 0.8748986057438203),
('# Training', 380),
('# Testing', 189)]),
'score': 0.9539748953974896}]}},
'search_order': [0, 1, 2, 3, 4]}
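This dictionary can be traversed programmatically, e.g. to pull out each pipeline's score or recover the best id. A sketch against a trimmed-down dict with the same shape as the output above (only a few keys kept for brevity):

```python
# Trimmed-down dict mirroring the structure of automl.results above
results = {
    "pipeline_results": {
        0: {"pipeline_name": "XGBoost Binary Classification Pipeline",
            "score": 0.9761912315723671},
        1: {"pipeline_name": "Random Forest Binary Classification Pipeline",
            "score": 0.9580315415303952},
        2: {"pipeline_name": "Logistic Regression Binary Pipeline",
            "score": 0.9820415596969072},
    },
    "search_order": [0, 1, 2],
}

# Find the id of the highest-scoring pipeline (greater F1 is better)
best_id = max(results["pipeline_results"],
              key=lambda i: results["pipeline_results"][i]["score"])
print(best_id, results["pipeline_results"][best_id]["pipeline_name"])
# → 2 Logistic Regression Binary Pipeline
```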