Regression Example
[1]:
import evalml
from evalml import AutoMLSearch

# Load the diabetes regression demo dataset
X, y = evalml.demos.load_diabetes()

# Search up to 5 pipelines, ranking them by R2
automl = AutoMLSearch(problem_type='regression', objective="R2", max_pipelines=5)
automl.search(X, y)
Generating pipelines to search over...
*****************************
* Beginning pipeline search *
*****************************
Optimizing for R2.
Greater score is better.
Searching up to 5 pipelines.
Allowed model families: xgboost, catboost, linear_model, random_forest
✔ Mean Baseline Regression Pipeline: 0%| | Elapsed:00:00
✔ CatBoost Regressor w/ Simple Imputer: 20%|██ | Elapsed:00:03
✔ Linear Regressor w/ Simple Imputer ... 40%|████ | Elapsed:00:03
✔ Random Forest Regressor w/ Simple I... 60%|██████ | Elapsed:00:04
✔ XGBoost Regressor w/ Simple Imputer: 80%|████████ | Elapsed:00:04
✔ Optimization finished 80%|████████ | Elapsed:00:04
[2]:
automl.rankings
[2]:
| | id | pipeline_name | score | high_variance_cv | parameters |
|---|---|---|---|---|---|
| 0 | 2 | Linear Regressor w/ Simple Imputer + Standard ... | 0.488703 | False | {'Simple Imputer': {'impute_strategy': 'most_f... |
| 1 | 1 | CatBoost Regressor w/ Simple Imputer | 0.446477 | False | {'Simple Imputer': {'impute_strategy': 'most_f... |
| 2 | 3 | Random Forest Regressor w/ Simple Imputer | 0.441420 | False | {'Simple Imputer': {'impute_strategy': 'most_f... |
| 3 | 4 | XGBoost Regressor w/ Simple Imputer | 0.331082 | False | {'Simple Imputer': {'impute_strategy': 'most_f... |
| 4 | 0 | Mean Baseline Regression Pipeline | -0.004217 | False | {'Baseline Regressor': {'strategy': 'mean'}} |
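The Mean Baseline Regression Pipeline scores close to zero because R2 compares a model's squared error against the squared error of always predicting the mean. The following is a minimal sketch in plain Python, not EvalML's implementation; the helper `r2_score` and the toy values are hypothetical and used only for illustration.

```python
def r2_score(y_true, y_pred):
    """R2 = 1 - SS_res / SS_tot (hypothetical helper, not part of EvalML)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Toy target values, made up for illustration only.
y_toy = [3.0, 5.0, 7.0, 9.0]

# A "mean baseline" predicts the average target for every row.
mean_prediction = [sum(y_toy) / len(y_toy)] * len(y_toy)

baseline_r2 = r2_score(y_toy, mean_prediction)  # error equals total variance
perfect_r2 = r2_score(y_toy, y_toy)             # no error at all
print(baseline_r2, perfect_r2)
```

In cross-validation the mean is learned on the training folds and scored on held-out data, so the baseline's R2 can land slightly below zero, as it does in the rankings above.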
[3]:
automl.best_pipeline
[3]:
<evalml.pipelines.utils.make_pipeline.<locals>.GeneratedPipeline at 0x7fba225cb7d0>
[4]:
automl.get_pipeline(0)
[4]:
<evalml.pipelines.regression.baseline_regression.MeanBaselineRegressionPipeline at 0x7fba2153c450>
[5]:
automl.describe_pipeline(0)
*************************************
* Mean Baseline Regression Pipeline *
*************************************
Problem Type: Regression
Model Family: Baseline
Pipeline Steps
==============
1. Baseline Regressor
* strategy : mean
Training
========
Training for Regression problems.
Total training time (including CV): 0.0 seconds
Cross Validation
----------------
| | R2 | Root Mean Squared Error | MAE | MSE | MedianAE | MaxError | ExpVariance | # Training | # Testing |
|---|---|---|---|---|---|---|---|---|---|
| 0 | -0.007 | 75.863 | 63.324 | 5755.216 | 57.190 | 186.810 | -0.000 | 294.0 | 148.0 |
| 1 | -0.000 | 79.654 | 68.759 | 6344.747 | 67.966 | 193.966 | 0.000 | 295.0 | 147.0 |
| 2 | -0.006 | 75.705 | 65.485 | 5731.187 | 63.817 | 170.817 | -0.000 | 295.0 | 147.0 |
| mean | -0.004 | 77.074 | 65.856 | 5943.717 | 62.991 | 183.864 | -0.000 | - | - |
| std | 0.004 | 2.236 | 2.736 | 347.510 | 5.435 | 11.852 | 0.000 | - | - |
| coef of var | -0.866 | 0.029 | 0.042 | 0.058 | 0.086 | 0.064 | -0.866 | - | - |
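The summary rows can be reproduced from the per-fold values. The sketch below uses only Python's standard library and assumes, based on the printed numbers rather than on EvalML's source, that `std` is the sample standard deviation and `coef of var` is `std / mean`. It recomputes the Root Mean Squared Error column:

```python
import statistics

# Per-fold Root Mean Squared Error values copied from the table above.
rmse_folds = [75.863, 79.654, 75.705]

rmse_mean = statistics.mean(rmse_folds)
rmse_std = statistics.stdev(rmse_folds)  # sample standard deviation (n - 1)
rmse_cv = rmse_std / rmse_mean           # coefficient of variation

print(f"mean={rmse_mean:.3f} std={rmse_std:.3f} coef of var={rmse_cv:.3f}")
```

Rounded to three decimals, these values agree with the `mean`, `std`, and `coef of var` rows for that column. A small coefficient of variation suggests the score is stable across folds.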