Custom Pipelines in EvalML

EvalML pipelines are built from modular components: any number of transformers combined with an estimator. This lets you assemble a pipeline tailored to the needs of your data.

Requirements

A custom pipeline must adhere to the following requirements:

  1. Inherit from the proper pipeline base class

    • Binary classification - BinaryClassificationPipeline

    • Multiclass classification - MulticlassClassificationPipeline

    • Regression - RegressionPipeline

  2. Have a component_graph list as a class variable detailing the structure of the pipeline. Each component in the graph can be provided as either a string name or an instance.

Pipeline Configuration

Beyond these requirements, a custom pipeline can optionally set its own display name and its own hyperparameter ranges.

Custom Name

By default, a pipeline class’s name property is derived from the class name by inserting a space before each capitalized word. For example, LogisticRegressionPipeline.name returns ‘Logistic Regression Pipeline’. For this reason, we suggest that custom pipelines use Pascal case for their class names.

If you’d like to override the pipeline class’s name attribute so that it isn’t derived from the class name, set the custom_name attribute, like so:

[1]:
from evalml.pipelines import BinaryClassificationPipeline

class CustomPipeline(BinaryClassificationPipeline):
    component_graph = ['Simple Imputer', 'Logistic Regression Classifier']
    custom_name = 'A custom pipeline name'

print(CustomPipeline.name)
A custom pipeline name
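
The naming behaviour described above can be sketched in plain Python, without EvalML installed. This is a minimal illustration under stated assumptions, not EvalML's actual implementation: PipelineSketch, name_from_class and display_name are hypothetical names introduced only for this example, and the regular expression is one plausible way to split a Pascal-case name.

```python
import re


def name_from_class(class_name: str) -> str:
    """Insert a space wherever a lowercase letter or digit is followed by a capital."""
    # Zero-width match: nothing is consumed, a space is simply inserted at the boundary.
    return re.sub(r'(?<=[a-z0-9])(?=[A-Z])', ' ', class_name)


class PipelineSketch:
    """Hypothetical stand-in for a pipeline base class (not part of EvalML)."""

    custom_name = None  # subclasses may set this to override the derived name

    @classmethod
    def display_name(cls) -> str:
        # Prefer the explicit override; otherwise derive the name from the class name.
        return cls.custom_name or name_from_class(cls.__name__)


class LogisticRegressionPipeline(PipelineSketch):
    pass


class CustomPipeline(PipelineSketch):
    custom_name = 'A custom pipeline name'


print(LogisticRegressionPipeline.display_name())  # Logistic Regression Pipeline
print(CustomPipeline.display_name())              # A custom pipeline name
```

With this particular pattern, runs of consecutive capitals stay together (for instance, a class named XGBoostPipeline would become ‘XGBoost Pipeline’). How EvalML itself treats such names is not covered here.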

Custom Hyperparameters

To specify custom hyperparameter ranges, set the custom_hyperparameters attribute to a dictionary that maps each component name to a dictionary of parameter names and ranges. AutoML will use this dictionary to override the hyperparameter ranges collected from each component in the component graph; any parameter you do not list keeps its default range.

[2]:
from evalml.pipelines import BinaryClassificationPipeline

# Baseline: ranges are collected from each component in the graph.
class CustomPipeline(BinaryClassificationPipeline):
    component_graph = ['Simple Imputer', 'Logistic Regression Classifier']

print("Without custom hyperparameters:")
print(CustomPipeline.hyperparameters)

# Same graph, but the imputer's strategy is restricted to a single option.
class CustomPipeline(BinaryClassificationPipeline):
    component_graph = ['Simple Imputer', 'Logistic Regression Classifier']
    custom_hyperparameters = {
        'Simple Imputer': {
            'impute_strategy': ['most_frequent']
        }
    }

print()
print("With custom hyperparameters:")
print(CustomPipeline.hyperparameters)
Without custom hyperparameters:
{'Simple Imputer': {'impute_strategy': ['mean', 'median', 'most_frequent']}, 'Logistic Regression Classifier': {'penalty': ['l2'], 'C': Real(low=0.01, high=10, prior='uniform', transform='identity')}}

With custom hyperparameters:
{'Simple Imputer': {'impute_strategy': ['most_frequent']}, 'Logistic Regression Classifier': {'penalty': ['l2'], 'C': Real(low=0.01, high=10, prior='uniform', transform='identity')}}
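
The override seen in the output above, where only the listed parameter changes and every other range is kept, can be sketched as a per-component dictionary merge. This is a rough illustration rather than EvalML's own code: merge_hyperparameters is a hypothetical helper, and the tuple (0.01, 10) is just a stand-in for the Real(low=0.01, high=10) range object shown in the output.

```python
import copy


def merge_hyperparameters(defaults, custom):
    """Return the default ranges with any custom ranges layered on top, per component."""
    merged = copy.deepcopy(defaults)  # leave the caller's defaults untouched
    for component, params in (custom or {}).items():
        # Only the parameters named in `custom` are replaced; the rest keep their defaults.
        merged.setdefault(component, {}).update(copy.deepcopy(params))
    return merged


# Default ranges, mirroring the "Without custom hyperparameters" output.
defaults = {
    'Simple Imputer': {'impute_strategy': ['mean', 'median', 'most_frequent']},
    'Logistic Regression Classifier': {'penalty': ['l2'], 'C': (0.01, 10)},  # tuple is a stand-in
}

# Custom ranges, as in the custom_hyperparameters attribute.
custom = {'Simple Imputer': {'impute_strategy': ['most_frequent']}}

merged = merge_hyperparameters(defaults, custom)
print(merged['Simple Imputer'])                  # {'impute_strategy': ['most_frequent']}
print(merged['Logistic Regression Classifier'])  # unchanged from the defaults
```

Merging at the parameter level, rather than replacing a component's whole entry, is what allows a custom dictionary to narrow a single range without restating the others.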