HighlyNullDataCheck.validate
Checks if there are any highly-null columns in the input.
Parameters
X (pd.DataFrame, pd.Series, np.array, list) – Features to check.
y – Ignored.
Returns
A list with a DataCheckWarning if there are any highly-null columns.
Return type
list (DataCheckWarning)
Example
>>> df = pd.DataFrame({
...     'lots_of_null': [None, None, None, None, 5],
...     'no_null': [1, 2, 3, 4, 5]
... })
>>> null_check = HighlyNullDataCheck(pct_null_threshold=0.8)
>>> assert null_check.validate(df) == [DataCheckWarning("Column 'lots_of_null' is 80.0% or more null", "HighlyNullDataCheck")]
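
For intuition about what the check flags, here is a minimal sketch in plain pandas. It is not the library's implementation: the helper name `highly_null_columns` is hypothetical, and the inclusive `>=` comparison is an assumption inferred from the "80.0% or more null" wording in the example above.

```python
import pandas as pd


def highly_null_columns(X, pct_null_threshold=0.8):
    """Hypothetical helper: names of columns whose null fraction meets the threshold."""
    X = pd.DataFrame(X)  # accept DataFrame-like inputs such as dicts or lists
    null_fraction = X.isnull().mean()  # per-column fraction of null entries
    # Assumption: the comparison is inclusive ("80.0% or more null").
    return [col for col, frac in null_fraction.items() if frac >= pct_null_threshold]


df = pd.DataFrame({
    'lots_of_null': [None, None, None, None, 5],
    'no_null': [1, 2, 3, 4, 5]
})
flagged = highly_null_columns(df, pct_null_threshold=0.8)
print(flagged)  # ['lots_of_null']
```

In this sketch, 'lots_of_null' has four null entries out of five (a fraction of 0.8), so it meets the 0.8 threshold, while 'no_null' has a null fraction of 0 and is not reported.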