OutliersDataCheck.validate
Checks whether any columns in a dataframe contain outliers, using the interquartile range (IQR) to detect column anomalies. Columns with anomalies are considered to contain outliers.
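To make the IQR idea concrete, here is a minimal sketch of a per-column IQR test. It is not the library's implementation: the helper name `iqr_outlier_columns` is hypothetical, and the fence multiplier `k = 1.5` (the common Tukey convention) is an assumption, since the text above does not state which threshold the check uses.

```python
import pandas as pd


def iqr_outlier_columns(df: pd.DataFrame, k: float = 1.5) -> list:
    """Return names of numeric columns with at least one value outside the IQR fences.

    Hypothetical helper for illustration only; k = 1.5 is an assumed multiplier.
    """
    flagged = []
    for name in df.select_dtypes(include="number").columns:
        col = df[name].dropna()
        q1, q3 = col.quantile(0.25), col.quantile(0.75)
        iqr = q3 - q1
        # Values below q1 - k*IQR or above q3 + k*IQR count as outliers.
        lower, upper = q1 - k * iqr, q3 + k * iqr
        if ((col < lower) | (col > upper)).any():
            flagged.append(name)
    return flagged


df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y": [6, 7, 8, 9, 10],
    "z": [-1, -2, -3, -1201, -4],
})
flagged = iqr_outlier_columns(df)
print(flagged)  # ['z']
```

For column `z`, the quartiles are -4 and -2, so the fences under this assumption are -7 and 1, and -1201 falls far below the lower fence.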
Parameters
X (ww.DataTable, pd.DataFrame, np.ndarray) – Features.
y (ww.DataColumn, pd.Series, np.ndarray) – Ignored.
Returns
A dictionary with warnings if any columns have outliers.
Return type
dict
Example
>>> import pandas as pd
>>> from evalml.data_checks import OutliersDataCheck
>>> df = pd.DataFrame({
...     'x': [1, 2, 3, 4, 5],
...     'y': [6, 7, 8, 9, 10],
...     'z': [-1, -2, -3, -1201, -4]
... })
>>> outliers_check = OutliersDataCheck()
>>> assert outliers_check.validate(df) == {
...     "warnings": [{"message": "Column(s) 'z' are likely to have outlier data.",
...                   "data_check_name": "OutliersDataCheck",
...                   "level": "warning",
...                   "code": "HAS_OUTLIERS",
...                   "details": {"columns": ["z"]}}],
...     "errors": []}
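
The returned dictionary can be consumed programmatically. The sketch below pulls the flagged column names out of a result shaped like the one in the example. The helper `columns_with_outliers` is hypothetical and not part of the library; it assumes only the keys visible in the example output (`warnings`, `code`, `details`, `columns`).

```python
def columns_with_outliers(result: dict) -> list:
    """Collect column names from every HAS_OUTLIERS warning in a validate() result.

    Hypothetical helper; relies only on the dictionary layout shown in the example.
    """
    columns = []
    for warning in result.get("warnings", []):
        if warning.get("code") == "HAS_OUTLIERS":
            columns.extend(warning.get("details", {}).get("columns", []))
    return columns


# Result literal copied from the example above.
result = {
    "warnings": [{
        "message": "Column(s) 'z' are likely to have outlier data.",
        "data_check_name": "OutliersDataCheck",
        "level": "warning",
        "code": "HAS_OUTLIERS",
        "details": {"columns": ["z"]},
    }],
    "errors": [],
}
flagged = columns_with_outliers(result)
print(flagged)  # ['z']
```

Filtering on the `code` field rather than the message text keeps the helper stable if the wording of the warning changes.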