Explain
Helpers around feature importance, SHAP, and early-stopping configuration.
from catboost_utils.explain import feature_importance, shap_values, check_early_stopping
fi = feature_importance(model, X)
sv = shap_values(model, X)
check_early_stopping(model, eval_set=(X_val, y_val))
feature_importance returns a sorted DataFrame with named features. Names are resolved in this order: data → Pool → model.feature_names_ → positional fallback. DataFrame inputs are automatically converted to a Pool using model.get_cat_feature_indices(), so categorical columns do not raise errors.
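The name-resolution order described above can be illustrated with a small standalone sketch. It is not the library's implementation: `resolve_feature_names` is a hypothetical helper, the length check is an assumption, and the `f0, f1, ...` positional naming scheme is also an assumption made for the example.

```python
from typing import List, Optional, Sequence


def resolve_feature_names(
    n_features: int,
    data_columns: Optional[Sequence[str]] = None,
    pool_names: Optional[Sequence[str]] = None,
    model_names: Optional[Sequence[str]] = None,
) -> List[str]:
    """Return the first usable name list, else positional names (hypothetical helper)."""
    # Try the sources in the documented order: data, then Pool, then model.
    for candidate in (data_columns, pool_names, model_names):
        # Assumption: a candidate is usable only if it matches the feature count.
        if candidate is not None and len(candidate) == n_features:
            return [str(name) for name in candidate]
    # Positional fallback; the "f<i>" naming scheme is an assumption.
    return [f"f{i}" for i in range(n_features)]


# Names known only to the model are still picked up.
print(resolve_feature_names(3, model_names=["age", "city", "income"]))
# No names anywhere: fall back to positions.
print(resolve_feature_names(2))
```

Trying the sources in a fixed order keeps the result predictable: names supplied with the data take priority over names stored on the model.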
shap_values returns a DataFrame with one column per feature plus expected_value. It pre-emptively detects the "non-zero approx for zero-weight leaf" misconfiguration and raises a clear CBXError. Multiclass SHAP is not yet supported; use the raw API for that.
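To make the output layout concrete, here is a minimal sketch of how a raw SHAP matrix maps onto named columns. It assumes the raw layout CatBoost uses for non-multiclass models, where each row holds one value per feature followed by the expected value. `shap_matrix_to_columns` is a hypothetical helper, and a plain dict of columns stands in for the DataFrame so the sketch needs no third-party packages.

```python
from typing import Dict, List, Sequence


def shap_matrix_to_columns(
    matrix: Sequence[Sequence[float]],
    feature_names: Sequence[str],
) -> Dict[str, List[float]]:
    """Split a (n_rows, n_features + 1) matrix into named columns (hypothetical helper)."""
    names = [*feature_names, "expected_value"]
    columns: Dict[str, List[float]] = {name: [] for name in names}
    for row in matrix:
        if len(row) != len(names):
            raise ValueError("each row needs one value per feature plus the expected value")
        for name, value in zip(names, row):
            columns[name].append(value)
    return columns


# Toy numbers, not real model output.
raw = [[0.5, -0.25, 1.0], [0.0, 0.25, 1.0]]
cols = shap_matrix_to_columns(raw, ["age", "income"])
print(cols)

# Per row, the contributions plus expected_value add up to the raw prediction.
print([sum(row) for row in raw])
```

Keeping expected_value as the last column means a row-wise sum over the whole frame reproduces the model's raw (pre-link) prediction for that row.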
check_early_stopping raises CBXError when any of od_type, od_wait, od_pval, or early_stopping_rounds is set without an eval_set.
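The rule above can be sketched as a check over a parameter dictionary. This is an illustration under stated assumptions, not the library's code: `check_params` is a hypothetical function that takes a plain dict (such as one obtained from a model's `get_params()`), and the `CBXError` class below is a local stand-in so the example runs on its own.

```python
from typing import Any, Mapping, Optional

# The four parameters named in the description above.
EARLY_STOPPING_KEYS = ("od_type", "od_wait", "od_pval", "early_stopping_rounds")


class CBXError(Exception):
    """Local stand-in for the library's CBXError."""


def check_params(params: Mapping[str, Any], eval_set: Optional[Any] = None) -> None:
    """Raise if an early-stopping parameter is set without an eval_set (hypothetical)."""
    # Assumption: a parameter counts as "set" when it is present and not None.
    set_keys = [key for key in EARLY_STOPPING_KEYS if params.get(key) is not None]
    if set_keys and eval_set is None:
        raise CBXError(f"{', '.join(set_keys)} set without an eval_set")


# Passes: no early-stopping parameter is set.
check_params({"iterations": 100})

# Fails: od_wait is set but no eval_set was given.
try:
    check_params({"iterations": 100, "od_wait": 20})
except CBXError as exc:
    print(f"CBXError: {exc}")
```

Running the check before fit() surfaces the misconfiguration immediately instead of after training has started.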
catboost_utils.explain.importance.feature_importance
feature_importance(
model: Any,
data: Any | None = None,
*,
type: ImportanceType = "PredictionValuesChange",
prettified: bool = True,
) -> pd.DataFrame | npt.NDArray[Any]
Return feature importances with feature names attached.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | Any | a fitted CatBoost model (or CBXClassifier/CBXRegressor). | required |
| data | Any \| None | | None |
| type | ImportanceType | CatBoost importance type. | 'PredictionValuesChange' |
| prettified | bool | when … | True |
Returns:
| Type | Description |
|---|---|
| DataFrame \| NDArray[Any] | DataFrame or ndarray depending on prettified. |
Raises:
| Type | Description |
|---|---|
| CBXError | when … |
catboost_utils.explain.shap.shap_values
Compute SHAP values and return them as a named DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | Any | a fitted CatBoost / CBX model. | required |
| data | Any | | required |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame with one column per feature plus expected_value (the bias / base value) as the last column. Rows correspond to the input rows. |
Raises:
| Type | Description |
|---|---|
| CBXError | if data is None, if the model is multiclass (not supported in v0.1), or if CatBoost reports the well-known "non-zero approx for zero-weight leaf" misconfiguration. |
catboost_utils.explain.early_stopping.check_early_stopping
Validate early-stopping configuration before fit().
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | Any | a CatBoost / CBX model instance. | required |
| eval_set | Any \| None | the eval_set you intend to pass to fit(). | None |
Raises:
| Type | Description |
|---|---|
| CBXError | when any early-stopping parameter is set but eval_set is None. |