
Pipeline

Drop-in replacements for CatBoostClassifier and CatBoostRegressor with the CBX layer built in. They can be used as a step in a scikit-learn Pipeline:

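from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from catboost_utils import CBXRegressor

# X (features) and y (target) are your training data. X must be all-numeric
# here, because StandardScaler cannot scale categorical columns.
pipe = Pipeline([
    ("scaler", StandardScaler()),             # standardize numeric features
    ("model", CBXRegressor(iterations=100)),  # iterations is forwarded to CatBoost
])
pipe.fit(X, y)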

CBX-specific parameters

  • auto_cat_features: bool = True. Infers categorical features from pandas dtypes (object, category, string, bool) at fit time. Set to False to disable inference.
  • nan_fill: str | dict[str, str] | None = None. Explicit NaN handling for categorical features. A str applies the same fill value to every categorical column; a dict maps column names to per-column fill values; None disables auto-fill.
  • early_stopping: "auto" | None = None. When set to "auto" and an eval_set is provided, sets od_type="Iter", od_wait=50, and use_best_model=True. Raises CBXError if set to "auto" without an eval_set.
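
The nan_fill and early_stopping rules listed above can be sketched in plain Python. This is an illustration of the documented behavior, not the library's implementation: the helper names resolve_nan_fill and resolve_early_stopping, the local CBXError stand-in, and the handling of unrecognized early_stopping values are all assumptions made for the example.

```python
from typing import Any, Dict, List, Optional, Union

NanFill = Union[str, Dict[str, str], None]


class CBXError(ValueError):
    """Local stand-in for the library's CBXError (its real base class is an assumption)."""


def resolve_nan_fill(nan_fill: NanFill, cat_columns: List[str]) -> Dict[str, str]:
    """Expand a nan_fill spec into a column -> fill value mapping."""
    if nan_fill is None:
        return {}  # None: no auto-fill
    if isinstance(nan_fill, str):
        # str: the same fill value for every categorical column
        return {col: nan_fill for col in cat_columns}
    # dict: per-column mapping, used as given (columns not listed stay unfilled,
    # which is an assumption of this sketch)
    return dict(nan_fill)


def resolve_early_stopping(early_stopping: Optional[str], eval_set: Any) -> Dict[str, Any]:
    """Translate early_stopping into the CatBoost parameters it implies."""
    if early_stopping is None:
        return {}
    if early_stopping != "auto":
        # Assumption: any other value is rejected.
        raise CBXError(f"unsupported early_stopping value: {early_stopping!r}")
    if eval_set is None:
        raise CBXError('early_stopping="auto" requires an eval_set')
    return {"od_type": "Iter", "od_wait": 50, "use_best_model": True}


print(resolve_nan_fill("missing", ["city", "plan"]))
# {'city': 'missing', 'plan': 'missing'}
```

Resolving both options into plain dictionaries before training keeps the checks (such as the missing eval_set error) separate from the call into CatBoost itself.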

sklearn compatibility

Tested with Pipeline, GridSearchCV, cross_val_score, and clone(). catboost.Pool inputs are passed through unchanged.
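
GridSearchCV and cross_val_score both rely on clone() to build a fresh, unfitted copy of an estimator from its constructor parameters. The sketch below shows that contract with a deliberately tiny estimator. MeanRegressor is a hypothetical stand-in written for this example, not part of catboost_utils, and it says nothing about how the CBX classes implement get_params internally.

```python
from sklearn.base import BaseEstimator, RegressorMixin, clone


class MeanRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical stand-in estimator; not part of catboost_utils."""

    def __init__(self, *, shift=0.0):
        # Store constructor arguments unchanged so get_params() can report them.
        self.shift = shift

    def fit(self, X, y):
        # Fitted state lives in attributes with a trailing underscore.
        self.mean_ = sum(y) / len(y) + self.shift
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


fitted = MeanRegressor(shift=1.0).fit([[0], [1]], [2.0, 4.0])
fresh = clone(fitted)  # same constructor parameters, no fitted state

print(fresh.get_params())       # {'shift': 1.0}
print(hasattr(fresh, "mean_"))  # False
```

An estimator that survives clone() this way can be refit independently on each cross-validation fold or grid point.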

catboost_utils.pipeline.classifier.CBXClassifier

CBXClassifier(
    *,
    auto_cat_features: bool = True,
    nan_fill: NanFill = None,
    early_stopping: EarlyStopping = None,
    **catboost_params: Any,
)

Bases: _CBXMixin, CatBoostClassifier

CBX-enhanced CatBoostClassifier. See SPEC.md §Module 4.

catboost_utils.pipeline.regressor.CBXRegressor

CBXRegressor(
    *,
    auto_cat_features: bool = True,
    nan_fill: NanFill = None,
    early_stopping: EarlyStopping = None,
    **catboost_params: Any,
)

Bases: _CBXMixin, CatBoostRegressor

CBX-enhanced CatBoostRegressor. See SPEC.md §Module 4.