
API Reference

Generated from docstrings. Module descriptions and examples are on separate pages in the "Modules" section.

Top level

catboost_utils

catboost_utils — UX wrapper over CatBoost.

Public API is re-exported here. Sub-modules (errors, validation, objectives, pipeline, logging, explain, io, callbacks) are independent and can be imported individually.

CBXClassifier

CBXClassifier(
    *,
    auto_cat_features: bool = True,
    nan_fill: NanFill = None,
    early_stopping: EarlyStopping = None,
    **catboost_params: Any,
)

Bases: _CBXMixin, CatBoostClassifier

CBX-enhanced CatBoostClassifier. See SPEC.md §Module 4.

CBXError

CBXError(
    *,
    original_error: BaseException,
    human_message: str,
    hint: str,
    feature_name: str | None = None,
    feature_idx: int | None = None,
)

Bases: Exception

A CatBoost error translated into a human-readable form.

Attributes:

Name Type Description
original_error BaseException

the underlying CatBoostError (or other exception)

human_message str

short description in plain English

hint str

actionable advice for the user

feature_name str | None

resolved feature name when feature_idx was extractable

feature_idx int | None

raw feature index from the original error
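
To make the attribute layout above concrete, here is a minimal sketch of an exception that carries the same keyword-only fields. `ReadableError`, the message format, and the sample error text are assumptions made for illustration; this is not the library's implementation.

```python
from __future__ import annotations


class ReadableError(Exception):
    """Hypothetical stand-in that mirrors the CBXError attribute layout."""

    def __init__(
        self,
        *,
        original_error: BaseException,
        human_message: str,
        hint: str,
        feature_name: str | None = None,
        feature_idx: int | None = None,
    ) -> None:
        # Assumption: the rendered message joins the description and the hint.
        super().__init__(f"{human_message} Hint: {hint}")
        self.original_error = original_error
        self.human_message = human_message
        self.hint = hint
        self.feature_name = feature_name
        self.feature_idx = feature_idx


try:
    try:
        # Illustrative low-level failure, not a real CatBoost message.
        raise ValueError("bad value in feature 3")
    except ValueError as exc:
        # Translate the low-level error, keeping the original attached.
        raise ReadableError(
            original_error=exc,
            human_message="Column 'city' holds a value CatBoost cannot treat as categorical.",
            hint="Cast the column to str before fitting.",
            feature_name="city",
            feature_idx=3,
        ) from exc
except ReadableError as err:
    caught = err
```

Chaining with `from exc` keeps the original traceback reachable through `__cause__` as well as through `original_error`.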

CBXRegressor

CBXRegressor(
    *,
    auto_cat_features: bool = True,
    nan_fill: NanFill = None,
    early_stopping: EarlyStopping = None,
    **catboost_params: Any,
)

Bases: _CBXMixin, CatBoostRegressor

CBX-enhanced CatBoostRegressor. See SPEC.md §Module 4.

unwrap

unwrap(model: Any) -> Any

Reverse wrap(): restore the original CatBoost class on the instance.

validate

validate(
    X: DataFrame,
    y: Series | None = None,
    *,
    cat_features: list[str | int] | None = None,
    eval_set: tuple[DataFrame, Any] | None = None,
    model_params: dict[str, Any] | None = None,
) -> ValidationReport

Run pre-flight checks against a training DataFrame.

Parameters:

Name Type Description Default
X DataFrame

feature DataFrame (a catboost.Pool raises NotImplementedError).

required
y Series | None

target series. Optional — when None, target checks are skipped.

None
cat_features list[str | int] | None

list of categorical column names or positional indices.

None
eval_set tuple[DataFrame, Any] | None

optional (X_eval, y_eval) tuple — checks column alignment.

None
model_params dict[str, Any] | None

optional dict of CatBoost params for cross-cutting warnings (class_weights conflict, GPU / multi-thread reproducibility).

None

Returns:

Type Description
ValidationReport

ValidationReport with ok, issues, warnings.

Raises:

Type Description
NotImplementedError

when X is a catboost.Pool.

TypeError

when X is not a DataFrame.
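
The eval_set column-alignment check mentioned above could look roughly like the sketch below. It works on plain lists of column names, and both the helper name `check_alignment` and the exact rules (same names, same order) are assumptions; the reference does not spell out what the library compares.

```python
def check_alignment(train_columns, eval_columns):
    """Return human-readable problems found when comparing two column lists."""
    problems = []
    missing = [c for c in train_columns if c not in eval_columns]
    extra = [c for c in eval_columns if c not in train_columns]
    if missing:
        problems.append(f"eval set is missing columns: {missing}")
    if extra:
        problems.append(f"eval set has unexpected columns: {extra}")
    # Same set of names but a different order is still a mismatch here,
    # because positional cat_features indices would point at other columns.
    if not missing and not extra and list(train_columns) != list(eval_columns):
        problems.append("eval set columns are in a different order")
    return problems


aligned = check_alignment(["age", "city"], ["age", "city"])
reordered = check_alignment(["age", "city"], ["city", "age"])
mismatched = check_alignment(["age", "city"], ["age", "zip"])
```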

wrap

wrap(model: T, *, validate: bool = False) -> T

Enhance an existing CatBoost model with readable errors and feature-name resolution.

Mutates model in place by swapping __class__; returns the same object for chaining. isinstance(model, CatBoostClassifier) continues to work.

Parameters:

Name Type Description Default
model T

a CatBoostClassifier or CatBoostRegressor instance.

required
validate bool

reserved for future use; pre-flight validation hook.

False

Returns:

Type Description
T

The same model instance with enhanced error reporting.
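
The in-place `__class__` swap described above can be illustrated with plain Python classes. `Base`, `Enhanced`, `wrap_sketch`, and `unwrap_sketch` are hypothetical names; a real implementation would also have to choose the enhanced class per model type and remember which class to restore.

```python
class Base:
    """Stand-in for a CatBoost model class."""

    def fit(self):
        return "fitted"


class _EnhancedMixin:
    """Hypothetical mixin that layers extra behaviour over the base class."""

    def describe(self):
        return "enhanced"


class Enhanced(_EnhancedMixin, Base):
    pass


def wrap_sketch(model):
    # Reassigning __class__ changes behaviour without copying the object.
    model.__class__ = Enhanced
    return model


def unwrap_sketch(model):
    # Restore the original class on the same instance.
    model.__class__ = Base
    return model


model = Base()
wrapped = wrap_sketch(model)
```

Because `Enhanced` subclasses `Base`, `isinstance(model, Base)` stays true after the swap, which is the property the docstring highlights for `CatBoostClassifier`.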

Subpackages

catboost_utils.errors

Readable errors for CatBoost. See SPEC.md §Module 1.

CBXError

CBXError(
    *,
    original_error: BaseException,
    human_message: str,
    hint: str,
    feature_name: str | None = None,
    feature_idx: int | None = None,
)

Bases: Exception

A CatBoost error translated into a human-readable form.

Attributes:

Name Type Description
original_error BaseException

the underlying CatBoostError (or other exception)

human_message str

short description in plain English

hint str

actionable advice for the user

feature_name str | None

resolved feature name when feature_idx was extractable

feature_idx int | None

raw feature index from the original error

unwrap

unwrap(model: Any) -> Any

Reverse wrap(): restore the original CatBoost class on the instance.

wrap

wrap(model: T, *, validate: bool = False) -> T

Enhance an existing CatBoost model with readable errors and feature-name resolution.

Mutates model in place by swapping __class__; returns the same object for chaining. isinstance(model, CatBoostClassifier) continues to work.

Parameters:

Name Type Description Default
model T

a CatBoostClassifier or CatBoostRegressor instance.

required
validate bool

reserved for future use; pre-flight validation hook.

False

Returns:

Type Description
T

The same model instance with enhanced error reporting.

catboost_utils.validation

Pre-flight data validation. See SPEC.md §Module 2.

ValidationError

ValidationError(report: ValidationReport)

Bases: Exception

Raised by ValidationReport.raise_if_failed() when blocking issues are present.

ValidationIssue

Bases: BaseModel

Blocking problem — training will (almost certainly) fail.

ValidationReport

Bases: BaseModel

Full result of validate().

raise_if_failed

raise_if_failed() -> None

Raise ValidationError if there are any blocking issues.
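
The relationship between a report and raise_if_failed() might be sketched as follows. This uses stdlib dataclasses rather than the BaseModel base listed above, and `Report`, `ReportFailed`, and the rule that `ok` means "no blocking issues" are assumptions for illustration.

```python
from dataclasses import dataclass, field


class ReportFailed(Exception):
    """Hypothetical counterpart of ValidationError; carries the report."""

    def __init__(self, report):
        super().__init__(
            f"{len(report.issues)} blocking issue(s): " + "; ".join(report.issues)
        )
        self.report = report


@dataclass
class Report:
    """Hypothetical stand-in for a validation report."""

    issues: list = field(default_factory=list)    # blocking problems
    warnings: list = field(default_factory=list)  # non-blocking concerns

    @property
    def ok(self):
        # Assumption: warnings alone do not make the report fail.
        return not self.issues

    def raise_if_failed(self):
        if self.issues:
            raise ReportFailed(self)


clean = Report(warnings=["thread_count > 1 may affect reproducibility"])
broken = Report(issues=["target contains NaN"])
```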

ValidationWarning

Bases: BaseModel

Non-blocking concern — training proceeds but the user should be aware.

validate

validate(
    X: DataFrame,
    y: Series | None = None,
    *,
    cat_features: list[str | int] | None = None,
    eval_set: tuple[DataFrame, Any] | None = None,
    model_params: dict[str, Any] | None = None,
) -> ValidationReport

Run pre-flight checks against a training DataFrame.

Parameters:

Name Type Description Default
X DataFrame

feature DataFrame (a catboost.Pool raises NotImplementedError).

required
y Series | None

target series. Optional — when None, target checks are skipped.

None
cat_features list[str | int] | None

list of categorical column names or positional indices.

None
eval_set tuple[DataFrame, Any] | None

optional (X_eval, y_eval) tuple — checks column alignment.

None
model_params dict[str, Any] | None

optional dict of CatBoost params for cross-cutting warnings (class_weights conflict, GPU / multi-thread reproducibility).

None

Returns:

Type Description
ValidationReport

ValidationReport with ok, issues, warnings.

Raises:

Type Description
NotImplementedError

when X is a catboost.Pool.

TypeError

when X is not a DataFrame.

catboost_utils.objectives

Custom loss / metric decorators. See SPEC.md §Module 3.

metric

metric(
    *, task: TaskType, name: str, higher_is_better: bool
) -> Callable[[Callable[..., float]], Any]

Wrap a user function as a CatBoost custom evaluation metric.

objective

objective(
    *, task: TaskType
) -> Callable[[Callable[..., Any]], Any]

Wrap a user function as a CatBoost custom objective.
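
As a rough sketch of what a metric decorator has to produce: CatBoost expects a custom metric object exposing `is_max_optimal`, `evaluate`, and `get_final_error`. The factory below builds such an object from a plain function. `metric_sketch` and the assumed user-function signature `fn(y_true, y_pred) -> float` are hypothetical, and the `task` parameter is left out because its effect is not described here.

```python
from typing import Callable, Sequence


def metric_sketch(*, name: str, higher_is_better: bool):
    """Hypothetical decorator factory; not the library's implementation."""

    def decorator(fn: Callable[[Sequence[float], Sequence[float]], float]):
        class _Metric:
            def is_max_optimal(self) -> bool:
                # Tells CatBoost whether larger values are better.
                return higher_is_better

            def evaluate(self, approxes, target, weight):
                # CatBoost passes one container of raw predictions per dimension;
                # this sketch handles the single-dimension case only.
                preds = list(approxes[0])
                value = fn(list(target), preds)
                # Return (error_sum, weight_sum); a weight of 1.0 makes the
                # final error equal to the function's value.
                return value, 1.0

            def get_final_error(self, error, weight):
                return error / weight if weight else 0.0

        instance = _Metric()
        instance.name = name  # kept only for display in this sketch
        return instance

    return decorator


@metric_sketch(name="MAE", higher_is_better=False)
def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
```

After decoration, `mae` is the metric object rather than the original function, which matches the `Any` return type in the signature above.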

catboost_utils.pipeline

Drop-in CBXClassifier/CBXRegressor with sklearn compatibility. See SPEC.md §Module 4.

NOTE: stub — full implementation lands in a follow-up commit.

CBXClassifier

CBXClassifier(
    *,
    auto_cat_features: bool = True,
    nan_fill: NanFill = None,
    early_stopping: EarlyStopping = None,
    **catboost_params: Any,
)

Bases: _CBXMixin, CatBoostClassifier

CBX-enhanced CatBoostClassifier. See SPEC.md §Module 4.

CBXRegressor

CBXRegressor(
    *,
    auto_cat_features: bool = True,
    nan_fill: NanFill = None,
    early_stopping: EarlyStopping = None,
    **catboost_params: Any,
)

Bases: _CBXMixin, CatBoostRegressor

CBX-enhanced CatBoostRegressor. See SPEC.md §Module 4.

catboost_utils.logging

Structured logging for CatBoost training output. See SPEC.md §Module 5.

attach

attach(model: Any) -> None

Attach a logging-backed stream to model for subsequent fit() calls.

log_cout / log_cerr are fit-time parameters in CatBoost (not init params); we monkey-patch the bound fit method to inject them. Idempotent: a second attach() call replaces the previous streams.

Skips (with a warning) when the user already set verbose / logging_level to avoid duplicating output to stdout.
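
The monkey-patching approach described above can be sketched with a dummy model. `DummyModel`, `attach_sketch`, and the `__wrapped_fit__` marker attribute are hypothetical; only the `log_cout` / `log_cerr` fit-time parameters come from CatBoost itself.

```python
import io


class DummyModel:
    """Stand-in for a CatBoost model; it only records the streams it receives."""

    def fit(self, X, y=None, log_cout=None, log_cerr=None):
        self.seen = (log_cout, log_cerr)
        return self


def attach_sketch(model, out_stream, err_stream):
    # Recover the unpatched method so that a second call replaces the streams
    # instead of stacking one wrapper on top of another.
    original = getattr(model.fit, "__wrapped_fit__", model.fit)

    def fit(*args, **kwargs):
        # Inject the streams unless the caller passed their own.
        kwargs.setdefault("log_cout", out_stream)
        kwargs.setdefault("log_cerr", err_stream)
        return original(*args, **kwargs)

    fit.__wrapped_fit__ = original
    model.fit = fit  # the instance attribute shadows the class method


model = DummyModel()
first_out, first_err = io.StringIO(), io.StringIO()
attach_sketch(model, first_out, first_err)
model.fit([[1.0]])
```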

format_iteration_record

format_iteration_record(rec: IterationRecord) -> str

Format a parsed record as key=value pairs (insertion order).

parse_iteration_line

parse_iteration_line(line: str) -> IterationRecord | None

Try to parse one CatBoost iteration line. Returns None if not a match.
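
A minimal sketch of the parse-then-format round trip, assuming the usual tab-separated shape of CatBoost iteration output (`0: learn: ... test: ... best: ... (0) total: ... remaining: ...`). The regular expressions, the dict used in place of IterationRecord, and the function names are assumptions.

```python
import re

# "<iteration>:" followed by whitespace-separated "key: value" pairs.
_ITER_RE = re.compile(r"^\s*(?P<iteration>\d+):\s+(?P<rest>.+)$")
_PAIR_RE = re.compile(r"(?P<key>[A-Za-z_]\w*):\s+(?P<value>\S+)")


def parse_line_sketch(line):
    """Return a dict for an iteration line, or None when the line does not match."""
    match = _ITER_RE.match(line)
    if match is None:
        return None
    record = {"iteration": int(match.group("iteration"))}
    for pair in _PAIR_RE.finditer(match.group("rest")):
        key, raw = pair.group("key"), pair.group("value")
        try:
            record[key] = float(raw)
        except ValueError:
            record[key] = raw  # durations such as "51.5ms" stay as text
    return record


def format_record_sketch(record):
    """Render key=value pairs in insertion order."""
    return " ".join(f"{key}={value}" for key, value in record.items())


sample = "0:\tlearn: 0.6569860\ttest: 0.6573568\tbest: 0.6573568 (0)\ttotal: 51.5ms\tremaining: 51.4s"
record = parse_line_sketch(sample)
rendered = format_record_sketch(record)
```

Lines that do not start with an iteration number, such as the learning-rate banner, yield None and can be passed through unchanged.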

setup_logging

setup_logging(
    level: int = stdlib_logging.INFO,
    *,
    structured: bool = False,
    stream: TextIO | None = None,
) -> stdlib_logging.Logger

Configure the catboost_utils.training logger.

Parameters:

Name Type Description Default
level int

standard logging level.

INFO
structured bool

when True, every record is rendered as a single JSON object.

False
stream TextIO | None

where to send output (defaults to sys.stderr).

None

Returns:

Type Description
Logger

The configured logger. Idempotent — calling twice replaces the existing handler.
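
The idempotent, optionally structured setup described above might be built on the standard library as sketched below. The logger name `sketch.training`, the JSON field names, and the plain-text format string are assumptions, not the library's actual output.

```python
import io
import json
import logging
import sys


class _JsonFormatter(logging.Formatter):
    """Render every record as a single JSON object on one line."""

    def format(self, record):
        return json.dumps(
            {
                "level": record.levelname,
                "logger": record.name,
                "message": record.getMessage(),
            }
        )


def setup_logging_sketch(level=logging.INFO, *, structured=False, stream=None):
    logger = logging.getLogger("sketch.training")  # hypothetical logger name
    logger.setLevel(level)
    # Drop handlers from earlier calls so that repeated setup does not
    # duplicate every line.
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(
        _JsonFormatter() if structured else logging.Formatter("%(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    logger.propagate = False  # keep output out of the root logger
    return logger


buffer = io.StringIO()
logger = setup_logging_sketch(structured=True, stream=buffer)
logger = setup_logging_sketch(structured=True, stream=buffer)  # second call replaces the handler
logger.info("iteration=0 learn=0.65")
```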

catboost_utils.explain

Explanation helpers: feature_importance, shap_values, check_early_stopping.

check_early_stopping

check_early_stopping(
    model: Any, eval_set: Any | None = None
) -> None

Validate early-stopping configuration before fit().

Parameters:

Name Type Description Default
model Any

a CatBoost / CBX model instance.

required
eval_set Any | None

the eval_set you intend to pass to fit().

None

Raises:

Type Description
CBXError

when any early-stopping parameter is set but eval_set is missing.
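
The rule stated above (early-stopping parameters without an eval_set are an error) can be sketched on a plain parameter dict. Which parameters the library inspects is not documented here; the tuple below lists CatBoost's overfitting-detector options as an assumption, and ValueError stands in for CBXError.

```python
# Assumed set of parameters that only make sense with an eval_set.
EARLY_STOPPING_KEYS = ("early_stopping_rounds", "od_type", "od_wait", "od_pval")


def check_early_stopping_sketch(params, eval_set=None):
    """Raise when early stopping is configured but no eval_set is supplied."""
    configured = [key for key in EARLY_STOPPING_KEYS if params.get(key) is not None]
    if configured and eval_set is None:
        raise ValueError(
            f"{', '.join(configured)} set but no eval_set was provided; "
            "pass eval_set=(X_val, y_val) to fit() or remove the parameter."
        )


# No early-stopping parameters: nothing to check.
check_early_stopping_sketch({"iterations": 500})
# Early stopping with an eval_set: fine.
check_early_stopping_sketch({"early_stopping_rounds": 50}, eval_set=([[1.0]], [0]))
```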

feature_importance

feature_importance(
    model: Any,
    data: Any | None = None,
    *,
    type: ImportanceType = "PredictionValuesChange",
    prettified: bool = True,
) -> pd.DataFrame | npt.NDArray[Any]

Return feature importances with feature names attached.

Parameters:

Name Type Description Default
model Any

a fitted CatBoost model (or CBXClassifier/CBXRegressor).

required
data Any | None

pd.DataFrame, catboost.Pool, or None (for types that don't require it). "ShapValues", "LossFunctionChange", "Interaction" require non-None data and will raise CBXError if missing.

None
type ImportanceType

CatBoost importance type.

'PredictionValuesChange'
prettified bool

when True (default), return a sorted pd.DataFrame with columns ["feature", "importance"]. When False, return the raw ndarray from CatBoost.

True

Returns:

Type Description
DataFrame | NDArray[Any]

DataFrame or ndarray depending on prettified.

Raises:

Type Description
CBXError

when data is required but not provided, or when the underlying CatBoost call fails for a reason we recognize.
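
To illustrate what the prettified form adds over the raw array, the sketch below pairs names with importances and sorts them. It uses plain lists instead of a pd.DataFrame to stay dependency-free, and the descending sort order is an assumption.

```python
def prettify_importances(feature_names, importances):
    """Pair each importance with its feature name and sort, largest first."""
    if len(feature_names) != len(importances):
        raise ValueError("feature_names and importances differ in length")
    rows = [
        {"feature": name, "importance": float(value)}
        for name, value in zip(feature_names, importances)
    ]
    rows.sort(key=lambda row: row["importance"], reverse=True)
    return rows


# Illustrative numbers, not output from a real model.
table = prettify_importances(["age", "city", "income"], [12.5, 61.0, 26.5])
```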

shap_values

shap_values(model: Any, data: Any) -> pd.DataFrame

Compute SHAP values and return them as a named DataFrame.

Parameters:

Name Type Description Default
model Any

a fitted CatBoost / CBX model.

required
data Any

pd.DataFrame or catboost.Pool — the rows to explain.

required

Returns:

Type Description
DataFrame

DataFrame with one column per feature plus expected_value (the model bias / base value) as the last column. Rows correspond to the input rows.

Raises:

Type Description
CBXError

if data is None, if the model is multiclass (not supported in v0.1), or if CatBoost reports the well-known "non-zero approx for zero-weight leaf" misconfiguration.
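
For non-multiclass models, CatBoost's raw ShapValues output has one column per feature followed by the expected value. The sketch below shows how such rows could be labelled to match the layout described above; it uses lists of dicts instead of a DataFrame, and `label_shap_rows` and the numbers are illustrative assumptions.

```python
def label_shap_rows(raw_rows, feature_names):
    """Attach column names to raw rows shaped (n_features + 1)."""
    columns = list(feature_names) + ["expected_value"]
    labelled = []
    for row in raw_rows:
        if len(row) != len(columns):
            raise ValueError(f"expected {len(columns)} values per row, got {len(row)}")
        labelled.append(dict(zip(columns, row)))
    return labelled


# Illustrative values: two rows, two features, base value 1.0.
raw = [[0.25, -0.5, 1.0], [0.0, 0.75, 1.0]]
rows = label_shap_rows(raw, ["age", "income"])
# Contributions plus the base value add up to the raw model output per row.
raw_predictions = [sum(row.values()) for row in rows]
```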

catboost_utils.io

Lossless save/load with metadata sidecar. See SPEC.md §Module 7.

load

load(
    path: str | PathLike[str], *, format: str = "cbm"
) -> CatBoost

Load a model previously saved via save().

Parameters:

Name Type Description Default
path str | PathLike[str]

model file path.

required
format str

passed to CatBoost's load_model.

'cbm'

Returns:

Type Description
CatBoost

A fitted CatBoostClassifier / CatBoostRegressor / CatBoost instance with metadata reattached when a sidecar is present.

save

save(
    model: Any,
    path: str | PathLike[str],
    *,
    format: str = "cbm",
) -> None

Save model to path and write a metadata sidecar.

Parameters:

Name Type Description Default
model Any

a fitted CatBoost / CBX model.

required
path str | PathLike[str]

target file path.

required
format str

forwarded to model.save_model(format=...): "cbm" (default), "json", "onnx", etc.

'cbm'
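
The sidecar idea behind save() and load() can be sketched without CatBoost: write the model bytes to the target path and the metadata to a JSON file next to it, then reattach the metadata on load when that file exists. The `.meta.json` suffix, the metadata keys, and the function names are assumptions; the reference does not specify the sidecar's name or contents.

```python
import json
import tempfile
from pathlib import Path


def sidecar_path(path):
    # Hypothetical naming scheme: "<model file>.meta.json" next to the model.
    p = Path(path)
    return p.with_name(p.name + ".meta.json")


def save_sketch(payload: bytes, path, metadata: dict) -> None:
    """Write the model bytes and a JSON sidecar describing them."""
    Path(path).write_bytes(payload)
    sidecar_path(path).write_text(json.dumps(metadata, indent=2), encoding="utf-8")


def load_sketch(path):
    """Return (payload, metadata); metadata is None when no sidecar exists."""
    payload = Path(path).read_bytes()
    side = sidecar_path(path)
    metadata = json.loads(side.read_text(encoding="utf-8")) if side.exists() else None
    return payload, metadata


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "model.cbm"
    save_sketch(b"\x00model-bytes", target, {"feature_names": ["age", "city"]})
    payload, metadata = load_sketch(target)

    bare = Path(tmp) / "bare.cbm"
    bare.write_bytes(b"\x01")
    _, missing_metadata = load_sketch(bare)
```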

catboost_utils.callbacks

Safe callback wrapper. See SPEC.md §Module 8.