Validation
`catboost_utils.validation` runs cheap pre-flight checks against your training data and returns a structured report you can inspect or assert on.
```python
from catboost_utils import validate

report = validate(
    X, y,
    cat_features=["city"],
    eval_set=(X_val, y_val),
    model_params={"task_type": "GPU"},
)

report.ok        # bool
report.issues    # blocking
report.warnings  # non-blocking
report.raise_if_failed()
```
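One way to act on the report is to gate training on it. The sketch below is illustrative only: `summarize_report` and `gate_training` are hypothetical helpers, not part of `catboost_utils`, and they rely only on the `ok`, `issues`, and `warnings` attributes shown above. Stand-in objects replace a real report so the snippet runs without the library installed, and the choice of `ValueError` is an assumption.

```python
from types import SimpleNamespace


def summarize_report(report) -> str:
    """Build a short text summary from any object exposing ok/issues/warnings."""
    lines = [f"ok={report.ok}"]
    lines += [f"ISSUE: {issue}" for issue in report.issues]
    lines += [f"WARN: {warning}" for warning in report.warnings]
    return "\n".join(lines)


def gate_training(report, train):
    """Call train() only when the report has no blocking issues."""
    if not report.ok:
        # Hypothetical error type; the real raise_if_failed() may differ.
        raise ValueError("pre-flight validation failed:\n" + summarize_report(report))
    return train()


# Stand-ins for real reports (assumption: issues and warnings are iterable).
clean = SimpleNamespace(ok=True, issues=[], warnings=["constant column 'country'"])
broken = SimpleNamespace(ok=False, issues=["target contains NaN"], warnings=[])

result = gate_training(clean, train=lambda: "trained")
```

With a real report, `report.raise_if_failed()` is presumably the shorter route; a wrapper like this is mainly useful when you also want to log warnings before training starts.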
Blocking issues
- empty DataFrame
- target with NaN or only one unique value
- categorical features with float dtype
- NaN in object-dtype columns not declared as cat
- inf / -inf in numerical columns
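To make the list above concrete, here is a minimal pandas sketch of what such checks could look like. It is not the library's implementation: `find_blocking_issues` is a hypothetical name, the messages are invented, and `cat_features` is assumed to hold column names only.

```python
import numpy as np
import pandas as pd
from pandas.api import types as ptypes


def find_blocking_issues(X: pd.DataFrame, y: pd.Series, cat_features: list) -> list:
    """Return human-readable messages for the blocking conditions listed above."""
    issues = []
    if X.empty:
        issues.append("empty DataFrame")
    if y.isna().any():
        issues.append("target contains NaN")
    if y.nunique(dropna=True) <= 1:
        issues.append("target has only one unique value")
    for col in X.columns:
        series = X[col]
        if col in cat_features:
            # Declared categorical features must not be float-typed.
            if ptypes.is_float_dtype(series):
                issues.append(f"categorical feature {col!r} has float dtype")
        elif ptypes.is_object_dtype(series):
            # Object columns not declared as categorical must not contain NaN.
            if series.isna().any():
                issues.append(f"NaN in undeclared object column {col!r}")
        elif ptypes.is_numeric_dtype(series):
            if np.isinf(series.astype("float64")).any():
                issues.append(f"inf in numerical column {col!r}")
    return issues


X_bad = pd.DataFrame({
    "city": [1.0, 2.0, 1.0],                            # declared categorical, but float
    "note": pd.Series(["a", None, "b"], dtype=object),  # object dtype with a missing value
    "price": [10.0, np.inf, 12.0],                      # contains inf
})
y_bad = pd.Series([0, 1, np.nan])
issues = find_blocking_issues(X_bad, y_bad, cat_features=["city"])
```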
Warnings
- undeclared object/category/string/bool columns
- low-cardinality int columns (likely categorical)
- columns with a NaN ratio above 50%
- constant columns
- datetime / timedelta dtype
- column mismatch between train and eval
- both `class_weights` and `auto_class_weights` set
- `task_type='GPU'` or `thread_count > 1` (non-bitwise reproducibility)
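The last two warnings depend only on `model_params`, so they can be illustrated without any data. The following is a rough sketch under stated assumptions: `param_warnings` is a hypothetical helper, the message wording is invented, and a missing `thread_count` is treated as "not set" rather than resolved to CatBoost's default.

```python
from typing import Any, Dict, List, Optional


def param_warnings(model_params: Optional[Dict[str, Any]]) -> List[str]:
    """Flag parameter combinations matching the two params-based warnings above."""
    params = model_params or {}
    warnings: List[str] = []

    # Both weighting options supplied at once.
    if params.get("class_weights") is not None and params.get("auto_class_weights") is not None:
        warnings.append("both class_weights and auto_class_weights are set")

    # GPU training or multi-threading can make runs non-bitwise reproducible.
    thread_count = params.get("thread_count")
    multi_threaded = isinstance(thread_count, int) and thread_count > 1
    if params.get("task_type") == "GPU" or multi_threaded:
        warnings.append("task_type='GPU' or thread_count > 1: non-bitwise reproducibility")

    return warnings


gpu_warnings = param_warnings({"task_type": "GPU"})
weight_warnings = param_warnings({"class_weights": [1, 5], "auto_class_weights": "Balanced"})
```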
catboost_utils.validation.runner.validate

```python
validate(
    X: DataFrame,
    y: Series | None = None,
    *,
    cat_features: list[str | int] | None = None,
    eval_set: tuple[DataFrame, Any] | None = None,
    model_params: dict[str, Any] | None = None,
) -> ValidationReport
```
Run pre-flight checks against a training DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `X` | `DataFrame` | Feature DataFrame (a … | required |
| `y` | `Series \| None` | Target series. Optional; when … | `None` |
| `cat_features` | `list[str \| int] \| None` | List of categorical column names or positional indices. | `None` |
| `eval_set` | `tuple[DataFrame, Any] \| None` | Optional … | `None` |
| `model_params` | `dict[str, Any] \| None` | Optional dict of CatBoost params for cross-cutting warnings (… | `None` |
Returns:
| Type | Description |
|---|---|
| `ValidationReport` | |
Raises:
| Type | Description |
|---|---|
| `NotImplementedError` | when … |
| `TypeError` | when … |
catboost_utils.validation.models.ValidationReport
Bases: `BaseModel`
Full result of `validate()`.
catboost_utils.validation.models.ValidationIssue
Bases: BaseModel
Blocking problem — training will (almost certainly) fail.
catboost_utils.validation.models.ValidationWarning
Bases: BaseModel
Non-blocking concern — training proceeds but the user should be aware.
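The fields of these models are not listed here, so the following is only a rough mental model of how the three classes might relate, written with standard-library dataclasses instead of the `BaseModel` base the real classes use. Every name in it (`Finding`, `Report`, `check`, `message`) is hypothetical, as are the assumptions that `ok` is derived from the absence of issues and that `raise_if_failed()` raises `ValueError`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Finding:
    """Hypothetical stand-in for a single issue or warning."""
    check: str    # assumed: short identifier of the check that fired
    message: str  # assumed: human-readable explanation


@dataclass
class Report:
    """Hypothetical stand-in for the full validation result."""
    issues: List[Finding] = field(default_factory=list)    # blocking
    warnings: List[Finding] = field(default_factory=list)  # non-blocking

    @property
    def ok(self) -> bool:
        # Assumption: a report passes when it has no blocking issues.
        return not self.issues

    def raise_if_failed(self) -> None:
        # Assumption: the exception type is ValueError.
        if not self.ok:
            details = "; ".join(f"{f.check}: {f.message}" for f in self.issues)
            raise ValueError(f"validation failed: {details}")


passing = Report(warnings=[Finding("constant_column", "column 'country' is constant")])
failing = Report(issues=[Finding("target_nan", "target contains NaN")])
```

Separating blocking findings from non-blocking ones lets callers fail fast on the former while still surfacing the latter in logs.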