'super' object has no attribute '__sklearn_tags__'
Resolving 'super' object has no attribute '__sklearn_tags__' in Scikit-learn and XGBoost

Understand and fix the common 'super' object attribute error when integrating custom estimators or using XGBoost with Scikit-learn pipelines.
When working with machine learning models in Python, especially when combining Scikit-learn and XGBoost, you might encounter the error AttributeError: 'super' object has no attribute '__sklearn_tags__'. It is raised when Scikit-learn asks an estimator for its tags and the object it receives (often a custom estimator, a wrapper, or an XGBoost model from a release that predates the current tag API) does not provide the method Scikit-learn expects. This article explains the root causes of the error and gives practical fixes.
Understanding the '__sklearn_tags__' Method
Scikit-learn uses a system of 'tags' to describe the capabilities of an estimator, such as whether it accepts missing values, sparse data, or multi-output targets. Since Scikit-learn 1.6, tags are returned by a __sklearn_tags__() method, which replaces the older _get_tags() and _more_tags() methods. BaseEstimator defines the default tags, and mixins such as ClassifierMixin and TransformerMixin refine them by calling super().__sklearn_tags__(). If no class after the mixin in the method resolution order (MRO) defines that method, the super() lookup fails and Python raises the AttributeError. This usually surfaces when an estimator that does not follow the current conventions is used where Scikit-learn inspects tags, such as within a Pipeline, GridSearchCV, or other meta-estimators.
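The 'super' part of the message comes from Python's cooperative inheritance, and the failure can be reproduced without Scikit-learn at all. The sketch below uses hypothetical stand-in classes (`Base`, `TagMixin`, and a `tags()` method; none of these are Scikit-learn names) in which a mixin extends whatever `super()` returns and a base class supplies the default.

```python
class Base:
    """Stand-in for a base class that defines the default tags."""

    def tags(self):
        return {"base": True}


class TagMixin:
    """Stand-in for a mixin that refines the tags of the next class in the MRO."""

    def tags(self):
        result = super().tags()  # needs a later class in the MRO to define tags()
        result["mixin"] = True
        return result


class Good(TagMixin, Base):
    """Mixin first, base last: the super() call reaches Base.tags()."""


class Shadowed(Base, TagMixin):
    """Base first: Base.tags() is found first, so the mixin never contributes."""


class Orphan(TagMixin):
    """No base class at all: super() falls through to object."""


print(Good().tags())      # both the base and the mixin contribute
print(Shadowed().tags())  # only the base contributes

try:
    Orphan().tags()
except AttributeError as exc:
    print(f"AttributeError: {exc}")
```

The last call fails with an AttributeError on the 'super' object, the same shape as the Scikit-learn message. The middle case is worth noting too: putting the base class first does not fail in this toy example, but it silently drops the mixin's contribution.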
flowchart TD
    A["Scikit-learn Meta-Estimator (e.g., Pipeline)"] --> B{Check Estimator Compatibility}
    B --> C{"Does Estimator have '__sklearn_tags__'?"}
    C -- No --> D["AttributeError: 'super' object has no attribute '__sklearn_tags__'"]
    C -- Yes --> E[Proceed with Scikit-learn Operations]
    D -- Solution --> F[Wrap Estimator with BaseEstimator/TransformerMixin]
    D -- Solution --> G[Ensure Proper Inheritance for Custom Estimators]
Flowchart illustrating the cause and potential solutions for the '__sklearn_tags__' error.
Common Scenarios and Solutions
The __sklearn_tags__ error frequently appears in two main scenarios: when using XGBoost models in Scikit-learn pipelines, and when developing custom estimators that don't properly inherit from Scikit-learn's base classes.
Scenario 1: XGBoost Models in Scikit-learn Pipelines
XGBoost models (e.g., XGBClassifier, XGBRegressor) are designed to be compatible with Scikit-learn's API. The error typically appears when the installed XGBoost release predates the tag API introduced in Scikit-learn 1.6, so its estimator classes do not cooperate with __sklearn_tags__(). The most robust fix is to upgrade XGBoost to a recent release; if you cannot upgrade, pinning Scikit-learn to a version below 1.6 is a common workaround.
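Because this scenario comes down to which versions are installed, it helps to look at them before changing any code. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `major_minor` are illustrative, and the 1.6 threshold reflects the Scikit-learn release that introduced `__sklearn_tags__`.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def major_minor(version_string):
    """Parse the leading 'major.minor' of a version string into a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


for name in ("scikit-learn", "xgboost"):
    print(f"{name}: {installed_version(name) or 'not installed'}")

sklearn_version = installed_version("scikit-learn")
if sklearn_version is not None and major_minor(sklearn_version) >= (1, 6):
    print("This scikit-learn uses __sklearn_tags__; check that XGBoost is recent too.")
```

Reading the versions through `importlib.metadata` avoids importing the libraries themselves, so the check still works in an environment where the import is what fails.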
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Generate some dummy data
X, y = make_classification(n_samples=100, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Use the XGBoost estimator as an ordinary pipeline step.
# use_label_encoder is omitted: recent XGBoost releases no longer use it.
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('xgb_model', xgb.XGBClassifier(random_state=42, eval_metric='logloss')),
])

pipeline.fit(X_train, y_train)
score = pipeline.score(X_test, y_test)
print(f"Pipeline score: {score:.4f}")
Correctly integrating XGBClassifier into a Scikit-learn Pipeline.
Note: On older XGBoost 1.x releases, passing use_label_encoder=False and specifying an eval_metric to XGBClassifier suppressed deprecation warnings. Recent releases no longer use use_label_encoder, so you can omit it there.
Scenario 2: Custom Estimators Lacking Scikit-learn API Compliance
If you're building your own estimator (e.g., a custom transformer or a custom model), it must adhere to Scikit-learn's estimator API. In practice this means inheriting from sklearn.base.BaseEstimator together with the matching mixin: sklearn.base.TransformerMixin for transformers, or sklearn.base.ClassifierMixin / sklearn.base.RegressorMixin for classifiers and regressors. BaseEstimator provides get_params, set_params, and the default __sklearn_tags__(); the mixins add helpers such as fit_transform and score and adjust the tags. You implement fit, plus transform or predict, yourself. List the mixins before BaseEstimator in the class definition so that each mixin's super().__sklearn_tags__() call can reach BaseEstimator.
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline


class CustomFeatureScaler(TransformerMixin, BaseEstimator):
    # Mixins come first and BaseEstimator last, so the tag lookup resolves correctly.
    def __init__(self, scale_factor=1.0):
        self.scale_factor = scale_factor

    def fit(self, X, y=None):
        # In a real scaler, you might compute min/max or mean/std here
        self.is_fitted_ = True
        return self

    def transform(self, X):
        if not hasattr(self, 'is_fitted_'):
            raise RuntimeError("Estimator not fitted. Call fit() first.")
        return np.asarray(X) * self.scale_factor


# BaseEstimator provides get_params(), set_params(), and the default tags,
# so there is no need to define __sklearn_tags__ when inheriting properly.

# Example usage in a pipeline
X, y = make_classification(n_samples=100, n_features=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

pipeline_custom = Pipeline([
    ('custom_scaler', CustomFeatureScaler(scale_factor=2.0)),
    ('classifier', LogisticRegression(random_state=42)),
])

pipeline_custom.fit(X_train, y_train)
score_custom = pipeline_custom.score(X_test, y_test)
print(f"Custom Pipeline score: {score_custom:.4f}")
Example of a custom Scikit-learn compatible transformer inheriting from TransformerMixin and BaseEstimator.
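The same rules apply to custom models. As a rough sketch, the hypothetical `MajorityClassifier` below (a toy written for this illustration, not a Scikit-learn class) combines ClassifierMixin with BaseEstimator, implements only fit and predict, and relies on the mixin for score. The `weights` argument just makes the synthetic classes imbalanced.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.datasets import make_classification
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class MajorityClassifier(ClassifierMixin, BaseEstimator):
    """Hypothetical toy classifier that always predicts the most frequent class."""

    def fit(self, X, y):
        # classes_ is the conventional attribute holding the labels seen in fit
        self.classes_, counts = np.unique(y, return_counts=True)
        self.majority_ = self.classes_[np.argmax(counts)]
        return self

    def predict(self, X):
        # Return the majority label once per input row
        return np.full(len(X), self.majority_)


X, y = make_classification(
    n_samples=100, n_features=5, weights=[0.7, 0.3], random_state=42
)

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", MajorityClassifier()),
])
pipeline.fit(X, y)

# score() comes from ClassifierMixin and reports accuracy
print(f"Majority-class accuracy: {pipeline.score(X, y):.2f}")
```

Nothing tag-related is written by hand here; with the mixin listed before BaseEstimator, the inherited methods are expected to be enough for the pipeline to accept the estimator.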
Note: Custom estimators need correctly implemented fit, predict (for models), transform (for transformers), get_params, and set_params methods. Inheriting from BaseEstimator handles get_params and set_params automatically.
Debugging and Verification
If you're still encountering the error, here are some steps to debug:
- Check Inheritance: Verify that your custom estimator or wrapper class inherits from sklearn.base.BaseEstimator and the relevant mixins, with the mixins listed before BaseEstimator.
- Version Compatibility: Ensure your Scikit-learn and XGBoost versions are compatible. An older release of one library might not support the latest API changes of the other.
- Inspect the Object: Before passing your estimator to a Scikit-learn utility, inspect it. On Scikit-learn 1.6 or later, try hasattr(your_estimator, '__sklearn_tags__') or call your_estimator.__sklearn_tags__() to see whether the method exists and returns the expected tags (older releases used _get_tags() instead). If the method is missing or the call raises, that's your problem.
- Minimal Reproducible Example: Create a minimal code snippet that reproduces the error. This helps isolate the problem and makes it easier to find a solution.
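The inheritance and inspection checks in the list above can be gathered into one small helper. This is only a sketch: `describe_estimator` and `NotAnEstimator` are hypothetical names, and matching on the class name "BaseEstimator" is a heuristic rather than a real isinstance check, chosen so the helper runs even where Scikit-learn cannot be imported.

```python
def describe_estimator(obj):
    """Collect facts that help diagnose a failing __sklearn_tags__ lookup."""
    mro_names = [klass.__name__ for klass in type(obj).__mro__]
    report = {
        "mro": mro_names,
        "has_sklearn_tags": hasattr(obj, "__sklearn_tags__"),
        "has_get_params": hasattr(obj, "get_params"),
        # Heuristic: compares class names only
        "base_estimator_in_mro": "BaseEstimator" in mro_names,
    }

    # Try the call itself: the attribute can exist and still fail inside super().
    tags_method = getattr(obj, "__sklearn_tags__", None)
    if callable(tags_method):
        try:
            tags_method()
            report["tags_call"] = "ok"
        except AttributeError as exc:
            report["tags_call"] = f"failed: {exc}"
    else:
        report["tags_call"] = "not available"
    return report


class NotAnEstimator:
    """Hypothetical object that does not follow the estimator API."""


for key, value in describe_estimator(NotAnEstimator()).items():
    print(f"{key}: {value}")
```

Passing a real estimator instead of `NotAnEstimator()` prints its MRO, which makes a misplaced or missing BaseEstimator easy to spot.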
1. Verify Base Class Inheritance
Ensure your custom estimator class explicitly inherits from sklearn.base.BaseEstimator and appropriate mixins like TransformerMixin or ClassifierMixin.
2. Update Libraries
Update Scikit-learn and XGBoost to their latest stable versions to benefit from bug fixes and improved compatibility.
3. Check the __sklearn_tags__() Implementation
If you're not inheriting from BaseEstimator, you would have to implement __sklearn_tags__() yourself and return a Tags object (on releases before Scikit-learn 1.6, a _get_tags() method returning a dictionary of tags). However, inheriting is the preferred and more robust approach.
4. Test Estimator Independently
Before integrating into a pipeline, test your custom estimator's fit and transform/predict methods independently to ensure they function as expected.
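One way to test an estimator in isolation is a short smoke test that runs outside any pipeline. The helper below is a sketch with a hypothetical name (`smoke_test_transformer`); StandardScaler stands in for your own transformer so that the example is self-contained.

```python
import numpy as np
from sklearn.base import clone
from sklearn.preprocessing import StandardScaler


def smoke_test_transformer(transformer, X):
    """Fit and transform outside any pipeline and return the transformed array."""
    # clone() relies on get_params(), so it fails early if the constructor
    # does not follow the estimator conventions.
    fresh = clone(transformer)
    assert fresh.get_params() == transformer.get_params()

    Xt = fresh.fit(X).transform(X)
    assert Xt.shape[0] == X.shape[0], "transform changed the number of rows"
    return Xt


X = np.arange(12, dtype=float).reshape(6, 2)
Xt = smoke_test_transformer(StandardScaler(), X)
print(Xt.mean(axis=0))
```

For a more thorough check, Scikit-learn also ships sklearn.utils.estimator_checks.check_estimator, which runs the library's own API conformance tests against an estimator instance.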