Boosting specific model

I was wondering if there is a way to build only extreme gradient boosting models. I am trying to compare a DataRobot-tuned XGBoost model to a regular one. The optimal SVM model seems to be overfitting: it always predicts 0 and never predicts the positive class. Could this be due to the imbalance in the data? What do you recommend?

dalilaB
DataRobot Alumni

Where is the regular xgboost?

One thing you can do is an external baseline comparison:

Step 1: For the model you created outside DataRobot, add a Partition feature with two levels (T and H, for instance), where T stands for the training set and H for the holdout (testing) set. Put the model's prediction values in another feature; let's call it xgboost_output.
Step 2: Upload the new dataset to a DataRobot project.
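
Steps 1 and 2 can be prepared outside DataRobot along these lines. This is only a minimal sketch using the Python standard library: the function names, the `holdout_indices` argument, and the output path are assumptions made for illustration, and the column names simply follow the example above. It assumes you already know which rows your external XGBoost model held out and that you have one prediction per row.

```python
import csv


def add_external_columns(rows, predictions, holdout_indices):
    """Return copies of `rows` with 'Partition' and 'xgboost_output' added.

    rows            : list of dicts, one per record
    predictions     : one external-model prediction per row, same order
    holdout_indices : row positions the external model did NOT train on
    """
    if len(rows) != len(predictions):
        raise ValueError("need exactly one prediction per row")
    holdout = set(holdout_indices)
    out = []
    for i, (row, pred) in enumerate(zip(rows, predictions)):
        new_row = dict(row)  # copy so the caller's data is left untouched
        new_row["Partition"] = "H" if i in holdout else "T"
        new_row["xgboost_output"] = pred
        out.append(new_row)
    return out


def write_csv(rows, path):
    """Write the augmented rows to a CSV file ready for upload."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
```

Reusing the same holdout rows that the external model was evaluated on keeps the comparison fair, since neither model is then scored on rows it trained on.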

Step 3: Go to Advanced Options and click External Predictions.

dalilaB_1-1638560242600.png

Then select the feature column that holds your external model's predictions (xgboost_output, for instance).

 

dalilaB_2-1638560271385.png

 

Step 4: Set Partition Features

dalilaB_3-1638560417959.png

Step 5: For the Autopilot mode, choose Manual. The model will be added to the Repository. Run the task.

dalilaB_1-1638562879508.png

 


Step 6: Now you can run Autopilot, in Quick mode for instance. DataRobot will build models on your dataset using your partition.

 
 

 

 

The project will then contain your model alongside all the other models, and you can choose additional models to run.

You stated that the SVM is always predicting 0. What is the percentage of each class in the dataset? If the minority class is less than 20%, try downsampling. Downsampling does help in these conditions and should also improve your results.
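
To check the class percentages and try downsampling on data prepared outside DataRobot, something like the following could work. It is a rough sketch, not DataRobot's own downsampling: the function names and the `ratio` parameter are hypothetical, and it simply keeps every minority row plus a random sample of the majority class.

```python
import random
from collections import Counter


def class_percentages(labels):
    """Return {class: percentage of rows} for a list of labels."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: 100.0 * n / total for cls, n in counts.items()}


def downsample_majority(rows, target, ratio=1.0, seed=0):
    """Keep all non-majority rows and a random sample of majority rows.

    ratio is the number of majority rows kept per non-majority row
    (1.0 gives a balanced two-class dataset).
    """
    counts = Counter(row[target] for row in rows)
    majority_class = counts.most_common(1)[0][0]
    majority_rows = [r for r in rows if r[target] == majority_class]
    other_rows = [r for r in rows if r[target] != majority_class]
    keep_n = min(len(majority_rows), int(len(other_rows) * ratio))
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    result = other_rows + rng.sample(majority_rows, keep_n)
    rng.shuffle(result)
    return result
```

For example, with 90 rows of class 0 and 10 rows of class 1, `class_percentages` reports a 90/10 split, and `downsample_majority(rows, "y")` returns 20 rows with 10 of each class.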

Without access to your data or any insight into it, I can only guess, but if your predictions are all 0, you most likely have an extreme class imbalance.

dalilaB_0-1638561896483.png
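
A small worked example shows why a model that always predicts 0 can still look good on accuracy when the classes are heavily imbalanced. The labels below are made up purely for illustration (95% negatives, 5% positives), and `accuracy_and_recall` is a hypothetical helper, not part of any library.

```python
def accuracy_and_recall(y_true, y_pred, positive=1):
    """Return (accuracy, recall on the positive class)."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(t == p for t, p in pairs)
    true_pos = sum(t == positive and p == positive for t, p in pairs)
    actual_pos = sum(t == positive for t in y_true)
    accuracy = correct / len(y_true)
    recall = true_pos / actual_pos if actual_pos else 0.0
    return accuracy, recall


# Illustrative data only: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
# A model that never predicts the positive class.
y_pred = [0] * 100

acc, rec = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={acc:.2f}, recall={rec:.2f}")  # accuracy=0.95, recall=0.00
```

The accuracy is high only because of the class split, so recall (or another metric that is sensitive to the minority class) is a better signal when comparing models here.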