Hi all -
Did a quick scan of all my projects (264 total - yikes)
- Anything with images didn't have SVM
- Multi-label didn't have SVM (i.e. a row can have more than one class)
- Multi-class didn't have any SVM
- pretty much everything else I checked (but not all!) had at least one SVM in the Repository
I can easily imagine other options affecting which models are used - Feature Constraints (e.g. "Include only monotonic models"), Only Blueprints with Scoring Code, Only models with SHAP.
Please note this is not intended to be an all-inclusive list of reasons SVMs are excluded - some of my projects had hundreds of classes and/or thousands of features. It may be that some other constraint "kicked out" SVMs.
Rather, my point is that many considerations go into exactly which algorithms are included.
@cryptoman I don't believe the dataset itself matters in this case. I tried creating a model with just one numeric feature and another with a 30-feature set, and in both cases I still get SVM and kNN for classification. Something else must be restricting your repository, but I'm not sure what it could be.
There is no such mention in the documentation. That is correct. However, the models made available for a given dataset seem to appear automatically as a fairly short list (which does not include SVM and kNN, in my case). We cannot seem to explore models beyond the shortlist the platform provides automatically.
The platform seems to be optimized to provide the best performance possible and assumes that any model likely to underperform will not be of interest to the user, which I think is where the problem is. If this is the way it operates, it is not as "research friendly" as one would have liked.
Thanks for the response IraWatt. I am working on classification problems at the moment. I also did a keyword search as you have indicated and could not place the models I was looking for.
This seems to be an interesting limitation. We should be able to experiment with any model we like, as long as the input data is fit for purpose and the target variable can be solved for. At the moment, the only solution for me is to look into alternative platforms/solutions that will give me the flexibility to experiment with models DR does not allow me to work with.
Yes, that's right: which models are shown depends on the type of problem you're trying to solve. For instance, in my classification problem I get models designed for classification. What type of problem are you trying to solve?
For classification DR does seem to have KNN and SVM (image showing model->repository->search)
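The same repository search can also be sketched in Python. This is a minimal example, assuming the `datarobot` client's `Project.get_blueprints()` call returns objects with a `model_type` string; the helper name `find_blueprints` and the search keywords are illustrative guesses, not names taken from DR itself.

```python
def find_blueprints(blueprints, keywords):
    """Return blueprints whose model_type contains any keyword (case-insensitive).

    `blueprints` is any list of objects with a `model_type` attribute,
    e.g. the result of dr.Project.get(project_id).get_blueprints().
    """
    lowered = [k.lower() for k in keywords]
    return [
        bp for bp in blueprints
        if any(k in bp.model_type.lower() for k in lowered)
    ]


def fetch_blueprints(project_id):
    # Assumption: the `datarobot` package is installed and credentials are
    # already configured; imported here so the helper above works without it.
    import datarobot as dr
    return dr.Project.get(project_id).get_blueprints()
```

Calling `find_blueprints(fetch_blueprints(project_id), ["support vector", "nearest neighbo"])` and getting an empty list would suggest that SVM and kNN are simply not in that project's repository, rather than hidden by the UI search.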
I have a related question.
Does this mean the majority of the DR models are not even listed, depending on the project data? For example, in my project I would like to try SVM but it is not listed anywhere in the repository. What is the reason for that? I would also like to try a kNN model with my dataset, but that is not listed as an available model either.
Is there a way to access any model available in DR to train/test with our dataset?
Dear @ks0420 ,
I see that @IraWatt already shared some informative replies about the number of models/blueprints that DataRobot supports.
We recently updated the Platform Documentation that describes the kinds of models that DR supports. Take a look at the following direct link to the platform docs:
Snippet from the platform docs:
DataRobot supports Tree-based models, Deep Learning models, Support Vector Machines (SVM), Generalized Linear Models, Anomaly Detection models, Text Mining models, and many more.
Hope this answers your question!
Another thing to note is that new blueprints are dynamically created when running a project, so the actual number of possible models is not really countable.
Hope some of that was useful : )
To count the number of models you could use the API or manually count them in the repository. For instance, using the API you can count the number of default blueprints:
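A minimal sketch of that count, assuming the `datarobot` Python client is installed and configured. `Project.get` and `get_blueprints` are client calls; the helper names `load_project` and `count_default_blueprints` are made up for this example.

```python
def load_project(project_id):
    # Assumption: credentials are already configured for the `datarobot`
    # client (for example through its config file or environment variables).
    import datarobot as dr
    return dr.Project.get(project_id)


def count_default_blueprints(project):
    """Return how many blueprints the project's repository lists."""
    return len(project.get_blueprints())
```

Usage would be `count_default_blueprints(load_project("<your project id>"))`, where the project ID is whatever appears in your own project's URL.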
A slight issue is that this method only lists models for the type of problem the project is addressing. For instance, this project is a classification problem and it returns 78 default blueprints. I would also note that blueprints in DR include pre-processing steps as part of the model, which you might not want to factor in when counting distinct models.
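One rough way to discount the pre-processing variants is to group blueprints by their `model_type` and count the groups. This is only a sketch: it assumes each blueprint object exposes a `model_type` string (as the `datarobot` client's blueprint objects do), and the helper name `distinct_model_types` is hypothetical.

```python
from collections import Counter


def distinct_model_types(blueprints):
    """Tally blueprints by model_type, ignoring differing pre-processing steps.

    Returns a Counter mapping each model_type to the number of blueprints
    that share it; len() of the result is the number of distinct model types.
    """
    return Counter(bp.model_type for bp in blueprints)
```

Two blueprints that differ only in pre-processing then count once, though whether `model_type` is the right notion of a "distinct model" is a judgment call.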
List of models in console:
DR can be used for several different types of problems so I'd guess the number of default blueprints would be 200+. You can also count models in the UI here: