Categoricals treated differently in Logistic Regression
I noticed this recently: in logistic regression models, a categorical feature with two levels shows only one level in the coefficients table, whereas a categorical feature with more than two levels shows all of its levels. Any idea why?
Just heard back from a data scientist at DR: DR's logistic regression follows the way Sklearn one-hot encodes categorical variables. For a binary categorical variable, one level is dropped, while for variables with more than two levels, all levels are kept.