
DataRobot cannot recognize Summarized Categorical from CSV

NiCd Battery


Dear community!


I'm playing around with summarized categorical features this time. When I upload my CSV file through the DataRobot UI, it doesn't recognize the column I've prepared as Summarized Categorical.


According to documentation:

The following is an example of a valid summarized categorical column:

{"Book1": 100, "Book2": 13}


My file sample looks like this:


86,b159390580d2385a663fb8f9b69de286,"{""test1"": 12, ""test2"": 276}"
358,f0655b4329ba69ea1494f2d71d5f86ab,"{""test1"": 34, ""test2"": 2}"
173,55dd2abf26a8e20059430cf5de0dfd2f,"{""test1"": 192, ""test2"": 0}"
83,0414a41c5978220fd65386ef9039a18b,"{""test1"": 224, ""test2"": 109}"
348,8bcd8d8bf276b943f864ff33a480c790,"{""test1"": 343, ""test2"": 333}"
34,4c08022a53a879254e1a233033807282,"{""test1"": 65, ""test2"": 19}"
316,8733faee82efb74740941aec5f1eccac,"{""test1"": 11, ""test2"": 280}"
215,6364ccddefbb437c63427dfb00a350b7,"{""test1"": 23, ""test2"": 233}"
218,f866a8201785ee1d46811c87e9bd28ea,"{""test1"": 12, ""test2"": 276}"
51,772f0dc5fad949abd9747e299ffad24c,"{""test1"": 98, ""test2"": 777}"
50,538690be409441ee431e54440f052748,"{""test1"": 12, ""test2"": 276}"



My best guess is that it happens because of the quote character escaping, but I have to escape the quotes to make this CSV parsable. Also, if you look at the raw data in the DataRobot UI, it looks like valid JSON.
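For what it's worth, the escaping itself can be checked outside the UI. Below is a minimal sketch using only Python's standard `csv` and `json` modules on two rows copied from the sample above. It only shows that the doubled quotes are ordinary CSV escaping and that the third field decodes to valid JSON; it says nothing about how DataRobot parses the file internally.

```python
import csv
import io
import json

# Two rows copied from the sample above. Doubled quotes ("") inside a quoted
# field are the standard CSV way to escape a literal double quote.
raw = (
    '86,b159390580d2385a663fb8f9b69de286,"{""test1"": 12, ""test2"": 276}"\n'
    '173,55dd2abf26a8e20059430cf5de0dfd2f,"{""test1"": 192, ""test2"": 0}"\n'
)

# csv.reader undoes the escaping, so the third field is plain JSON text.
rows = list(csv.reader(io.StringIO(raw)))

# json.loads raises ValueError if a field is not valid JSON.
parsed = [json.loads(row[2]) for row in rows]
print(parsed)
```

Both rows parse cleanly, so the quoting does not appear to be the problem by itself.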



Do you have any ideas about what could be wrong?


3 Replies
Data Scientist

According to the documentation, every value in a summarized categorical must be numeric and greater than zero. Your CSV contains zero values. Your format was correct; you just included some invalid rows, which causes the feature type to fall back to categorical.



Here is the code that I used to clean and test:


df[[' 0' not in s for s in df.SumCat]].to_csv('community_q&a_5-7-21_v2.csv', index=False)
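The one-liner above relies on a substring match, so it would also drop a row containing a value such as 0.5. A slightly more defensive variant could parse each cell as JSON and check the values directly. This is only a sketch: `has_only_positive_values` is a hypothetical helper name, the third sample cell is made up for illustration, and the rule it encodes (every value is a number greater than zero) is taken from the replies in this thread rather than from a full reading of the documentation.

```python
import json


def has_only_positive_values(cell):
    """Return True if cell is a JSON object whose values are all numbers > 0."""
    try:
        obj = json.loads(cell)
    except (TypeError, ValueError):
        # Not a string, or not valid JSON (e.g. a missing value).
        return False
    if not isinstance(obj, dict):
        return False
    return all(
        isinstance(v, (int, float)) and not isinstance(v, bool) and v > 0
        for v in obj.values()
    )


# The first two cells come from the sample in the question;
# the third is a made-up example with a fractional value.
cells = [
    '{"test1": 12, "test2": 276}',
    '{"test1": 192, "test2": 0}',
    '{"test1": 0.5}',
]
flags = [has_only_positive_values(c) for c in cells]
print(flags)
```

With pandas, and assuming the column is named SumCat as in the one-liner above, the same check could be applied as `df[df.SumCat.map(has_only_positive_values)]` before calling `to_csv`.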

Thanks a lot! I missed this part

DataRobot Employee

The issue is the value below: summarized categorical only allows positive integers or floats, so zero is invalid.


"{""test1"": 192, ""test2"": 0}"

