My OTV project includes multiple datasets. After running Autopilot, I see that I have many more features (looking at the Informative Features list) and that some are named "entropy", like "(60 days entropy)".
What are these ‘entropy’ feature(s) that DataRobot derived? How should I use them?
It's a measure of how messy and unpredictable a set of values is (e.g. how diverse a categorical column is). If all values were the same, entropy would be 0; if all values were different from each other, entropy would be at its maximum (1 on a normalized 0-to-1 scale). It's therefore a measure of unpredictability, or lack of order, in the data: the lower the entropy, the more predictable the values. It can be applied to categoricals, where there's no ordering defined.
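To make the 0-to-1 intuition concrete, here is a minimal Python sketch of a normalized Shannon entropy over a list of categorical values. This is only an illustration: the function name `normalized_entropy` is made up for this example, and dividing by the log of the number of distinct values is an assumption about the normalization, not necessarily the exact formula DataRobot uses for its derived features.

```python
import math
from collections import Counter

def normalized_entropy(values):
    """Shannon entropy of the value frequencies, scaled to the 0..1 range.

    Hypothetical helper for illustration only; not DataRobot's implementation.
    """
    counts = Counter(values)
    n = len(values)
    k = len(counts)  # number of distinct values
    if k <= 1:
        # Empty input or a single repeated value: nothing unpredictable.
        return 0.0
    # Shannon entropy: -sum(p * log(p)) over the observed frequencies.
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    # Assumption: normalize by the maximum entropy for k distinct values.
    return h / math.log(k)

same = normalized_entropy(["red", "red", "red", "red"])          # all identical
mixed = normalized_entropy(["red", "red", "red", "blue"])        # mostly one value
distinct = normalized_entropy(["red", "blue", "green", "gold"])  # all different
print(same, mixed, distinct)
```

In a feature such as "(60 days entropy)", the list of values would presumably be whatever was observed in the 60-day window leading up to each row.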
Note that at this point you can use them like any other feature - but since DataRobot derived them, you will not need to provide them yourself should you choose to deploy a model that leverages them. DataRobot will derive them during a scoring request, as long as you provide the input features they are built from.
@vyas.adhikari @doyouevendata appreciate your help