Group Partitioning for Time Dependent Data

KyleMB
Blue LED

Hi, 

 

I'm currently working with an imbalanced dataset (ratio of about 25:1) that also has a time component (for example, trying to predict when a working system or client will have an undesirable result). I was considering doing a group partition. My question is: does group partitioning shuffle the groups randomly, or how would I manually put the group that contains the most recent time period of data into the holdout? Does time-aware modelling do this automatically?

 

Thanks,

 

Kyle

1 Reply
IraWatt
Micro Servo

Hey @KyleMB,

Group partitioning ensures that all members of a group fall within the same partition. For instance, if your data had a state ID, you could ensure that each state is validated independently of the others (Cross-Validation example shown below).

[Image: IraWatt_1-1632321086781.png, cross-validation example with group partitioning]
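
To make the idea concrete, here is a minimal sketch in plain Python of what grouping by a state ID means for cross-validation. This is only an illustration, not how DataRobot implements it: the `group_folds` helper, the `state` column and the round-robin assignment of groups to folds are all assumptions made up for the example.

```python
from collections import defaultdict


def group_folds(rows, group_key, n_folds):
    """Assign whole groups to folds and return (train, validation) pairs.

    Hypothetical helper for illustration only. Every row that shares a
    group value lands in the same fold, so a group never appears on both
    sides of a split.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)

    # Deterministic assignment: largest groups first, then round-robin.
    ordered = sorted(groups, key=lambda g: (-len(groups[g]), str(g)))
    fold_of = {g: i % n_folds for i, g in enumerate(ordered)}

    splits = []
    for fold in range(n_folds):
        validation = [r for r in rows if fold_of[r[group_key]] == fold]
        train = [r for r in rows if fold_of[r[group_key]] != fold]
        splits.append((train, validation))
    return splits


# Toy data: "state" is the (assumed) group column.
rows = [
    {"state": "CA", "target": 0},
    {"state": "CA", "target": 1},
    {"state": "NY", "target": 0},
    {"state": "TX", "target": 0},
    {"state": "TX", "target": 0},
    {"state": "WA", "target": 1},
]

for train, validation in group_folds(rows, "state", n_folds=2):
    train_states = {r["state"] for r in train}
    val_states = {r["state"] for r in validation}
    # The overlap is always empty: no state is split across partitions.
    print(sorted(val_states), "validated; overlap:", train_states & val_states)
```

Note that nothing in this scheme looks at dates, so grouping alone does not guarantee that the most recent period ends up in the holdout.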

Your problem sounds as though you are predicting the target value on each individual row rather than forecasting (more info on types of time-based modelling here). In this case, out-of-time validation (OTV, i.e. date/time partitioning) would be the most appropriate way to validate and partition your data. OTV ensures that the validation and holdout sets contain the most recent time periods.
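
If it helps to picture what an out-of-time split does, here is a rough sketch in plain Python. It is not DataRobot's OTV implementation, just the general idea: sort by date, train on the oldest rows, and keep the newest rows for validation and holdout. The `out_of_time_split` function, the `event_date` column and the 20% fractions are assumptions chosen for the example.

```python
from datetime import date


def out_of_time_split(rows, date_key, validation_fraction=0.2, holdout_fraction=0.2):
    """Split rows chronologically into (train, validation, holdout).

    Hypothetical helper for illustration only. The newest rows go to the
    holdout, the next newest to validation, and everything older to training.
    Splitting by row count can separate rows that share the same date; a real
    setup would usually cut on date boundaries instead.
    """
    ordered = sorted(rows, key=lambda r: r[date_key])
    n = len(ordered)
    n_holdout = max(1, int(n * holdout_fraction))
    n_validation = max(1, int(n * validation_fraction))
    train_end = n - n_holdout - n_validation
    train = ordered[:train_end]
    validation = ordered[train_end:n - n_holdout]
    holdout = ordered[n - n_holdout:]
    return train, validation, holdout


# Toy data: one row per month, deliberately out of order.
months = [7, 2, 10, 1, 5, 9, 3, 8, 4, 6]
rows = [{"event_date": date(2021, m, 1), "target": int(m == 9)} for m in months]

train, validation, holdout = out_of_time_split(rows, "event_date")
print("train up to:", max(r["event_date"] for r in train))
print("validation: ", [r["event_date"].isoformat() for r in validation])
print("holdout:    ", [r["event_date"].isoformat() for r in holdout])
```

With a 25:1 imbalance it is also worth checking that the most recent period still contains enough positive cases to give a meaningful holdout score.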

 

If this post answers your question, feel free to accept it as a solution to help others find the information.

 

All the best,

Ira