
time data not available showing too many values


I imported a dataset with a "timestamp" field holding time data at one-second resolution, as shown below. I would like to do time-aware modeling, but DataRobot could not detect any time features in the dataset. I later found in the feature list that DataRobot had automatically excluded "timestamp", flagging it as [too many values, Reference ID], so I am unable to do time-aware modeling. Any help in solving this issue would be highly appreciated.

00:00:01
00:00:02
00:00:03
00:00:04

Lukas
Data Scientist

Hi @nsahoo3 ,

Could you try what happens when you add a date to your time, like "2020-04-20T00:00:01"?

 

That should do the trick.
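As a rough sketch of that suggestion, a time-only value can be combined with a date using Python's standard library. The helper name `add_date` and the date 2020-04-20 are only illustrative assumptions; use whatever date your data actually belongs to.

```python
from datetime import date, datetime

def add_date(time_str: str, day: date) -> str:
    """Combine a time-only string such as '00:00:01' with a date (ISO 8601 output)."""
    t = datetime.strptime(time_str, "%H:%M:%S").time()
    return datetime.combine(day, t).isoformat()

# Sample values from the question; the date is an assumed placeholder.
times = ["00:00:01", "00:00:02", "00:00:03", "00:00:04"]
stamped = [add_date(t, date(2020, 4, 20)) for t in times]
print(stamped[0])  # 2020-04-20T00:00:01
```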

Cheers,

Lukas

Hello nsahoo3,

That is a great question!

One reason DataRobot may not be treating this column as a time column is its format. Try changing the values in this column to YYYY-MM-DD HH:MM or YYYY-MM-DD HH:MM:SS.

Thanks,

Thanks @Lukas and @rtungaraza for the solution. My timestamp is now in the format shown below, but DR still cannot identify a time feature. The timestamp column is still not available as a feature and is flagged as having "too many values". I am stuck at this point and unable to do any further analysis.

06-26-2015 00:00:01
06-26-2015 00:00:02
06-26-2015 00:00:03
06-26-2015 00:00:04


Hi @nsahoo3 ,

 

Could you please try once more, changing from this:

06-26-2015 00:00:01

to this:

2015-06-26T00:00:01

The date format you've chosen is not recognised by DataRobot.
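If it helps, here is a minimal sketch of that conversion in plain Python. It assumes every value follows the MM-DD-YYYY HH:MM:SS pattern from the sample above; the function name `to_iso` is just for illustration.

```python
from datetime import datetime

def to_iso(ts: str) -> str:
    """Convert 'MM-DD-YYYY HH:MM:SS' to ISO 8601 ('YYYY-MM-DDTHH:MM:SS')."""
    return datetime.strptime(ts, "%m-%d-%Y %H:%M:%S").isoformat()

# Sample values taken from the post above.
raw = ["06-26-2015 00:00:01", "06-26-2015 00:00:02"]
converted = [to_iso(ts) for ts in raw]
print(converted[0])  # 2015-06-26T00:00:01
```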

Cheers,

Lukas


Thanks @Lukas for the solution. It looks like the problem lies elsewhere. DR does not accept the time column as a feature, so it is not recognized during time-aware modeling. DR is excluding the time feature, citing too many values.

I tried another file with timestamp in the following format and it worked. Now I need to solve this "too many values" issue.

00:00:01
00:00:02
00:00:03

 


@nsahoo3 Yes, the problem here is the number of time steps. If you include fine-grained time information, the step size will be the most granular level represented in your data (seconds).

What granularity of data do you want to model: daily, hourly, minute, second? You may need to aggregate your data to the granularity of interest, split your date-time column into separate date and time columns (if you want daily or larger granularity), or reduce the duration of your training data when building the initial project (if you want sub-day granularity).
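Two of those options, aggregating to a coarser step and shortening the training window, could look roughly like this in plain Python. The data here is made up (three minutes of per-second readings), and the helper names `aggregate_by_minute` and `limit_duration` are hypothetical, not part of DataRobot.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical per-second readings: (timestamp, value) pairs.
start = datetime(2015, 6, 26, 0, 0, 0)
rows = [(start + timedelta(seconds=i), float(i)) for i in range(180)]

def aggregate_by_minute(rows):
    """Average the values within each minute to get a coarser time step."""
    buckets = defaultdict(list)
    for ts, value in rows:
        buckets[ts.replace(second=0, microsecond=0)].append(value)
    return [(minute, mean(values)) for minute, values in sorted(buckets.items())]

def limit_duration(rows, window):
    """Keep only rows that fall within `window` of the earliest timestamp."""
    first = min(ts for ts, _ in rows)
    return [(ts, v) for ts, v in rows if ts - first < window]

per_minute = aggregate_by_minute(rows)                      # one row per minute
first_minute = limit_duration(rows, timedelta(minutes=1))   # still per-second
print(len(per_minute), len(first_minute))  # 3 60
```

Aggregation reduces the number of time steps but loses second-level detail; limiting the duration keeps the detail but covers a shorter period.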


Thanks @jarred for the feedback. I need second-level granularity, since the data changes every second. I will reduce the duration and see how it looks.
