Paxata

so
NiCd Battery

Paxata

so_0-1636563499982.png

 

Hi, 

I am working on order date column. few dates are missing and I want to add missing  dates rows by  filling other features with 0 value.
I will appreciate in this regards

4 Replies
akshay
DataRobot Alumni

Can you elaborate on the filling other features with 0 value statement? I am not sure I follow. Filling missing dates for the blank values can be done in multiple ways. You can use the filtergram(As shown in your screenshot) and select the blank date values on the filtergram, then perform a find and replace on them. You can also fill up or fill down on the date value using our fill operation found on the dropdown menu near the column name.

For example 

Order date           Loads

11/10/2021,         5

11/11/2021,         8

11/13/2021,         9

11/15/2021,         4

I want to fill the missing dates like 11/12/2021,  11/14/2021 with 0 Loads value

Hi, 

I have two questions :

Order date           Loads  Customer

11/10/2021,         5          Walmart

11/11/2021,         8          Walmart

11/13/2021,         9         Walmart

11/15/2021,         4         Walmart

1) I want to fill the missing dates like 11/12/2021,  11/14/2021 with 0 Loads value for same customer.

2 ) Also, I want to create new feature(6Weeks_Moving average), and have to calculate 42 days  moving average. 

 

I will appreciate if you can provide the sample of the both solutions.

Thank you!

Hi @so ,

 

First you'll want to use "shift" to get the rows of missing dates you will need to create.

 

1) Window > Shift

calamari_0-1636939936746.png

2) Shift Up on the Dates column by 1 row

calamari_1-1636939974364.png

 

3) Create a computed column to calculate the rows you will need to create (notice the minus 1, because if it's only 1 day apart we don't need to create any row):

calamari_3-1636940184416.png

 

4) Using another compute column convert the number into an array (note that you need to create this "if statement" to cater for the maximum number of values you have, as there is no looping in compute column statements):

calamari_4-1636943202128.png

5) This next compute column could be incorporated in the previous if statement, so i am actually going to EDIT the above, but wanted to leave it there for reference... SO I am going to change the above to this (slightly more complex logic):

calamari_1-1636949115509.png

 

6) This next step is trivial but important. Of course the last row is not going to have a valid value, as it has no valid "next row" to look at, so we need to remove the error value from the cell. Double click on the cell with the red warning dot, to replace it with nothing.

calamari_3-1636949326254.png

7) I need to split this column into separate columns now:

calamari_7-1636944867620.png

calamari_2-1636949166156.png

 

7)depivot this column to create the rows I want

calamari_0-1636948463227.png

😎 Remove the blank rows:

calamari_4-1636949452315.png

9) (optional) change the date to date format and remove the columns you don't need, resulting in:

calamari_5-1636949574651.png

 

 

On to your next question, there are two things you need to do:

1) Create a constant:

calamari_6-1636950522144.png

 

2a) Window > Aggregate 

calamari_8-1636950568676.png

2b) Use the Constant column to Group the records, and then the offset you use will be 42 (i used 3 because my dataset was smaller).

calamari_7-1636950554498.png

 

Hope that helps!

 

Callum