Quick Start - not able to move forward after import of data

kprasad
Blue LED

I imported the data set in the Quick Start, and the import step shows all green. However, I am not able to move forward and cannot figure out what the issue is!

0 Kudos
26 Replies
dalilaB
Data Scientist

Can you please try to refresh your screen?

In most cases, this is caused by an internet connection issue.

shaz13
Data Scientist

@kprasad - To start the project from Quick Start, you first have to select the target. The Start button then becomes active, ready to start AutoML.

If you still don't see it active, please try refreshing as suggested by @dalilaB above. If this happens often, please make sure you use Google Chrome, as that is the preferred browser.


Feel free to reach out if you are still having problems starting the project. Thanks

belen.sanchez
Data Scientist

Hi @kprasad, the only thing I will add here is that if you are working on an unsupervised learning project, either clustering or anomaly detection, you do not need to select a target variable. You can click the 'no target' button right under the box where you would enter the target name and move forward from there.

zsfeinstein@yahoo.com
Linear Actuator

I am also having problems clicking the Start button. Refreshing does not seem to do anything. Also, can you recommend a sample data file to practice on?

0 Kudos
dalilaB
Data Scientist

What type of dataset would you like, or what would you like to do (Cluster or Classify)?

 

0 Kudos

zsfeinsteinyahoocom_0-1657209575156.png

 

0 Kudos

Actually neither. As the picture hopefully shows, I am starting with a simple regression. My dependent (response) variable here is OrderWeight, a continuous numeric field. One possible culprit is the InvoiceDate field. Typically, when it is read in via Paxata, it is seen as a string field to be cast to Date in one of the Paxata steps. I think the date was autodetected here shortly after reading in the .csv. Maybe the model is waiting for me to explicitly declare it as a Date? Just an idea.

0 Kudos
dalilaB
Data Scientist

It is not starting because it assumes this is a time-series regression model. Click on Automated Time Series and follow the steps. Please see further help here.

0 Kudos
dalilaB
Data Scientist

If it is a simple regression, just remove the date from "What is the primary date", and the Start button will then become responsive.

0 Kudos

I already tried removing the InvoiceDate, hoping that it would activate the Start button, but it did not.

But the instructions on my screen are different from what you suggest. Below is a picture of my available options for TS forecasting & nowcasting.

zsfeinsteinyahoocom_0-1657211483429.png

 

If I next click on the Forecasting button then it brings me to:

 

zsfeinsteinyahoocom_1-1657211654358.png

 

It does not permit me to use the Truck as my series name.

But if I click on Show Advanced Options it looks pretty different from the provided example:

zsfeinsteinyahoocom_2-1657211816557.png

 

That specifically was where I got the idea of my date not being formatted correctly.

0 Kudos
dalilaB
Data Scientist

In Advanced Options, go to Time Series, and let's see what shows up (take a screenshot and share it here).

0 Kudos

zsfeinsteinyahoocom_0-1657214931987.png

 

0 Kudos

zsfeinsteinyahoocom_0-1657215587943.png

Am also thinking that perhaps my InvoiceDate is faulty because of the gaps in there. It is not continuous, and it is mainly weekdays.

Just an idea...
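To check the idea about gaps outside the tool, here is a minimal Python sketch that lists the calendar days missing from a date column and tests whether the data is weekday-only. The sample dates and the function names are hypothetical, not taken from the actual InvoiceDate field, and this is only a plain-Python check, not anything DataRobot runs.

```python
from datetime import date, timedelta

def find_missing_days(dates):
    """Return every calendar day between min and max that is absent from dates."""
    present = set(dates)
    first, last = min(present), max(present)
    missing = []
    day = first
    while day <= last:
        if day not in present:
            missing.append(day)
        day += timedelta(days=1)
    return missing

def weekdays_only(dates):
    """True if no date falls on a Saturday (5) or Sunday (6)."""
    return all(d.weekday() < 5 for d in dates)

# Hypothetical invoice dates: Monday 4 July to Friday 8 July 2022, then Monday 11 July.
invoice_dates = [date(2022, 7, 4) + timedelta(days=i) for i in range(5)]
invoice_dates.append(date(2022, 7, 11))

missing = find_missing_days(invoice_dates)
print(missing)                       # the two weekend days in between
print(weekdays_only(invoice_dates))  # whether the series skips weekends entirely
```

If every missing day turns out to be a weekend day, the gaps are a regular pattern rather than faulty data.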

0 Kudos
dalilaB
Data Scientist

If you mainly have weekdays, first go to the AI Catalog and upload your dataset there. You will then see a hamburger menu at the top right; click on it and choose to prepare the dataset for time series. Otherwise:
If the InvoiceDate is not continuous, choose row-based instead of date-based when choosing TS.
Here is an example:
After deciding on the date, go to Automated Time Series, and if you have series, add the series, but then

Screen Shot 2022-07-07 at 1.46.36 PM.png

 

If you get an error, choose row. You can also click on Time Series Data Prep, which will take you to the AI Catalog, where the data can be cleaned.

Screen Shot 2022-07-07 at 1.47.53 PM.png

0 Kudos
zsfeinstein@yahoo.com
Linear Actuator

Wow, this is getting a little exciting. Thank you. Unfortunately, the devil resides in some details. Please see the following screenshot:

 

zsfeinsteinyahoocom_0-1657220085226.png

 

This is what the interactive menu showed before I did anything related to the Series ID. Having explored the documentation it looks like maybe I should have used my Truck field for the Series ID, but am not sure. Please advise here.

Some other interesting tidbits + features (pun intended) follow:

  1. You can see that I used the Mean and Most Recent value for the Target Imputation. That looks slightly more kosher than the SUM & Zero.
  2. For the Categorical Feature Imputation I set it to "most frequent." That seems like a better, more kosher choice than the last option.
  3. And my last observation for now is that the number of rows in this revised dataset is equal to the number of rows between my min & max.
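For anyone who misses the coding side, here is a rough Python sketch of what a mean aggregation plus "most recent" imputation could look like on a daily grid. It is only an illustration of the general idea under my own assumptions (daily time step, hypothetical OrderWeight values, a made-up `prepare_daily` helper); DataRobot's data prep may well do this differently.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean

def prepare_daily(rows):
    """Aggregate (day, weight) rows by daily mean, then fill missing days
    with the most recent known value (forward fill)."""
    by_day = defaultdict(list)
    for day, weight in rows:
        by_day[day].append(weight)

    first, last = min(by_day), max(by_day)
    prepared = []
    last_value = None
    day = first
    while day <= last:
        if day in by_day:
            last_value = mean(by_day[day])   # aggregation: mean of that day's rows
        prepared.append((day, last_value))   # imputation: carry last value forward
        day += timedelta(days=1)
    return prepared

# Hypothetical data: two invoices on Friday 8 July 2022, one on Monday 11 July.
rows = [
    (date(2022, 7, 8), 100.0),
    (date(2022, 7, 8), 140.0),
    (date(2022, 7, 11), 90.0),
]

prepared = prepare_daily(rows)
for day, weight in prepared:
    print(day, weight)
```

With a daily step, the output has one row per calendar day from the earliest to the latest date, which is one way to read the row count in point 3.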

I am next going to review what I think is the new dataset within the AI Catalog that is more suitable for TimeSeries analyses.

Again, DataRobot is very good at removing the coding aspect that I am accustomed to over the course of many weeks/months. Miss it a bit though... Imputation can be fun!

0 Kudos
zsfeinstein@yahoo.com
Linear Actuator

So I was able to see the datafile within the AI Catalog. I thought I was being clever in downloading the file from the hamburger, but I think it just downloads the original .csv file again. Below are some screen-shots of the file residing in the AI Catalog:

zsfeinsteinyahoocom_0-1657223646973.png

Again I downloaded it from the hamburger in the upper-right:

zsfeinsteinyahoocom_1-1657223757005.png

I renamed the downloaded dataset to something "Interpolated."

zsfeinsteinyahoocom_2-1657223861344.png

But where should I retrieve this data from?

0 Kudos
dalilaB
Data Scientist

Now, just click on Create project. You don't need to download the cleaned dataset.

0 Kudos
zsfeinstein@yahoo.com
Linear Actuator

zsfeinsteinyahoocom_0-1657227190652.png

This is where I am currently at. Am a little embarrassed that I am drawing a blank on what to do next.

Please take an easy look at some of my other questions within this thread for other specific areas such as the need, or not, of defining the Series and where/when should that occur, as well as how to check the quality of my work through the many steps.

0 Kudos
dalilaB
Data Scientist

If you are just forecasting order weight regardless of truck, then you don't have a series, and you should just ignore the truck series ID. So, when filling in Prepare dataset for time series, just leave the series ID section empty.
Here are the three steps:
1.  Fill the form for prepare dataset for time series

Screen Shot 2022-07-08 at 7.45.12 AM.png

2.  This is what you will get. As you can see, a suffix was added to the end of the original dataset name, and it is still registering.

3.  Now that the dataset is registered, just click on Create project.

Screen Shot 2022-07-08 at 7.46.28 AM.png

Screen Shot 2022-07-08 at 7.47.08 AM.png

0 Kudos
dalilaB
Data Scientist

I just checked, and you do have a premium subscription to DRU. Please take advantage of it, as the courses are hands-on.

0 Kudos

I completely agree that doing the DRU courses will be very valuable. A problem that I am having is with some of the basic or beginning labs:

zsfeinsteinyahoocom_0-1657336134294.png

This is very early in the instructions.

Even my graph in the upper-right is different from the instructions:

zsfeinsteinyahoocom_1-1657336191458.png

 

The following snapshot of the Time-Aware modeling sets the Default prediction window to 7 days. I set it to 30 per the instructions and receive the following error:

zsfeinsteinyahoocom_3-1657336371557.png

I still get the same error when setting the window to 7 days instead of 30.

zsfeinsteinyahoocom_2-1657336268643.png

 

I normally would not care much about minor differences, especially between versions, but I worry about how difficult it will be to follow the lessons after having problems with even the early ones. I am currently thinking instructor-led courses may be an ideal solution. Please advise on the best strategy; this is a summary of why I wish to try and bypass some of the DRU offerings. Please also let me know whether this is the correct forum for discussing such issues.

 

0 Kudos

NEVER MIND - I THINK I FIGURED OUT WHAT I DID INCORRECTLY - Trying to delete my previous message

0 Kudos
jenD
DataRobot Employee

zsfeinstein@yahoo.com would you like me to delete the previous 2 posts (the one about DRU feedback and the one saying you are trying to delete it)? Not suggesting to do so, but if you'd like me to, I'd be happy to.

0 Kudos

Yes please delete them if it is easy enough for you. I apologize for not continuing to push myself enough.

0 Kudos