Hi all,
How can we avoid the error below when starting a big autopilot project?
Thank you,
Giorgio
# Start a DataRobot project
import datarobot as dr

partitioning = dr.StratifiedCV(holdout_pct=20, reps=10)
project = dr.Project.start(
    data,                              # pandas DataFrame with the data; a file path could also be passed
    project_name='Pj_name',            # Name of the project
    target='target',                   # Target of the project
    worker_count=-1,                   # Number of workers to use; -1 means all available workers
    autopilot_on=True,                 # Run on autopilot (default value)
    partitioning_method=partitioning,  # Partitioning defined above
)
Error:
AsyncTimeoutError: Client timed out in 600 seconds waiting for https://app.datarobot.com/api/v2/status/7cc2310f-c833-44ae-9691-573d29e46264/ to resolve. Last status was 200: {"status": "RUNNING", "created": "2022-06-22T23:18:15.916189Z", "message": "Finishing dataset upload", "code": 0, "statusId": "7cc2310f-c833-44ae-9691-573d29e46264", "description": "", "statusType": ""}
Here is a function provided by one of my colleagues:
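from datarobot.utils import dataframe_to_buffer  # helper that serializes a DataFrame to a buffer

def create_from_in_memory_data2(data_frame=None, records=None, categories=None, read_timeout=600):
    # Upload the serialized DataFrame as a dataset, with a caller-supplied read timeout
    buff = dataframe_to_buffer(data_frame)
    return dr.Dataset.create_from_file(filelike=buff, categories=categories, read_timeout=read_timeout)

pandas_dataset = create_from_in_memory_data2(data_frame=df, read_timeout=1000000)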
Hi Giorgio,
Thanks for your question. It looks like the timeout is caused by the size of the dataset: the client stopped waiting after 600 seconds while the upload was still finishing. How large is the dataset you're using here? There is a workaround in the dr.Project class, described below.
Step 1: use dr.Project.create() (API Doc) to upload the dataset. Its max_wait parameter defaults to 600 seconds; increase it to suit the dataset size.
Step 2: once the upload succeeds, use dr.Project.set_target() (API Doc) to start the project. This function takes the remaining inputs, such as the target and the partitioning method.
I recommend create() followed by set_target() whenever you need the advanced parameters. start() is a quick-start option and doesn't expose all of them.
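The two steps could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the helper name start_project_with_long_upload and its upload_timeout argument are made up for illustration, the datarobot module is passed in as dr, and it assumes a client version in which Project.create() and Project.set_target() accept max_wait as described above.

```python
def start_project_with_long_upload(dr, data, project_name, target, upload_timeout=3600):
    """Hypothetical helper; `dr` is the imported datarobot module."""
    # Step 1: upload the data only. max_wait is the number of seconds the
    # client keeps polling before raising AsyncTimeoutError (default 600).
    project = dr.Project.create(
        data,
        project_name=project_name,
        max_wait=upload_timeout,
    )

    # Step 2: set the target and start autopilot with the same partitioning
    # and worker settings as in the original start() call.
    partitioning = dr.StratifiedCV(holdout_pct=20, reps=10)
    project.set_target(
        target=target,
        mode=dr.AUTOPILOT_MODE.FULL_AUTO,
        partitioning_method=partitioning,
        worker_count=-1,
        max_wait=upload_timeout,
    )
    return project


# Example usage (requires a configured DataRobot client):
#   import datarobot as dr
#   project = start_project_with_long_upload(dr, data, 'Pj_name', 'target')
```

Splitting the upload from the target step means a slow upload only needs a larger max_wait on create(), and a timeout there does not force you to re-specify the partitioning and autopilot options.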
Let me know if this solves the timeout error you're running into.
Thanks,
Kreshnaa