Hello all,
I have been able to successfully run a time series model through the Python API with relative ease, but when I try to set up the same datetime partitioning through the REST API I get this error:
error{"message": "Column DT has not been analyzed for time series modeling with the specified multiseries id columns ['Source']."}
---
For reference, my test data looks like this:
DT Source Value
01-01-2000 f1 1
01-01-2000 f2 100
01-02-2000 f1 2
01-02-2000 f2 105
...
You should add a step that waits for a 200 response code. Because the multiseries analysis used for time partitioning can take some time, sending the next command before it has completed produces this error. Check the steps below; each one must be sent as a separate request in the REST API. A Python sketch of the same flow follows the steps.
STEP 1: multiseries (POST)
BODY (JSON)
{
"datetimePartitionColumn": "PERIOD",
"multiseriesIdColumns": ["SERIAL_ID"]
}
STEP 2: getresponse (GET)
BODY (JSON)
{
"datetimePartitionColumn": "PERIOD",
"multiseriesIdColumns": ["SERIAL_ID"]
}
TEST (Postman JavaScript)
var project_id = pm.variables.get("projectId");
var res = JSON.parse(responseBody);
if (pm.response.code === 200) {
    var detectedCount = res.detectedMultiseriesIdColumns.length;
    if (detectedCount === 1) {
        // The id column has been analyzed: move on to target setting.
        postman.setNextRequest("run");
        console.log("OK");
    } else {
        // Analysis has not picked up the expected column yet: wait and re-submit.
        console.log(detectedCount);
        setTimeout(function () {}, 20000);
        postman.setNextRequest("multiseries");
    }
} else {
    // Job not finished yet (e.g. 202): wait and poll this route again.
    setTimeout(function () {}, 20000);
    postman.setNextRequest("getresponse");
    console.log("getrestrepeat");
}
STEP 3: run (PATCH)
BODY (JSON)
{
"target": "TARGET",
"mode": "quick",
"featureDerivationWindowStart": -6,
"featureDerivationWindowEnd": 0,
"forecastWindowStart": 1,
"forecastWindowEnd": 3,
"numberOfBacktests": 2,
"useTimeSeries": true,
"datetimePartitionColumn": "PERIOD",
"multiseriesIdColumns": [
"SERIAL_ID"
],
"cvMethod": "datetime",
"blendBestModels": false,
"windowsBasisUnit": "MONTH"
}
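For anyone who prefers plain Python over Postman, here is a minimal sketch of the same wait-for-completion flow using the requests library. It is only an illustration: API_TOKEN and PROJECT_ID are placeholders, the column names are the ones from the step bodies above, and the exact polling semantics (202 ACCEPTED with a Location header, then a 303 once the job finishes) should be double-checked against the DataRobot REST API documentation.

import time

import requests

API_TOKEN = "YOUR_API_TOKEN"      # placeholder
PROJECT_ID = "YOUR_PROJECT_ID"    # placeholder
ENDPOINT = "https://app.datarobot.com/api/v2"
HEADERS = {"Authorization": "Bearer {}".format(API_TOKEN)}

# STEP 1: ask DataRobot to analyze the candidate multiseries id column(s).
resp = requests.post(
    "{}/projects/{}/multiseriesProperties/".format(ENDPOINT, PROJECT_ID),
    headers=HEADERS,
    json={"datetimePartitionColumn": "PERIOD", "multiseriesIdColumns": ["SERIAL_ID"]},
)
resp.raise_for_status()                # typically 202 ACCEPTED
status_url = resp.headers["Location"]  # async job status route

# STEP 2: poll the status route until the analysis job is done, waiting between
# attempts instead of firing the next request immediately.
while True:
    status = requests.get(status_url, headers=HEADERS, allow_redirects=False)
    if status.status_code == 303:      # finished; Location now points at the result
        break
    if status.json().get("status", "").upper() in ("COMPLETED", "ERROR", "ABORTED"):
        break
    time.sleep(20)                     # same 20-second wait as in the Postman script

# STEP 3: only now send the target-setting ("run") request with useTimeSeries=true;
# the "has not been analyzed" error should no longer appear.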
Hi, we also had the same error. How can we train a time series model via the API? Although we set "useTimeSeries": true, it behaves as if it were a regular regression project.
In addition, I have printed out the response from the multiseriesProperties post and it seems to be a success, despite still leading to the original error when calling the datetimePartitioning post.
HTTP/1.1 202 ACCEPTED [Date: Wed, 13 Jul 2022 04:38:31 GMT, Content-Type: text/html; charset=utf-8, Content-Length: 0, Connection: keep-alive, Server: openresty, Location: https://app.datarobot.com/api/v2/status/2e9fc7bf-0c3a-44e7-bd91-1cf37e0ec90b/, Pragma: no-cache, Cache-Control: no-store, x-request-id: e01b988aa7db813d3b284ee3f52c9ecf, Strict-Transport-Security: max-age=16070400; includeSubDomains, X-Frame-Options: SAMEORIGIN, Referrer-Policy: origin-when-cross-origin, X-Content-Type-Options: nosniff, X-XSS-Protection: 1; mode=block, X-DataRobot-Request-ID: e01b988aa7db813d3b284ee3f52c9ecf, Expect-CT: max-age=86400, enforce]
While I do greatly appreciate the effort put forth, I do not understand how that example relates to my specific issue. I am able to make DataRobot REST API calls in general, and am working on partitioning before training. The example does not seem to mention partitioning or training or multiseries properties analysis. It is entirely possible I am missing something here, and I apologize if that is the case. I do appreciate your patience with me thus far.
Here is an example that I have run which returns the error:
error{"message": "Column DT has not been analyzed for time series modeling with the specified multiseries id columns ['Source']."}.
Here is an example:
import json
import re
import sys

import datarobot as dr
import pandas as pd
import requests

# Get the deployment (assumes ts_setting, model, and prediction_server are defined earlier)
def find_or_return_deployment(ts_setting):
    deployments = dr.Deployment.list()
    found_dep = [x for x in deployments if re.search(ts_setting["deploy_name"], x.label)]
    if found_dep:
        return found_dep[0]
    else:
        deployment = dr.Deployment.create_from_learning_model(
            model.id, label=ts_setting["deploy_name"], description=ts_setting["deploy_desc"],
            default_prediction_server_id=prediction_server.id)
        return deployment
# Code for scoring
class DataRobotPredictionError(Exception):
    """Raised if there are issues getting predictions from DataRobot"""


def make_datarobot_deployment_predictions(
    data,
    deployment_id,
    forecast_point=None,
    predictions_start_date=None,
    predictions_end_date=None,
):
    """
    Make predictions on data provided using DataRobot deployment_id provided.
    See docs for details:
    https://app.datarobot.com/docs/predictions/api/dr-predapi.html

    Parameters
    ----------
    data : str
        Feature1,Feature2
        numeric_value,string
    deployment_id : str
        Deployment ID to make predictions with.
    forecast_point : str, optional
        Forecast point as timestamp in ISO format
    predictions_start_date : str, optional
        Start of predictions as timestamp in ISO format
    predictions_end_date : str, optional
        End of predictions as timestamp in ISO format

    Returns
    -------
    Response schema:
        https://app.datarobot.com/docs/predictions/api/dr-predapi.html#response-schema

    Raises
    ------
    DataRobotPredictionError if there are issues getting predictions from DataRobot
    """
    # Set HTTP headers. The charset should match the contents of the file.
    # API_URL, API_KEY, and DATAROBOT_KEY are assumed to be defined earlier in the script.
    headers = {
        'Content-Type': 'text/plain; charset=UTF-8',
        'Authorization': 'Bearer {}'.format(API_KEY),
        'DataRobot-Key': DATAROBOT_KEY,
    }
    url = API_URL.format(deployment_id=deployment_id)

    # Prediction Explanations:
    # See the documentation for more information:
    # https://app.datarobot.com/docs/predictions/api/dr-predapi.html#request-pred-explanations
    # Should you wish to include Prediction Explanations or Prediction Warnings in the result,
    # change the parameters below accordingly and uncomment them in the params dict:
    params = {
        'forecastPoint': forecast_point,
        'predictionsStartDate': predictions_start_date,
        'predictionsEndDate': predictions_end_date,
        # Explanations are requested here; remove the line below if they are not needed.
        'maxExplanations': 3,
        # 'thresholdHigh': 0.5,
        # 'thresholdLow': 0.15,
        # Uncomment this for Prediction Warnings, if enabled for your deployment.
        # 'predictionWarningEnabled': 'true',
    }
    # Make API request for predictions
    predictions_response = requests.post(url, data=data, headers=headers, params=params)
    _raise_dataroboterror_for_status(predictions_response)
    # Return a Python dict following the schema in the documentation
    return predictions_response.json()
def _raise_dataroboterror_for_status(response):
    """Raise DataRobotPredictionError if the request fails along with the response returned"""
    try:
        response.raise_for_status()
    except requests.exceptions.HTTPError:
        err_msg = '{code} Error: {msg}'.format(
            code=response.status_code, msg=response.text)
        raise DataRobotPredictionError(err_msg)
# Score
# Notice that I saved the dataset to a csv file, so I can read it as binary
filename = "score_data.csv"
data = open(filename, 'rb').read()
data_size = sys.getsizeof(data)
try:
    predictions = make_datarobot_deployment_predictions(
        data,
        ts_setting["deployment_id"],
        forecast_point=ts_setting["ForcastPoint"],
    )
except DataRobotPredictionError as exc:
    print(exc)

# Serialize the result back to a JSON string
result_js = json.dumps(predictions)

# Read the results into pandas DataFrames
tp1 = pd.json_normalize(predictions['data'], record_path=['predictionValues'])
tp2_exp = pd.json_normalize(predictions['data'])
Thank you!
I've messed around with this a little now and just want to be clear about the correct process here.
I need to post api/v2/projects/{projectId}/multiseriesProperties/
with the body { "datetimePartitionColumn": "DT", "multiseriesIdColumns": [ "Source" ] }
and then post api/v2/projects/{projectId}/datetimePartitioning/
with the body mentioned in my original post?
This currently still gives me the same error message, so maybe I am still tripping on something here. Maybe I need to link these posts somehow?
Thank you for getting me this far.
You'll need to invoke the multiseriesProperties route first to trigger the computation.
The Python client automagically kicks off the multiseries analysis, as does the browser app, but if you want to use the REST API there are a few more steps to take.
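For comparison, here is a minimal sketch of the Python client route, where that analysis is kicked off for you when the target is set. The column names come from the original post's sample data and the window settings from the earlier reply; treat the project id as a placeholder.

import datarobot as dr

# Time series partitioning spec; setting the target with it triggers the
# multiseries analysis automatically.
spec = dr.DatetimePartitioningSpecification(
    datetime_partition_column="DT",
    use_time_series=True,
    multiseries_id_columns=["Source"],
    feature_derivation_window_start=-6,
    feature_derivation_window_end=0,
    forecast_window_start=1,
    forecast_window_end=3,
    number_of_backtests=2,
    windows_basis_unit="MONTH",
)

project = dr.Project.get("YOUR_PROJECT_ID")  # placeholder project id
project.set_target(
    target="Value",                          # target column from the sample data
    mode=dr.AUTOPILOT_MODE.QUICK,
    partitioning_method=spec,
)

Either way, the key point is the same: the multiseries analysis must finish before a datetime partitioning request that uses those id columns will be accepted.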
Source is suggested as a viable option if I go through this whole process in the web interface.
The data tab after this failure does not seem extremely interesting.
Thank you for having a look.
I messed up my example data and will edit it immediately. The source should not change each day.
Hi!
Could you share a screenshot of the DataRobot Data tab from when the dataset has been uploaded to the platform but modeling has not started yet? I would like to see the feature types to check that the platform recognized them correctly. Also, try setting up multiseries in the platform and check whether DataRobot suggests "Source" as a viable option for a multiseries ID column, because in your example it looks like Source changes each day rather than multiple Sources being available for each date.