I am very new to DataRobot. Can you please help me with the requests below?
1. Can we connect to DataRobot from Power BI Desktop? If so, can you please help me with the steps?
2. I would like to perform key phrase extraction through an API in DataRobot [e.g., Microsoft Azure Cognitive Services]. Can you please guide me on the options with similar APIs? It would be a great help.
3. What is the best way to get guided learning for DataRobot?
Thank you very much!
Looking forward to your answers.
- Srikanth G
I would like to take prediction results from DataRobot, connect Power BI to that output.csv, and generate a prediction report. Would you be able to show how to do that?
I am currently looking at this for my ask, but I am not sure what exactly AnswerSets from Paxata are:
Can you please help and provide more details on how I can generate a token to connect with Power BI Desktop?
To help clarify and note the differences and complements, here's a solution integrating DataRobot text processing output, Azure key phrase extraction, and Power BI for data prep. Azure key phrase extraction identifies the important terms, and Power BI lets you integrate those terms and then filter/slice based on other values in your dataset to create interesting visualizations. Key phrase extraction returns the terms from an API call. Power BI can digest them, present them, and merge them back with your original dataset.
Where DataRobot comes in is to add a layer that associates the important terms with prediction results. So while Azure returns the important terms given some generalized corpus or context, DataRobot tells you their importance with respect to the specific thing you're trying to predict.
Now, you can get a lot more information about the terms than just the top few important ones. DataRobot shows you the degree of importance as the strength each term contributes to the prediction made. In other words, a term might be important with respect to its frequency of occurrence or its specificity to a given context, yet have no impact on the prediction results. Say we want to know user sentiment in movie reviews. The word "movie" might be important, but it carries no indication of sentiment.
To get this data, you need to download it from one or more of the following:
1. The Word Cloud (under Models > Insights).
2. Linear model coefficients (Models > click on a linear model on the Leaderboard with a β𝑖 icon, like Elastic-Net > click on Coefficients).
3. Coefficients shown in Text Mining (under Models > Insights).
4. A ratings table in a Generalized Additive2 Model.
5. The text modeler itself: downloading coefficients from an "Auto-Tuned Word N-Gram Text Modeler" model gives you the most and least important word coefficients directly.
I encourage you to explore each of them for their differences. You can get all terms and phrases, perhaps several thousand, and the set varies depending on which source you use.
You can download each by clicking Export at the bottom right of most pages. This downloads as a CSV file. You can also access this data from our API. Here's a link to the word cloud: https://datarobot-public-api-client.readthedocs-hosted.com/en/v2.22.1/autodoc/api_reference.html#wor.... Here for coefficients: https://datarobot-public-api-client.readthedocs-hosted.com/en/v2.22.1/entities/model.html#model-para..., or here for a ratings table: https://datarobot-public-api-client.readthedocs-hosted.com/en/v2.22.1/autodoc/api_reference.html#dat.... You would then import this data into Power BI to aggregate it all together.
I have a few thoughts on integrating this term strength and the prediction results back into the source data. The predictions map one to one to your source rows, so you can simply add them as a column; that part is straightforward. The term strength can be mapped to the key phrase extraction importance with a little data munging. Look up each term in the CSV you downloaded from DataRobot and create new columns with the DataRobot strength. Or you could replace the Azure terms with their strength. Or you could group the DataRobot term strengths into buckets of, say, 1 to 10 and map that encoding back to the dataset. You could then re-run your new dataset through DataRobot to get even more refined predictions. There are lots of creative options. Or you could skip Azure's extraction importance and instead use DataRobot's prediction-based importance, though both offer interesting insights from differing qualities of 'importance'. In any case, you now have the Azure term importance, the DataRobot predictions, the DataRobot term-to-target relationship, and your source file all aggregated in Power BI.
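As a rough illustration of the lookup and 1-to-10 grouping described above, here is a minimal Python sketch. It assumes the exported CSV has columns named `term` and `coefficient`; those names, the helper functions, and the sample data are hypothetical, so adjust them to whatever your actual export contains.

```python
import csv
import io


def load_strengths(csv_text, term_col="term", strength_col="coefficient"):
    """Build a term -> strength lookup from exported CSV text.

    The column names are assumptions, not the documented export format.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[term_col].strip().lower(): float(row[strength_col]) for row in reader}


def bucket(value, lo, hi, n_buckets=10):
    """Map a value in [lo, hi] to an integer bucket from 1 to n_buckets."""
    if hi == lo:
        return 1
    idx = int((value - lo) / (hi - lo) * n_buckets) + 1
    return min(idx, n_buckets)  # the maximum value lands in the top bucket


def annotate(phrases, strengths):
    """Return (phrase, strength, bucket) per key phrase; unknown phrases get None."""
    lo, hi = min(strengths.values()), max(strengths.values())
    rows = []
    for phrase in phrases:
        s = strengths.get(phrase.strip().lower())
        rows.append((phrase, s, None if s is None else bucket(s, lo, hi)))
    return rows


# Made-up sample standing in for a DataRobot coefficient export.
export = "term,coefficient\ngreat,1.0\nboring,-1.0\nmovie,0.0\n"
strengths = load_strengths(export)

# Made-up key phrases standing in for the Azure extraction result.
rows = annotate(["Great", "movie", "plot"], strengths)
for row in rows:
    print(row)
```

Phrases that DataRobot did not score (like "plot" here) come back with `None`, which you can leave blank or filter out in Power BI.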
As for your Hadoop integration, please have a look at this: https://community.datarobot.com/t5/resources/deploying-a-model-to-hadoop/ta-p/5044. And here for data connectors in general: https://community.datarobot.com/t5/resources/importing-data-overview/ta-p/1712. We can connect to any data source in a variety of ways: through JDBC, with vendor-specific connectors (ours to them or theirs to us), with static and dynamic SQL, or just through an API if you want to put some application code in the middle. When it comes time to think about what's best for your application (lowest latency, cheapest overhead, etc.), you can assess all the options. But my personal take during development is to do whatever is easiest to prove out the integration first, which for me is to code it up and connect via the API.
I hope this helps!
Thank you @doyouevendata
I am trying to create a prototype for key phrase extraction using a local Excel dataset. I am going to use these phrases in my slicers to select records and work on record counts accordingly.
My actual plan is to use key phrase extraction functionality in DataRobot by connecting to Hadoop / HDFS data sources and visualizing the results in Power BI.
Please let me know how we can proceed to achieve this requirement.
With regard to Power BI integration, this would typically come in after the model has been developed and you have some new data to score that you wish to visualize in Power BI.
If you are simply examining a local file, like a CSV on your desktop, then one option is to use one of the local file scoring approaches with DataRobot, where a Python or Windows script we provide takes your input.csv and creates an output.csv with the results of passing the data through your deployed model.
Alternatively, if you are sourcing from a database, a best practice would be a similar approach where the data is pulled from and written back to the database, after which Power BI would simply query the database.
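For the local file route, a minimal Python sketch of joining the scored output back onto the input so that Power BI can load a single report file might look like the following. It assumes output.csv has one row per input row in the same order and a column named `prediction`; the column name, the `combine` helper, and the sample data are hypothetical, so check them against the file your scoring script actually produces.

```python
import csv
import io


def combine(input_file, output_file, report_file, prediction_col="prediction"):
    """Append the prediction column from the scored file to the source rows.

    Assumes both files have the same number of rows in the same order.
    Returns the number of rows written.
    """
    source = csv.DictReader(input_file)
    scored = csv.DictReader(output_file)
    source_rows = list(source)
    scored_rows = list(scored)
    if len(source_rows) != len(scored_rows):
        raise ValueError("input and output row counts differ")

    fieldnames = list(source.fieldnames) + [prediction_col]
    writer = csv.DictWriter(report_file, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for src, pred in zip(source_rows, scored_rows):
        src[prediction_col] = pred[prediction_col]
        writer.writerow(src)
    return len(source_rows)


# In-memory stand-ins for input.csv and output.csv (made-up sample data).
input_csv = "id,review\n1,great film\n2,boring plot\n"
output_csv = "row_id,prediction\n0,0.91\n1,0.12\n"

report = io.StringIO()
n = combine(io.StringIO(input_csv), io.StringIO(output_csv), report)
print(report.getvalue())
```

With real files, you would pass handles opened with `open(path, newline="")` instead of the `io.StringIO` objects, then point Power BI's Text/CSV data source at the combined report file.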
I am also curious what sort of use-case you were thinking of creating regarding "text-phrase extraction". If I may ask, what exactly are you trying to predict using the text-phrase?
Hi @Srikanth , I suggest you take a look at the following links regarding how DataRobot uses Text data:
Hope it's helpful to you!
Hi @Srikanth - For guided learning please have a look at https://university.datarobot.com/. There's also the DataRobot Developers portal, https://developers.datarobot.com/, if you're looking for some developer-focused assistance.
I see shyam was able to give pointers for your 1st question, and I'm hoping someone can follow up with help for your 2nd question soon! Stay tuned.