I'm currently experimenting with BigQuery as a data source. I click the person icon -> "Data Connections" -> "Add new data connection" and then fill out the OAuth credentials (OAuthType stays at 2, which seems to be the default value). But when I click "Test connection", all I get is an empty popup window titled "Test Data Connection", with no confirmation of whether I've filled in the fields correctly.
I've also tried creating a new project with the newly created data source, but no BigQuery datasets are shown. The only time I get a response from DataRobot is when I go to "Credentials Management" -> "Add new (Credential)" and associate the newly created data connection with the credential. Once that's done, a plug/socket icon appears in the list, and when I click it, it changes to a green light, suggesting that the connection test works. What am I missing in order to see my BigQuery data when creating a project?
Also, are you using the Simba driver behind the scenes? I'm asking because, according to their documentation, one can provide a service account key file. With the OAuth credentials, it seems we'd have to create an actual user in Google Cloud, since OAuthRefreshToken isn't available for service accounts.
I'm using Chrome without any adblocker. The Chrome network debugger shows nothing when I click the "Test connection" button.
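To illustrate what I mean by the service account option, here is a rough sketch of the kind of connection string I have in mind. It assumes the Simba BigQuery JDBC driver's property names (OAuthType=0 with OAuthServiceAcctEmail and OAuthPvtKeyPath); whether DataRobot actually uses that driver or accepts these properties is exactly what I'm asking. The function name, project ID, e-mail address, and key path are made up for the example.

```python
# Sketch only: builds a Simba-style BigQuery JDBC URL for service account auth.
# Assumption: the backend uses the Simba BigQuery JDBC driver and its
# documented connection properties. All values below are placeholders.

SIMBA_BASE_URL = "jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443"


def build_service_account_url(project_id: str, sa_email: str, key_path: str) -> str:
    """Return a JDBC URL that authenticates with a service account key file."""
    props = {
        "ProjectId": project_id,
        "OAuthType": 0,  # 0 = service account (2 = pre-generated tokens)
        "OAuthServiceAcctEmail": sa_email,
        "OAuthPvtKeyPath": key_path,
    }
    # Simba-style URLs separate properties with semicolons.
    prop_string = ";".join(f"{key}={value}" for key, value in props.items())
    return f"{SIMBA_BASE_URL};{prop_string};"


if __name__ == "__main__":
    print(
        build_service_account_url(
            "my-gcp-project",  # hypothetical project ID
            "datarobot-reader@my-gcp-project.iam.gserviceaccount.com",  # hypothetical
            "/path/to/key.json",  # hypothetical key file location
        )
    )
```

With OAuthType=0 there would be no need for a refresh token, which is why this route looks attractive compared with creating a real Google user.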
I think I can help with these questions:
Please let me know if this is helpful and, if you try again, whether you're successful in connecting to BigQuery.
I just tried and it's working now so thanks for enabling the feature. I still have a couple of questions though.
1) This time, when I click "Test connection", I see a button named "Sign in using Google" - I guess this is the enabled feature. But Google warns against signing in from DataRobot in a popup window (shown here with the "Advanced" section expanded):
Google hasn’t verified this app
The app is requesting access to sensitive info in your Google Account. Until the developer (email@example.com) verifies this app with Google, you shouldn't use it.
Continue only if you understand the risks and trust the developer (firstname.lastname@example.org).
Go to datarobot.com (unsafe)
By clicking "Go to datarobot.com (unsafe)" I'm taken to the next page, which says something like "datarobot.com wants access to your Google account" and asks for permission to view BigQuery data and manage data in Google Cloud (all of which is fair enough). I'm unsure whether this app verification is something that should be done within our Google Cloud project or something that you - DataRobot - should do. It would increase trust between the two websites.
2) I've tried creating a couple of projects and I can now see my BigQuery tables in DataRobot. But how is table size determined on your side? I've selected a few tables which I believe are well below the standard CSV file size limit of 2 GB, yet they are rejected for exceeding the size limit; see below.
| Table name | BigQuery __TABLES__.size_bytes* | Physical file size (bytes) | Size reported by DataRobot | Result |
|---|---|---|---|---|
| random_table_1 | 1.593.548.791 | 1.244.365.169 | 2287 MB | Error: "Dataset with size 2287 MB exceeds the download limit of 2048 MB." |
| random_table_2 | 1.585.121.517 | N/A | 2075 MB | Error: "Dataset with size 2075 MB exceeds the download limit of 2048 MB." |
*: __TABLES__ is a BigQuery metadata table.
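For reference, this is roughly how I compared the numbers. It is only a sketch of my own arithmetic: it converts the size_bytes values from the table to MB (assuming 1 MB = 1024 * 1024 bytes, which may not be the unit DataRobot uses) and checks them against the 2048 MB limit from the error message. The helper name is my own.

```python
# Sketch: compare BigQuery's reported size_bytes with the 2048 MB limit
# quoted in the DataRobot error message.
# Assumption: 1 MB = 1024 * 1024 bytes; DataRobot's unit is unknown to me.

BYTES_PER_MB = 1024 * 1024
DOWNLOAD_LIMIT_MB = 2048  # taken from the error message

# table name -> (size_bytes from __TABLES__, size in MB reported by DataRobot)
TABLES = {
    "random_table_1": (1_593_548_791, 2287),
    "random_table_2": (1_585_121_517, 2075),
}


def bytes_to_mb(size_bytes: int) -> float:
    """Convert a byte count to MB using binary (1024-based) units."""
    return size_bytes / BYTES_PER_MB


if __name__ == "__main__":
    for name, (size_bytes, reported_mb) in TABLES.items():
        bigquery_mb = bytes_to_mb(size_bytes)
        ratio = reported_mb / bigquery_mb
        print(
            f"{name}: BigQuery {bigquery_mb:.0f} MB, "
            f"DataRobot {reported_mb} MB (x{ratio:.2f}), "
            f"under limit per BigQuery: {bigquery_mb < DOWNLOAD_LIMIT_MB}"
        )
```

By this calculation both tables come out at roughly 1.5 GB according to BigQuery, so the size DataRobot reports is noticeably larger than the stored size, and the ratio is not the same for the two tables. Using decimal megabytes instead would not change the conclusion.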