DataRobot Paxata has a tool named 'Predict tool' on the tool panel, alongside other tools such as Joins, Aggregates, and Remove Rows. At any point in a Data Prep Project you can add the Predict tool to your Project steps. You will be asked for your DataRobot API token, and with just that the tool fetches all Deployments from DataRobot, along with a description of each Deployment. You choose a Deployment from the list and tell the tool whether you need prediction explanations returned along with the scores. That's pretty much all you need to do. The prediction scores come back into the Data Prep Project as part of the rows of data, and from there you can add further Data Prep steps on data that now includes the prediction scores and explanations. You can also spin up Filtergrams (interactive histograms) to explore the prediction scores alongside other columns in the data.
Great question. But my question in return would be 'why not?' If you can explore your entire dataset and derive insights instantly, why would you want to be limited to samples? Working with the full data is especially helpful if it has anomalies or characteristics that might be missed in a sample.
Also, the key difference between workflow-driven and data-driven data preparation is that in the former the requirements or logic guide your work, while in the latter your data prep steps are guided by the actual data. If that's the case, then it is helpful to be guided by the entire dataset rather than led by a sample.
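To put a rough number on the point about anomalies slipping past a sample, here is a minimal Python sketch. It is not part of Paxata or DataRobot; the function name and the row counts are made up for illustration, and it assumes a simple random sample drawn without replacement. Under that assumption, the chance that a sample contains none of the anomalous rows follows the hypergeometric distribution.

```python
from math import comb


def prob_sample_misses_all(total_rows: int, anomalous_rows: int, sample_size: int) -> float:
    """Probability that a simple random sample (drawn without replacement)
    contains none of the anomalous rows.

    Hypothetical helper for illustration only; assumes every row is equally
    likely to be picked.
    """
    if not 0 <= anomalous_rows <= total_rows:
        raise ValueError("anomalous_rows must be between 0 and total_rows")
    if not 0 <= sample_size <= total_rows:
        raise ValueError("sample_size must be between 0 and total_rows")
    # Ways to pick the sample from the normal rows only, divided by
    # ways to pick the sample from all rows.
    normal_rows = total_rows - anomalous_rows
    return comb(normal_rows, sample_size) / comb(total_rows, sample_size)


# Illustrative numbers: 100,000 rows, 5 anomalous rows, a 1% sample.
p_miss = prob_sample_misses_all(total_rows=100_000, anomalous_rows=5, sample_size=1_000)
print(f"Chance the sample shows none of the anomalies: {p_miss:.1%}")
```

With these illustrative numbers the sample misses every anomalous row roughly 95% of the time, which is the risk that exploring the full dataset avoids.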