process images and documents


What is the recommended process for ingesting images, OCR output, or documents? How can Paxata be leveraged? Any reference documents would be helpful.

1 Reply

Hi @BJ,

When using DataRobot Data Prep/Paxata for image data, there are two ways to ingest it. Both are described in full in the documentation.


In short, if you have more information than just the images (and their classes), use a CSV file in which each row contains the path to an image, its class, and any supporting information. If you have only image data, split the images by class into separate folders.
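To illustrate the CSV approach, here is a minimal Python sketch that walks a folder with one subfolder per class and writes one row per image. The column names (`image_path`, `class`, `notes`), the list of image extensions, and the `build_image_manifest` helper are my own assumptions for illustration, not something DataRobot prescribes, so please check the documentation for the exact layout it expects.

```python
import csv
from pathlib import Path

# Assumed set of image extensions; adjust to whatever your data uses.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def build_image_manifest(image_root, output_csv, extra_info=None):
    """Write one CSV row per image: relative path, class, supporting info.

    image_root is expected to contain one subfolder per class.
    extra_info is an optional dict mapping a relative image path to a note.
    Column names are hypothetical and chosen only for this example.
    """
    root = Path(image_root)
    extra_info = extra_info or {}
    rows = []
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for image_path in sorted(class_dir.iterdir()):
            # Skip anything that is not a file with a known image extension.
            if not image_path.is_file():
                continue
            if image_path.suffix.lower() not in IMAGE_EXTENSIONS:
                continue
            rel = image_path.relative_to(root).as_posix()
            rows.append({
                "image_path": rel,
                "class": class_dir.name,
                "notes": extra_info.get(rel, ""),
            })
    with open(output_csv, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["image_path", "class", "notes"])
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

For example, `build_image_manifest("images", "images/manifest.csv")` would list every image under `images/<class>/`. If you have only the images and no supporting information, the folder-per-class layout on its own should be enough and the CSV is not needed.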


Besides images, DataRobot accepts the following formats: .csv, .tsv, .dsv, .xls, .xlsx, .sas7bdat, .geojson, .gz, .bz2, .tar, .tgz and .zip. Ingesting any of them is drag and drop, as long as the data is structured relationally.


Other useful Links:

DataRobot University Course on Data Prep

DataRobot Data Prep Documentation 


If this post answers your question, feel free to accept it as a solution to help others find the information.

All the best,

