What are your thoughts on DataRobot and other automated data
I'm curious what this community thinks about it. My employer is about to sign a contract with them and my boss, who is very technical, has already stated that we planned to hire another data scientist until he heard about DataRobot—so that's at least one data scientist out of a job because of this platform. I'm not saying it's a bad thing; progress is good. But what do you think are the implications of these systems? Is this great for our field, as it picks off the low-hanging fruit, leaving us to work on more sophisticated problems?
I work as a CFDS for DataRobot. I was a data scientist for 5 years before coming to DataRobot. I primarily created models in R.
DataRobot's primary objective is not to put anyone out of work. The goal is to let the people doing the work do data science at the speed necessary for solving important problems. A lot of data scientists can create a good model using traditional tools, but it can take a long time, and the model is only one piece of the puzzle. You also need to deploy the model, monitor it, and retrain it when your data drifts.
Buying DataRobot may replace one full-time position now, but it will also allow your team to provide actionable models when they are needed, at scale. That enhances your impact and grows your influence in the business, and hopefully grows your team too once you've had a lot of success.
If you think about data science before packages like scikit-learn were available, people had to know how to code the models from scratch. Do you think the addition of these packages has helped or hindered the field? I would say it has helped tremendously. Automation, in my mind, is just the next necessary evolution of that kind of progress. There is no shortage of data and no shortage of problems to solve, and as a field we should be focusing less on code/syntax and more on building solutions that work.
Besides, all of the cloud giants are already trying to do this. Automated ML is here. Do you want to use a cookie-cutter tool made to meet bare minimum standards for generic DS problems, or do you want to use something truly amazing and push the limits of what kinds of problems you can solve?
I think your point about being able to focus on more complex problems hits the nail on the head. I have worked in BI for many years, and have seen something very similar happen over the last 10 to 20 years. There are now many more self-service options and tools that make it easier to deal with enormous volumes and complexity of data, something that would have needed real expertise many years ago.
Rather than putting data analysts out of work, the area expanded. End users, who had more product in front of them, had more questions as they got more exposure to the concepts. Many ordinary business users now know what a Sankey chart is, or are able to interpret confidence intervals.
I think something similar will happen with data science. It will be offered more widely and become more easily available. Companies that try to compete without it will lose their competitive edge and will end up needing it. They will still require someone to oversee it, and will ask for more and more complex problems to be solved.
I wouldn't worry too much about that data scientist. Even in the current climate there is lots of demand on the job boards!