John Foreman is a chief data scientist at MailChimp and has done a lot of analytic work for large companies. He argues that a skilled data scientist’s work will cost more than $30 per hour.
The $30/hr Data Scientist
Yesterday a journalist asked me to comment on Vincent Granville’s post about the $30/hr data scientist for hire on Elance. What started as a quick reply in an email spiraled a bit, so I figured I’d post the entire reply here to get your thoughts in the comments.
When we ask the question, “Can someone do what a data scientist does for $30/hr?” we first need to answer the question, “What does a data scientist do?” And there are a multitude of answers to that question.
If by data scientist we mean “a person who can perform a data summary, aggregation or modeling task that has been well-defined for them in advance,” then it is by no means a surprise that there are folks who can do this at a $30/hr price point. Indeed, there’ll probably come a day when that task can be completed for free by software without the freelancer. This is similar to the evolution of web development freelancing.
The key phrase, though, is “task that has been well-defined.”
The types of data scientists who command large salaries seem to fit two definitions, both very different from what a freelancer at $30/hr can meet:
1) There’s the highly technical engineer. Someone who is knowledgeable and skilled enough to select the correct tools and infrastructure in the polluted big-data landscape to solve a specific, highly technical data problem. Often these folks are working on problems that haven’t been solved before or, if they have, there are only a few poorly documented examples. Because these tasks might not even be solvable, they’re certainly not “well-defined.” A business wouldn’t trust important bits of infrastructure to a $30/hr freelancer.
2) There’s the data scientist as communicator/translator. This person is someone who knows data science techniques intimately but whose strength is actually in the nontechnical: this person thrives on taking an ambiguous business situation and distilling it into a data science solution. Often managers and executives don’t know what’s possible. They know what problems they have, but they don’t know how or even if data science can solve those problems. These folks can’t hire someone halfway across the globe at $30/hr to figure that out for them. No, they need someone who’s deeply technical but also deeply personable in the office to talk things through with them and guide them.
All of the hype around data science is generating a lot of these articles about automating or replacing the role. But I think it’s important to realize that just like “doctor,” “lawyer,” “consultant,” “developer,” etc., the “data scientist” is more of a spectrum or category than a single role.
A data scientist is not someone putting doors on an automobile in a factory. Some of them might be doing just that, i.e., rote modeling tasks. But not all of them. I believe that MOOCs will excel at training up an army of these lower-paid data scientists. And that’s great. They’ll fill a need. Kinda like the need in the 90s for people with basic CompTIA certifications and the most basic of Cisco certs.
However, there will always be a place for those who excel at solving ambiguous technological & business problems. And they’ll cost more than $30/hr.
Reposted with permission from John Foreman. The original post can be found on his blog.