The research note Emerging Role of the Data Scientist and the Art of Data Science, which I authored with colleague Lisa Kart, hit the Gartner wires this week. Since most of the dissenters we come across seem to believe that "data scientist" is nothing more than a pretentious moniker for a statistician or business intelligence (BI) analyst, we decided to take an…er…scientific approach to settling the question. We thought it would be entirely fitting to perform text analysis of hundreds of job descriptions for “data scientist,” “statistician,” and “BI analyst” to learn what the commonalities and differences are according to those actually hiring for the role.
I’d like to believe that these findings led us to define and distinguish the role of the data scientist more clearly, and with less speculation, than anyone else to date. Through our research we learned that data scientists are expected to work more in teams, to be comfortable and experienced with “big data” sets, and to be skilled at communication. They are also frequently required to have experience in machine learning, computing and algorithms, and are required to have a PhD nearly twice as often as statisticians. Even the technology requirements for each role differed, with data scientist job descriptions more frequently mentioning Hadoop, Pig, Python and Java, among others.
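For readers curious what this kind of comparison looks like in practice, here is a minimal sketch in Python. It is not the method used in the research note; it simply counts the share of postings per role that mention a given term. The function name `term_rates`, the toy postings and the term list are all hypothetical, invented for illustration.

```python
import re
from collections import Counter


def term_rates(descriptions, terms):
    """Return the share of descriptions that mention each term at least once.

    Matching is case-insensitive and on whole tokens only.
    """
    counts = Counter()
    for text in descriptions:
        # Tokenize into lowercase alphanumeric words (keeps '+' and '#').
        words = set(re.findall(r"[a-z0-9+#]+", text.lower()))
        for term in terms:
            if term in words:
                counts[term] += 1
    total = len(descriptions)
    if total == 0:
        return {}
    return {term: counts[term] / total for term in terms}


# Hypothetical toy postings, NOT real job descriptions or research data.
postings = {
    "data scientist": [
        "Build machine learning models in Python on Hadoop clusters",
        "PhD preferred; Java, Pig and Python experience; work in teams",
    ],
    "statistician": [
        "Design surveys and run regression analysis in SAS",
        "Statistical modeling in R; Python a plus",
    ],
}
terms = ["python", "hadoop", "phd", "sas"]

# Mention rate of each term, per role.
rates = {role: term_rates(docs, terms) for role, docs in postings.items()}

for role, role_rates in rates.items():
    print(role, role_rates)
```

Comparing the resulting rates side by side is one simple way to see which skills and technologies set one job title apart from another. A real analysis would need far more postings and more careful handling of multi-word phrases such as "machine learning."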
The piece then goes on to define and describe the three core data science skills: data management, analytics modeling and business analysis. But beyond these, there’s an art to data science. We detail several soft skills that our research showed are also critical to success, i.e., communication, collaboration, leadership, creativity, discipline and passion (for information and truth).
With the need for data scientists growing at about three times the rate of that for statisticians and BI analysts, and an anticipated shortage of more than 100,000 analytic professionals through 2020, we also included a listing of university programs around the world offering degrees in advanced analytics.