
Data Scientist – Mystified

By Svetlana Sicular | June 29, 2012 | 7 Comments

skills, Inquire Within, innovation, data paprazzi, Data and Analytics Strategies, cloud, "Data Scientist"

Companies are desperately seeking mysterious creatures: data scientists. Some people claim to have seen them at LinkedIn and Target. Perhaps those were encounters with data scientists from LinkedIn who shop at Target? Or Target data scientists who search on LinkedIn for pregnant teens? Either way, the companies are desperate (except for LinkedIn and Target). But they are seeking anyway. Why? Because nowadays everyone wants to compete in the new, data-driven economy, where Google and Amazon have already figured out “data alchemy”: turning data into gold.

To organizations, a data scientist symbolizes a gaping hole: the magic that can turn big data into big gold by making sense of the vast amount and multiplicity of senseless bits and bytes (or zettabits and petabytes?). The data scientist is a savior who (if found) can solve all big data problems, so companies will not have to worry about figuring out how to do it themselves. All they need is to catch two or three really good data scientists, no matter what they are.

I have heard a couple of definitions: a data scientist is 1) a data analyst in California or 2) a statistician under 35. Either one makes 10% more than common data analysts and statisticians, so the latter learn how to position themselves as data scientists. Google shows that web search interest in “data scientist” picked up back in 2010, but the #1 Google search phrase on the subject is “data scientist salary”, which reflects both supply and demand. The second most popular search is “data scientist jobs”. I confess, I did it too: I copied around two dozen job postings, removed the titles and other HR nomenclature, and made a word cloud.

Data scientists are everywhere. What about data paparazzi, data janitors and data spies?
This word cloud is made of the recent job postings for data scientists.
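
The tool behind this cloud is not stated here. As a rough sketch of the idea only: the size of each word in such a cloud follows how often it appears in the text, and that count could be produced in Python along these lines. The sample postings, the short stop-word list and the function name `word_frequencies` are illustrative assumptions, not the actual data or method.

```python
import re
from collections import Counter

# Hypothetical stop words; a real list would be much longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "with", "for", "is", "our"}


def word_frequencies(postings, top_n=10):
    """Count how often each non-stop word appears across all postings."""
    counts = Counter()
    for text in postings:
        # Lowercase the text and keep runs of letters only.
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)


# Made-up snippets standing in for real job postings.
sample_postings = [
    "Experience with data analysis and analytics in a large business",
    "Analyze big data and build models for the business",
    "Strong analytics skills with large data sets",
]

for word, count in word_frequencies(sample_postings, top_n=5):
    print(word, count)
```

A word cloud tool would then scale each word's font size by its count, so the most frequent words in the postings come out largest.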

The picture, as well as my more in-depth research, shows that companies should look within. Organizations already have people who know their own data better than mystical data scientists do, and this is key. These internal people have already gained the experience and ability to model, research and analyze. Learning Hadoop is easier than learning the company’s business. What is left? To form a strong team of technology and business experts, backed by supportive management that creates a safe environment for innovation. Team members with diverse skills will inspire and enrich each other: their combined knowledge will be the power to develop analysis and bring new insights.

After the team achieves results, compare the size of data with the size of science in the picture. By the way, did you notice that large is greater than big but both are relatively insignificant?

The Gartner Blog Network provides an opportunity for Gartner analysts to test ideas and move research forward. Because the content posted by Gartner analysts on this site does not undergo our standard editorial review, all comments or opinions expressed hereunder are those of the individual contributors and do not represent the views of Gartner, Inc. or its management.


7 Comments

  • Sveta, what software tool did you use to make this word cloud?

  • Miguel de Andrade says:

    The challenge for the ‘data scientist’, as you probably allude, is to make sense of the randomness. Pretty much like mankind’s course of structuring chaos into an intelligible pattern or ‘insights’.

    The challenge for organizations is to structure and establish a common business language that propagates into the data being created by the business. How many times is that not the case?! That is the reason Google is so successful.

    • Svetlana Sicular says:

      Hi Miguel,

      I agree with you on the common business language challenge for organizations. I also like your philosophical view on structuring the chaos. But I also think that the challenge cannot be resolved by a data scientist: this is a challenge for organizations too – to realize that they need a whole team of people who can correctly formulate a question, understand the data needed to answer the question, create a solution, validate the solution and, what’s especially important, to act upon the resulting ‘insights’. The role of a data scientist, or a statistician, or a business analyst here is just a fraction of the whole big data solution.

  • I like the fact that “analytics” and “analysis” are so big and that they are significantly larger than “big.” I just tweeted this week that I’m glad to see the conversation shifting away from big data and towards big analytics. I’m seeing that companies that are using analytic platforms in conjunction with Hadoop are getting more analytic value from the data. In addition, they are learning how to turn common SQL users into data scientists by giving them access to a library full of advanced functions. It is the business analyst, who understands the business, who can quickly learn the basics of Analytics 101 and drive real value out of the data. I wonder if we should just call them “business scientists”?

  • Michael Morse says:

    Svetlana,

    We’ve been a data-driven society for over 25 years; it’s just with the advent of “Cloud” computing that the word scientist has been applied. As Miguel states, the difficulty is not finding people to fill these positions; rather, it’s getting multiple LOBs to agree to common definitions and processes that don’t completely match “their” definition, even though the outcome would be to their benefit. I’ve done a lot of work in this area for a large portion of my career, and I can tell you that while it keeps me gainfully employed, it also gives me many great hurdles to overcome.

  • Amelia Mango says:

    This is part of the reason we’ve decided to conduct a survey of Data Science professionals–we’re hoping to better understand the overlaps in education, experience, and expertise most common among data professionals. Until a more standard definition of a “data scientist” role emerges, this will hopefully provide a more generalized idea of what the role is all about.