Torturing the Data Long Enough Will Make It Confess Anything

by Andrea Di Maio  |  June 12, 2012  |  18 Comments

Last week Reuters, the Financial Times and the Huffington Post referenced a rather sensationalistic finding published by an Italian entrepreneur and contract university professor who is well known in Italian social media circles. His research allegedly showed that “up to 46 percent of Twitter followers of companies with active profiles could be generated by robots, or bots”.

His method, described in a research paper, uses a point system based on the following assumptions:

Characteristics associated with “human” behaviour worth one point:
· The profile contains a name
· The profile contains an image
· The profile contains a physical address
· The profile contains a biography
· The user has at least 30 followers
· The user has been added to a list by other users
· The user has written more than 50 posts
· The user has been geolocalised
· The profile contains a URL
· The user has been included in another user’s favourites
· The user uses punctuation in posts
· The user has used a hashtag in their posts at least once
· The user has used an iPhone to log in to Twitter
· The user has used Android to log in to Twitter
· The user has posted with Foursquare
· The user has posted with Instagram
· The user has used the Twitter.com website
· The user has written the userID of another user inside at least one post
· The user has a number of followers which, if doubled, is greater than the number they are following.
· The user publishes content which does not just contain URLs

Characteristics associated with “human” behaviour worth two points:
· At least one post has been retweeted by other users

Characteristics associated with “bot” behaviour worth one point:
· For each characteristic on the “human” list which has not scored points, one “bot” point will be assigned, with the exception of the following:
- the user has logged in through different clients
- the user uses the website
- the user has used Android
- the user has used iPhone
- the user has posted with Foursquare
- the user has posted with Instagram
· User uses only APIs

If any one characteristic of “human” behaviour is true, the corresponding “human” points are assigned; if it is false, the corresponding “bot” points are assigned.
Conversely, for each “bot” behaviour characteristic, if it is true, “bot” points are assigned; if it is false, “human” points are assigned.
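
To make the scoring rules concrete, here is a minimal sketch of how such a point system could be implemented. The paper describes the rules only in prose, so the trait names, the profile representation, and the final decision rule are illustrative assumptions (the “different clients” exception is folded into the client-specific traits):

```python
# Hypothetical sketch of the paper's point system; the paper gives no code,
# so trait names, the profile dict, and the decision rule are assumptions.

# The twenty one-point "human" characteristics listed above.
HUMAN_TRAITS = [
    "has_name", "has_image", "has_address", "has_bio",
    "min_30_followers", "listed_by_others", "min_50_posts",
    "geolocalised", "has_url", "favourited_by_others",
    "uses_punctuation", "used_hashtag", "used_iphone", "used_android",
    "posted_via_foursquare", "posted_via_instagram", "used_website",
    "mentioned_another_user", "followers_doubled_exceed_following",
    "posts_not_only_urls",
]

# Absence of these traits earns no "bot" point (the stated exceptions).
NO_BOT_PENALTY = {
    "used_iphone", "used_android", "posted_via_foursquare",
    "posted_via_instagram", "used_website",
}

def score_follower(profile: dict) -> str:
    """Score one follower profile and return a "human"/"bot" guess."""
    human = bot = 0
    for trait in HUMAN_TRAITS:
        if profile.get(trait, False):
            human += 1                    # trait present: one "human" point
        elif trait not in NO_BOT_PENALTY:
            bot += 1                      # trait absent: one "bot" point
    # Two-point "human" characteristic: at least one post was retweeted.
    if profile.get("was_retweeted", False):
        human += 2
    else:
        bot += 2
    # One-point "bot" characteristic: the account posts only through APIs.
    if profile.get("uses_only_api", False):
        bot += 1
    else:
        human += 1
    # The paper does not spell out how the totals are compared; a simple
    # majority between the two scores is assumed here.
    return "bot" if bot > human else "human"
```

Written out this way, it is easy to see how fragile the classification is: nudging a single weight or threshold changes which accounts flip from “human” to “bot”.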

The algorithm based on this scoring system was run on the followers of 13 international companies and 26 Italian ones, mostly from very different industry sectors.

I guess it is quite obvious that most of these measures are arbitrary and debatable, and that changing the scoring system would be both easy and plausible. For instance, are people who mostly lurk and do not write tweets any less human than those who are compulsive writers and retweeters? Does using a mobile device make somebody more human? Does using twitter.com rather than one of the many Twitter applications for PCs and mobile devices make somebody more human?

Also, the sample is hardly representative.

Given all this, it is somewhat remarkable that this research got as much exposure as it did, for which one clearly must give the credit to the professor-entrepreneur’s marketing skills. On the other hand, it proves what a former colleague of mine used to say: if you torture the data long enough, they will confess anything.

After all, in a blog post, the author says:

Now I wish that somebody will make the effort of re-processing the data I provided in my research, changing both method and algorithm, in order to get different results (with bigger or smaller numbers, it does not matter). My research has no presumption, unless opening a gate that – I hope – will be crossed, and possibly better, by others.

Which is like saying that he is not entirely sure about what he published. While this supports the point about how data can be tortured, it also casts a shadow on what people who hold academic positions in this country consider research.


Category: social networks in government, web 2.0 in government

18 responses so far ↓

  • 1 Torturing the Data Long Enough Will Make Them Confess Anything | felicevitulano   June 12, 2012 at 2:39 am

[...] on blogs.gartner.com [...]

  • 2 Paul Masson   June 12, 2012 at 11:22 am

Andrea, I agree with your assessment of the ‘research’ on Twitter followers. However, I am confident that, much like click fraud, Twitter fraud is a real issue.

  • 3 Giorgio Sironi   June 13, 2012 at 11:30 am

Indeed – when the “research” was posted on Indigeni Digitali, I commented that if I had written it, I would have got a failing grade in my engineering exams.
By the way, the university this “research” comes from is a private university in communication sciences, and as the old saying goes, anything that calls itself a science probably isn’t…

  • 4 Mi prendo la briga » dotcoma   June 13, 2012 at 1:17 pm

[...] The criticism arrives too, after the reposts across half the world. [...]

  • 5 Francesco   June 13, 2012 at 1:17 pm

The link that points to the research paper is broken.

  • 6 Marco Camisani Calzolari   June 13, 2012 at 1:20 pm

I’m what you call a professor-entrepreneur.

Don’t you think it is strange that all these “lurker” and “inactive” users follow only the companies in the research sample?

I didn’t torture any data. Everything in my research is open and available here: http://www.camisanicalzolari.com/MCC-Twitter-ENG.pdf

Before casting any shadow on my work, did you try to check for yourself whether the results are true or not?

    Please answer…

How can you, in your position, under the Gartner logo, say something that you haven’t verified?

    BR
    mcc

  • 7 Andrea Di Maio   June 13, 2012 at 2:22 pm

@Francesco – thanks for pointing out the link problem – it is fixed now

  • 8 Andrea Di Maio   June 13, 2012 at 2:29 pm

@MCC – Thanks for your comment. I have indeed read and referenced your paper, and quoted the algorithm you use. My criticism is entirely objective and based on that algorithm, as clearly stated in my post. To better explain the metaphor, the torture comes from the algorithm itself.
As clearly indicated in our blog policy, these are my personal opinions. However, I express them in my analyst capacity and often warn clients and non-clients about exercising caution when looking at surveys of any sort.
In the past I have pointed out flaws in much larger-scale surveys, such as the one run by the EU to assess the e-government progress of member states. I assume this puts you in very good company.
Finally, I am not sure which sort of verification you are referring to: once the algorithm is flawed, the data collection exercise becomes less relevant. However, I may have misinterpreted your point: in that case, it may help if you could make it a bit clearer for me.

  • 9 Marco Camisani Calzolari   June 13, 2012 at 3:22 pm

I didn’t get your point. You say that my algorithm is not good. Why do you say that? After writing a critical post like yours, I assume that you can provide me with a detailed list of technical and specific reasons why you don’t like it. Which parameter do you think is wrong? You listed a lot of them… tell me which. Or maybe it is the value assigned to one of them?
I hope you carefully read my paper before writing this post. So please let me know what’s wrong, specifically.
“Verification” in digital communication is not a clear concept. For example, I can’t get the phone number of each of the “users” in the sample. How can I verify whether they are human or bots? It’s not possible. That’s the starting problem, and that’s the reason why I did this research. I can’t verify it; I can only say, based on my algorithm, that they probably act as bots.
There wasn’t any other research that evaluated this aspect, and that’s the reason I did mine….
Maybe it’s far from perfect, but it’s a starting point, well documented, from which other researchers can start for a better analysis…
    Take care
    Marco

    P.S.
you didn’t answer my question:
“Before casting any shadow on my work, did you try to check for yourself whether the results are true or not?”

  • 10 Marco Camisani Calzolari   June 13, 2012 at 3:23 pm

And I forgot to say… it’s not a good thing to have a broken link in a critical post like yours…

anyway, now it’s fixed…

  • 11 Massimo Moruzzi   June 13, 2012 at 4:00 pm

>“Before casting any shadow on my work, did you try to check for yourself whether the results are true or not?”

    That’s what YOU should have done, Marco!

IMHO you should have checked whether your theory and algorithm made sense on a small sample (5%? 10%?) of the companies in your study, by checking by hand, follower by follower, whether the number of “fake” followers your algorithm predicted was more or less right for those companies – and only if it made sense should you have extended your “results” to all the other companies.

But you didn’t. So you don’t, in fact, have any “results” to speak of.

    Just an unchecked theory.

My personal theory is that all Twitter users with a username that starts with a vowel are fake.

    How do we judge if your “theory” is better than mine?

    We can’t.

Unless and until we check our respective theories by going over a number of followers one by one by hand, and then see whether your theory or mine provides a better guess as to which followers are fake.
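
The kind of check being described here would look something like this minimal sketch, where classify is the scoring algorithm and hand_label stands in for a human inspecting each account one by one (both names, the dict-based follower records, and the 10% sample size are assumptions for illustration):

```python
import random

def agreement_on_subsample(followers: list, classify, hand_label,
                           sample_frac: float = 0.10, seed: int = 42) -> float:
    """Run the classifier on a random, hand-labelled subsample of the
    followers and return the fraction of accounts where the algorithm
    and the human labeller agree."""
    rng = random.Random(seed)
    k = max(1, int(len(followers) * sample_frac))
    subsample = rng.sample(followers, k)
    hits = sum(1 for f in subsample if classify(f) == hand_label(f))
    return hits / k
```

Only if agreement is high on the hand-checked subsample does it make sense to extrapolate the predicted fake-follower percentages to the full follower lists of the 39 companies in the study.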

  • 12 Andrea Di Maio   June 13, 2012 at 5:49 pm

@MCC – Apologies for the wrong link, but the research paper was reachable from the FT post, where I found it. Glad it’s fixed now. As for my critique, I thought I had spelled it out in the post: the scoring system seems arbitrary to me. I know many people who do not want to have their picture on either Twitter or Facebook, are not anybody’s favorite even though they may have quite a few followers, do not write much at all and use Twitter mostly as a feed mechanism, and so on. Almost any of your criteria could be challenged.
I do appreciate that you gave it a try, but I think it is fair to say the approach is far from bulletproof.
    Now, it is always nice to be quoted in the press, but you seem to have used this as an argument in other related conversations to prove that your approach is worthwhile. Should a Gartner client ask me, after reading those references, whether he or she should base investment or social media strategic decisions on your research, it would be my duty to point to its flaws. So, as you can see, I would have no issue at all in writing this not as a blog post but as an actual Gartner research note.

  • 13 Walter Vannini   June 14, 2012 at 3:56 am

    Dear Di Maio,
    thank you for pointing out that, this being the blog section, what is written here does not imply any official Gartner endorsement.
Frankly, after reading your piece I had the same kind of doubts about Gartner that you expressed about Italian academia.

    I appreciated the conciseness of your post, so I will quickly make my point.

Your opening assumption, “tortured data will say anything”, is trivially true.
Your implied conclusion, i.e. that Camisani’s work is an example of such tortured-data research, is instead a non sequitur.

    I understand your objections to be as follows:
    1) “these measures are rather arbitrary and debatable measures, and changing the scoring system would be both easy and plausible”

    2) “the sample is hardly representative”.

Both points hold for most or all market and social science research, where the goal is exactly to build a consensus on:
    a) what should be measured
    b) how it should be measured
    c) what sample-size can be considered significant.

So, your objections so far do not disprove the research but rather state that it cannot be conclusive. Which is exactly what Camisani maintains.

Coming to data-torturing, I see no evidence of it in the study. No cherry-picking, for instance. No creative use of “outliers”. No retrofitting of theory to data.
    The same could not as easily be said of many other studies on Social Networks market value, or advertising ROI, to mention only two topics.

    The little research seems to point out that some well-known and well-sold measures of SN value are bogus. We all know those measures are worth serious money for many in marketing, so I would say the scandal and uproar are no surprise.

No critic so far seems able to convincingly prove the research unfounded. And none is likely to, as the consensus in the industry is that practices such as bot-trading are well in use.

    We are losing sight of the main goal here: client money is being wasted in useless and potentially damaging practices. This has meant easy profit so far for some, but the genie is now out of the bottle.

Let’s own up: the industry should show real commitment to client interests. This means publicly disowning these practices altogether and starting to debate what criteria, measures and samples should be used to assess effective SN reach.

Shooting the messenger and “business as usual” is less costly, but it is a losing strategy in the long run.

Followers, likes and all counting metrics are no different from clicks, pageviews, impressions and the like. Easy to measure, easier to fake.

Bots are everywhere now, and there will be more and more of them. The industry has only to lose from sticking with them.
    Even in marketing, client investments should be treated with respect rather than wasted.

  • 14 Finti follower e finti tonti 3: shooting the messenger | Mindspa blog   June 14, 2012 at 6:34 am

[...] Antonio Di Maio says that the criteria are debatable and the sample is not significant [...]

  • 15 Andrea Di Maio   June 14, 2012 at 6:59 am

@Walter – Just to state more clearly what I said in response to MCC earlier: I would have no issue in writing the exact same conclusions as a Gartner research note for our clients; I simply do not believe MCC’s research and press exposure deserve that much attention.

You say “your objections so far do not disprove the research but rather state that it cannot be conclusive”. I would argue that challenging several criteria used in the scoring system is enough to disprove the method, and hence the research.

    You say “The little research seems to point out that some well-known and well-sold measures of SN value are bogus”. I agree many feel the same way, but introducing arbitrary and easy-to-challenge alternative measures does not help the cause.

    You say “No critic so far seems able convincingly to prove the research as unfounded”. I am not sure we are reading the same criticisms, since both criteria and methods have been challenged. Of course none of us believes the motivation for the research is unfounded. But we all seem to agree that MCC’s approach does not help (i.e. it does not offer better measures than those used so far).

You say “We are losing sight of the main goal here: client money is being wasted in useless and potentially damaging practices. This has meant easy profit so far for some, but the genie is now out of the bottle”. I wholeheartedly agree, which is why I felt it was important to point out where additional waste is possible. It is easier to decide “I do not invest in Twitter because I feel there are lots of fake or inactive followers” than to take MCC’s research as the basis for such a decision.

You say “Let’s own up: the industry should show real commitment to client interests. This means publicly disowning these practices altogether and starting to debate what criteria, measures and samples should be used to assess effective SN reach”. I am glad to say that we at Gartner try to help our clients do that all the time. If you read my blog, you will find that I spend a considerable amount of time challenging the need for enterprise social media presence.

You say “Shooting the messenger and ‘business as usual’ is less costly, but it is a losing strategy in the long run”. It seems to me that MCC was more a messenger of his own ideas and reputation than of any demonstrable theory.

  • 16 Walter Vannini   June 14, 2012 at 9:18 am

    @Andrea,
    thank you for your reply.

So, you are not saying that it’s ridiculous or futile to investigate the share of bots among followers. You say the criteria are arbitrary, but without pointing out which criteria would not be, and that the paper offers no better results than are already available.
Hm, so much for challenging the work.

I am glad to hear you challenge corporate presence in the so-called social media; that puts us on the same team. I am the guy who refers to SNs as “parco buoi” (corrals) – nice to meet you.

Finally, you say “MCC was more a messenger of his own ideas and reputation than of any demonstrable theory”; well, aren’t all corporate white papers susceptible to exactly the same criticism? The point should be whether they tell the client something of value or not.
I maintain that MCC’s conclusion (that follower numbers alone are not a sound criterion for measuring anything) does provide value, in suggesting that clients should be far more demanding in evaluating the costs and returns of social-network marketing campaigns.

  • 17 Andrea Di Maio   June 14, 2012 at 9:28 am

@Walter – I do not think I have suggested that MCC’s criteria are flawed because a different set already works. What I said is that his criteria and method are flawed because they are arbitrary. Anybody could have come up with a different and equally arbitrary set of criteria, which is why I used the term “torturing the data”.
If MCC’s conclusion is that “follower numbers alone are not a sound criterion for measuring anything”, I believe this has been well known and understood for quite some time.
Now, let’s assume MCC’s algorithm worked and proved that on average X% of followers are fake or inactive: why should this change how companies invest in SM campaigns? They would know that only (100-X)% of their followers are real or useful, and this may be more than enough.
Where research is really needed is in understanding how follower numbers and (more importantly) engagement behaviors change according to the parameters of a SM campaign. But MCC’s research does not even scratch the surface here.

  • 18 Quick Note: Twitter, Bots and Data Torture « todaysnote   June 23, 2012 at 12:42 am

    [...] prove if you try hard enough. Dr. Andrea Di Maio, of Gartner, published a blog post titled “Torturing the Data Long Enough Will Make It Confess Anything” on the Gartner blog [...]