Earlier today I dared to tweet my disagreement with a piece that Tim O’Reilly had flagged as a really important one. Tim tweeted back, so I guess I owe him a better explanation than what I could squeeze into a couple of tweets.
Let me start by saying that the piece I criticized, Truly Open Data by Nat Torkington, is a good one, as it makes a number of interesting points about how open source software approaches might carry over to open data. Nat highlights very clearly the problem of data quality that will affect many datasets and may indeed have a negative impact on open government initiatives. His intuition, which sounds quite intriguing, is that one could apply the best thinking and practices from open source to open data.
However, there is a fundamental flaw in this line of thought. Open source projects bring together a number of developers who collaborate on an equal footing to develop a product they are jointly responsible for, as a community.
Government does not have the luxury of doing so. An agency publishing crime statistics, weather forecasts or traffic information is ultimately accountable for what it publishes.
Indeed, collaboration can be built around that data, and several mechanisms used in open source projects can be effectively leveraged, such as mailing lists, bug trackers, and ways to report problems and inaccuracies. In a nutshell, these are mechanisms that would help those in the agency who are responsible for the quality and accuracy of that data do a better job.
I would also argue that it is quite idealistic to expect users of open government data to have the same degree of tolerance that users of open source software have. There is an expectation that government does the right thing and provides trusted and accurate datasets.
Where the author hits the nail on the head is where he says that
we need to change attitudes and social systems. Data is produced as the product of work done, and is rarely conceived of as having a life outside the original work that produced it. Some datasets will (some won’t–think of how many projects fail to interest anyone but the person who started them). This means thinking of yourself not just as the person who does the work, but the person who leads a project of interested outsiders and (in some cases) collaborators and who is building something that will last beyond their time
However, this is not the way open government works today. To realize this vision, governments need to overcome the asymmetry that I’ve highlighted several times (see here and here) and give the same dignity to citizen-collected data as to government-sanctioned data. Only then can we start thinking about open source communities around open data: some communities may be led by government, some by external stakeholders.
But all this raises a number of questions about who is accountable for what, and about the fine line between trust and truth, which I addressed in a recent post.
This is why, while I am intrigued by Nat’s proposal, I cannot buy it at this stage. If we want to pursue that path, governments need to open not only their data but also the process they use to collect, qualify, manage and publish it. I am not sure they are ready yet.