As a Gartner analyst I started covering open source software in government quite a few years ago. Those were the times when the city of Munich had taken its historic decision to switch to an open source desktop and when some Spanish, French, Scandinavian and South American government organizations, and even entire jurisdictions, declared “war on commercial software”. At the time, I was trying to make people think about the differences between open and free and between open standards and open source. In doing so, I had to make them reflect on how adopting an open source solution should undergo the same due diligence in terms of “value for money” as any other software selection.
Although today many government open source policies seem to be going in that direction and encouraging people to consider open source alternatives and compare them to proprietary ones, at that time my statements were seen as supportive of commercial software vendors. Ironically, those vendors were often mad at me too, because I was injecting some rational thinking about open source into their client base, while it was easier for them to fight a “religious” battle on more political ground.
The recent debate on the relationships between open source, open data and open government reminds me of some of those dynamics.
Since I dared to highlight the limitations (or – better – boundaries) of an open source development process applied to open data, respected thought leaders in the field have thrown their darts at me, sometimes being dismissive about how much I understand about open source. No offense taken, but I believe it would be mutually beneficial to clarify what “open” means in different contexts.
Let’s start with what open source software is. It is a licensing model and not a development model. The OSI definition of an open source license provides ten basic principles, most of which apply to open data too, although I have not seen anybody do that mapping (which would be an interesting exercise for a separate post).
An open source development process is something else (see for example the Eclipse development process). There are plenty of examples where software that is not licensed through an open source license uses an open source development process: even large vendors like Microsoft do.
So when people like Nat Torkington or Gunnar Hellekson tie open data to an open source development process, I am confused. I can have open data without an open source development process, and the other way around. As I said before, building a “development” community around data makes sense, but there are important accountability boundaries that impose limits on how open that community can be and how soon.
I was particularly pleased with Gunnar’s comments. In his post he said:
If you look at the outstanding work of pro-transparency organizations like the Sunlight Foundation, govtrack.us, RECAP, and others, nearly all are using open source and the open source development model. It’s not, as DiMaio and Caudill suggest, because they’re naive ideologues who are confused as to the meaning of “open”. These are smart people doing serious work. They’re using open source because it’s the best way to collect a large number of contributors around a common problem. They’re using open source because the transparency of the process and software makes their work credible. They’re using open source because they believe that free access to government data means free access to the tools that make that data useful.
Absolutely, I do not disagree at all, nor do I believe that either I or Bob Caudill used terms like “naive ideologues”. On the contrary, I believe that open source proponents who are actively lobbying for the use of open source tools are business people who are honestly trying to help their clients while increasing their own revenues. Is there anything wrong with that? Not at all.
However, let me point out that all these countless examples of open source and/or transparent processes that I keep hearing about have one interesting characteristic in common: they happen outside the boundaries of government. Sunlight, RECAP and others are all organizations and endeavors that neither originate in nor formally relate to government.
Now, if government had the luxury of outsourcing (or should I rather say crowdsourcing?) the open data collection, cleansing and publication process to third parties, together with the accountability for it, then I am sure we would see hundreds of open source communities popping up. But as long as that accountability stays with government, the process will have to have boundaries. This is why I said that community source, i.e. the use of an open source development approach restricted to a small and controlled community (which can expand over time), may be a closer goal.
Finally, should we conclude that having government in charge of providing open data is a limitation and the relevant processes should be taken off its hands as soon as possible? Well, I’d like to think that – unlike volunteers or businesses, who do what is valuable or compelling to them – government still has an obligation to manage and provide data irrespective of its immediate commercial or social value. So I’d rather have more public data managed through an imperfectly open (or blatantly closed) process for which I can point a finger at somebody, than less data of higher quality, transparently managed, for which nobody is really accountable.