Google Public Data: Threat or Opportunity for Data.gov?

By Andrea Di Maio | April 30, 2009 | 2 Comments

social networks in government

On April 28th, Google’s blog announced Google Public Data, a new search feature that makes it easier to retrieve and compare public data published on government web sites, using data extracted from official sources and visualized with Trendalyzer.

As their post says:

The data we’re including in this first launch represents just a small fraction of all the interesting public data available on the web. There are statistics for prices of cookies, CO2 emissions, asthma frequency, high school graduation rates, bakers’ salaries, number of wildfires, and the list goes on.

In this first instance, Google shows the trend of unemployment data, with data coming from the U.S. Bureau of Labor Statistics and the U.S. Census Bureau, but suggests that more will come.

Where does this leave data.gov, the repository of government data in multiple formats that the US Federal CIO Vivek Kundra wants to put in place? If and when it is up and running, tools like Google Public Data may help combine and visualize data. But what is most intriguing is that Google can already extract data from where they are, i.e. buried in the individual agency web sites that it routinely searches and indexes.

Governments cannot rely only on Google, and building some form of common repository or common access layer to data makes a lot of sense. However, the appetite for more accessible public data that are easy to consume and understand is growing everywhere, and there may be a case for Google and others to provide more functionality in this space, potentially making government initiatives such as data.gov much less relevant if they are not timely enough.

2 Comments

  • Or alternatively, DATA.GOV could resolve into multiple external providers that actually host COPIES of the data (rather than going back to the original source), that in turn could be accessed by Google and others for cool graphics and other analysis work. Why should the US Gov’t host ANY of this for direct access — why not just distribute the data regularly to commercial and academic sites (like say JPL does for those huge NASA satellite photos of Saturn and so on).

    The only real worry would be that the data set would be CHANGED in the copy, but in the end greater access will trump this risk factor IMHO. You could always do a final run of your analysis against the OFFICIAL site as a final check to be sure this problem didn’t happen to you.

  • Christian says:

    On a related topic, there is something to be said for the collective intelligence of web searches. Google’s data on the swine flu led the data from the CDC by two weeks:

    http://www.gauravbhalla.com/2009/04/collective-intelligence-of-web-search-logs.htm