Open Data – the pros, the cons, a bit of FoI and a personal manifesto


A couple of weeks back, when I was at W3G and GeoCommunity, there was a lot of discussion about Open Data (not sure whether there should be a space between the words, or whether someone has a (c) on the phrase by now). I sensed that there were two very different perspectives on Open Data, and to some degree these views were categorised by whether you were at W3G or GeoCommunity. Then again, I and several others were at both; perhaps we are schizoid. On the morning of the first day of GeoCommunity I blurted out this tweet

Some of my Twitter followers who were not at the event asked me to explain, so here is my spin on the pros and cons of Open Data.

Open Door - thanks to h.koppdelaney

A broad-based grouping (including the Free Our Data campaign, the Open Knowledge Foundation, My Society, some inspired people within the Cabinet Office and elsewhere in government, and many more) has fought hard to convince government to open up non-personal public data. They won a great victory on Gordon’s Day (the big announcement in November 2009), and the current government has been very enthusiastic in continuing to drive it forward.

Organisations across the public sector have been filling data.gov.uk and the London Datastore with masses of data, including a lot of expenditure information, transport, health and environmental stuff (you have to go and browse these sites to appreciate the wide range of data available). The presumption now for Central and Local Government is that non-personal data should be published (a simplification, but I believe a fair one) and that publication should be in a machine-readable form (more on that in a bit). Much of this data has a location context, which has encouraged the mappies to create some brilliant and some awful mapplications. I outlined some of my reservations about these early poster children of the Open Data movement in my talk at W3G.

The advocates of Open Data suggest that it will increase transparency and accountability of government and encourage an informed discussion about policy and potential service delivery models. They suggest that opening up government information to the development community (because much of the data published will be of limited use to the individual citizen in its raw form) will encourage new social and commercial enterprises, create economic and community benefit and perhaps displace the need for government to run some of the myriad of confusing web sites currently on offer. I have paraphrased and simplified greatly here.

MPs’ expenses are frequently quoted as an example, and I think it is fair to say that if all MPs’ expenses had been routinely published and subject to public scrutiny there would not have been the same level of “abuse” that was revealed when the Telegraph got hold of a leaked set of expense claims. It should be noted (IMHO) that many of the real gems within the data being released are not the expenditure data but the data that allows us to assess policy and performance of the key functions of government. Few would argue against the proposition that Open Data will encourage more transparency and accountability in government, and most would agree that transparency and accountability should strengthen participative democracy in our country.

You might be wondering who at W3G or GeoCommunity would be opposed to that? My answer is that no one was actually opposing Open Data but I felt there were two different groups of opinion that were carping about the release of Open Data and that is where my comment about “very last year views” arises.

Some still seem to be resisting Open Data because the data may not be accurate or complete, but then what data is? Errors in the data are more likely to be corrected if the data is exposed to scrutiny. Others question whether the costs of opening up data will be justified by the social or economic benefit generated. As with many other policy decisions, no one really knows how this will play out, but the costs of opening up data need not be that large and the democratic gain, although difficult to price, should substantially outweigh any costs. Some, myself included, have questioned whether people will misinterpret or misrepresent the data, through lack of understanding or deliberately. But opening up data does not stop government from continuing to present its own interpretation of the data, and people will learn who to trust.

At the other end of the spectrum are those who do not think we have gone far enough in the scope of data released, for example local government data, rail timetables or UCAS data on universities. Some are also arguing that the data needs to be better structured and published in machine-discoverable formats or as linked data. There is no doubt that there is a lot more that could be done, but the dam has been breached and more data will flow. Not everything needs to be a feed; even a CSV will do (just no PDFs please).
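To illustrate why a plain CSV is enough to get started, here is a minimal sketch in Python using only the standard library. The sample data, the supplier names and the column names (supplier, department, amount) are all made up for illustration; real council spending files will have their own layouts.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample of a council spending file.
# The column names and values are assumptions, not a real dataset.
SAMPLE_CSV = """supplier,department,amount
Acme Paving Ltd,Highways,1200.50
Acme Paving Ltd,Highways,799.50
Borough Books,Libraries,350.00
"""


def total_by_supplier(csv_file):
    """Sum the 'amount' column for each supplier in an open CSV file object."""
    totals = defaultdict(float)
    for row in csv.DictReader(csv_file):
        totals[row["supplier"]] += float(row["amount"])
    return dict(totals)


# io.StringIO stands in for a downloaded file so the sketch is self-contained.
totals = total_by_supplier(io.StringIO(SAMPLE_CSV))
for supplier, amount in sorted(totals.items()):
    print(f"{supplier}: {amount:.2f}")
```

The same few lines would be far harder to write against a PDF, where the table structure has to be guessed back from the page layout before any analysis can start.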

A short digression on FoI. I was talking to someone in local government who was bemoaning the cost and effort involved in responding to FoI requests, some of which he felt were pointless or were businesses searching for opportunities or competitive information. Perhaps some of the workload could be reduced by publishing most of the council’s records as Open Data, then directing the bulk of FoI requests to the appropriate data sets and leaving the analysis to the enquirer.

Charlie & Fred with thanks to Dunechaser

A snapshot view from a wholly converted sceptic, perhaps even a personal manifesto.

  • We have learnt that we cannot always trust our governments. Open Data is the foundation of transparency, transparency will drive accountability, and holding our government accountable is the cornerstone of participative democracy.
  • There are few facts and many shades of opinion. Open Data allows many analyses and interpretations; some may be malicious, most won’t be. It isn’t very different to newspapers: we will learn which ones to trust.
  • If we really trust the innovation of the crowd, opening up data need not be colossally expensive. Quick and dirty (but machine-readable) is all we need to get started. Most people who talk about the costs of publishing data are probably looking for excuses not to.

In a few years we will look back on Gordon’s Day and the opening of data and wonder what all the fuss was about. It would be nice if by then we have degeeked the whole thing but maybe that’s a bit much to hope for.


3 thoughts on “Open Data – the pros, the cons, a bit of FoI and a personal manifesto”

    • steven

      Selective quotes from the article that you linked to, Harry.

      Many – 36% at last analysis – have published their spending in PDF format only … Most of them are Conservative councils … The PDF issue is the biggest problem … publishing on PDF allows you to appear open without actually being open

      However, I agree with you: get them to publish any old how and then go back to get them to publish in a machine-readable format.

  • Mark Percival

    There is no doubt that opening government data, both central and local, is beneficial to those organisations. Having worked in LG all my working life, and a lot of that in data management, I have always been of the opinion that data created (or information related to data processing) should be made available to whoever wants it, i.e. we should have nothing to hide. Being accountable to the public and other stakeholders brings responsibilities, and the major one is Accountability.

    The argument that they [whoever] will not understand how to use it or will misuse it is, as you say, an excuse to not do the ground work which is inevitable when a new process is required. This ethos is fundamentally wrong! To think that those who work in this area in government “know best” is to undermine the intelligence of the masses and to somehow put themselves above all others. IMHO, those who aren’t capable of interpreting data will not be bothered about it anyway (they probably have a life!).

    As for the difficulty of putting the data out there: most will be held in very sophisticated databases and it is a small step to set up an automated process for retrieving it. Yes, it will take a bit of time and effort, but it is more than likely a one-time process. Anyhow, I would be very surprised if most of it hasn’t been done already; looking at the offerings on the data sites, a lot of the data will have been processed for use internally and it is a simple matter of posting it. Incidentally, I think this is why the errors exist: it isn’t subject to quality checks, it is ‘assumed’ that it is okay! And this brings me on to the most important point: if this data is published in the same state it is used within the organisation, what does it say about the policy and strategic decisions being based on that data?

    In my previous employment I had a major dataset that fed into a national dataset (you probably guessed what this is by now) and rather than struggle on with getting every entry correct on my own, I released it to all and sundry, and guess what? I had hundreds of colleagues coming back with corrections. Mass quality checking, you can’t beat it! This is what I think will happen with ‘opened’ data: if it is wrong, users will soon come back with issues and the cycle of updating, quality checking and publishing will begin ad infinitum.

    We in the public sector should not be afraid of this; we should see it as an opportunity to ‘test’ our data and up the quality.

    Just my tuppence worth!
