You may recall my blog post a few weeks back about the desperate need for something or someone to break the logjam around open access to UK addresses, in which I invited people to get in touch to discuss building an Open Address File (OAF). It prompted quite a lot of traffic and generally supportive comments. The nice folk at the Open Data Institute reached out to share details of their application to the Cabinet Office for funding to produce an Open Address product. Last week it was announced that ODI had received funding for the first phase. In my enthusiasm I tweeted a link to the news and returned a little later to something of a tweet storm, which included:
@owenboswarva @UKODI we need two address products the same way we need multiple land registries #bangheads not #petprojects
— James Cutler (@GeoSpaceJames) June 30, 2014
Reading all of the naysayers, I realised that I agreed with them! Of course we don’t want multiple competing address files. But, and of course there is a but, I agree with Jeni Tennison, technical director of the ODI, even more:
@grahamhyde @emapsitejames @stevenfeldman @owenboswarva @ukodi I agree, but a) already are multiple products & b) single closed one is worse
— Jeni Tennison @JeniT@mastodon.me.uk (@JeniT) June 30, 2014
I could continue the abused quote from Hamlet by adding:
Whether ’tis Nobler in the mind to suffer
The Slings and Arrows of two closed address files,
Or to take Arms against a Sea of troubles,
And by opposing end them? To be open, to innovate
And there is the heart of the debate. After almost a decade of address wars we have not got an open address product. Despite reports, encouragement, head banging etc., government was unable to get two of its agencies to cooperate and deliver a single address solution, let alone make that solution open. Now that Royal Mail has been privatised and rumours abound regarding the future of Ordnance Survey, how likely is it that these two behemoths are going to agree on a way of making an open address product?
The arguments against public funding of an Open Addressing product or service range from “we don’t need another address source” to “it won’t be accurate enough to be any use” via “it can’t be authoritative” with a bit of “government shouldn’t be funding this” tossed in.
Time to recall Donald Sutherland’s line in Kelly’s Heroes:
Why don’t you knock it off with them negative waves? Why don’t you dig how beautiful it is out here? Why don’t you say something righteous and hopeful for a change?
An OAF won’t initially be complete but, like any crowdsourced dataset, it will become more complete rapidly as people join in to fill the gaps. My bet is that it will be ‘good enough’ for many use cases outside of the public sector within a year (the public sector doesn’t have the impetus to break the logjam because it is now receiving AddressBase as part of the PSMA, with the cost being hidden from everyone except BIS and DCLG).
An OAF will never be authoritative, but does that matter? Remember that the ONS had to produce their own conflated address database to support the 2011 census because they did not consider any of the existing products to be sufficiently complete. What applications would need an address file to be authoritative? Do either RM or OS ‘warrant’ their products as authoritative?
The funding that the Release of Data Fund is providing (see proposal here), even if they support phase 2 as well, will be less than 1% of the sums charged by OS and RM for their addressing products, which are based upon publicly curated data from Local Government. Addresses have been repeatedly identified as one of the key parts of the information infrastructure, or core reference data, that is blocked by government’s inability to bang heads together and overcome public sector business models that encourage dysfunctional and sub-optimal use of public data.
Maybe a competitive open product will prove to be the stimulus for the current addressing businesses to create a viable open address service at no cost to the taxpayer. If that’s the case then the seed funding will be money well spent.
One thought on “To OAF or not to OAF, that is the question”
I think the “two address products” problem which James Cutler is raising actually evaporates to nothing if you start to think of data the way modern open source coders think about their code.
Code is “forked” in git repos, and data can be too. There are copies everywhere and no central database. It’s trivial to copy the whole thing and work on improving your own copy, but then it’s trivial for someone (like the ODI) to merge your changes back in, and (if they ever come round to such a modern way of thinking) it will be trivial for Royal Mail to merge that data with their stuff. Of course all of this copying is predicated on a nice open permissive license. But once you’re truly open, the issue of who controls it, and where the centralised repo is… is a non-issue.
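To make the fork-and-merge idea a bit more concrete, here is a minimal Python sketch of a three-way merge between two forks of a tiny address dataset. Everything in it is an assumption for illustration only: the `(postcode, building)` key, the street-name value, the `merge` function and the made-up addresses are hypothetical, and this is not how the ODI, Royal Mail or git itself actually store or merge data.

```python
from typing import Dict, List, Tuple

# Hypothetical record shape: (postcode, building) -> street name.
Key = Tuple[str, str]
Dataset = Dict[Key, str]


def merge(base: Dataset, ours: Dataset, theirs: Dataset) -> Tuple[Dataset, List[Key]]:
    """Three-way merge of two forks against their common ancestor.

    Returns the merged dataset and the keys both forks changed differently.
    """
    merged: Dataset = {}
    conflicts: List[Key] = []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:
            value = o          # both forks agree (or both removed the record)
        elif o == b:
            value = t          # only the other fork changed this record
        elif t == b:
            value = o          # only our fork changed this record
        else:
            conflicts.append(key)
            value = o          # keep ours for now; flag for manual review
        if value is not None:
            merged[key] = value
    return merged, conflicts


# Made-up example data: two forks each add a different address.
base = {("ZZ1 1ZZ", "1"): "Example Street"}
fork_a = {**base, ("ZZ1 2ZZ", "10"): "Sample Road"}
fork_b = {**base, ("ZZ9 9ZZ", "5"): "Placeholder Lane"}
merged, conflicts = merge(base, fork_a, fork_b)

# Two forks that correct the same record differently produce a conflict.
fork_c = {("ZZ1 1ZZ", "1"): "Example St"}
fork_d = {("ZZ1 1ZZ", "1"): "Example Street North"}
merged_cd, conflicts_cd = merge(base, fork_c, fork_d)
```

In this sketch, independent additions from different forks combine without fuss, and only records that two forks changed in different ways need a human to look at them.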
However… counter-argument… centralisation issues may not evaporate entirely. In order to get the ball rolling and make a success of a project like this, there may be a glossy website and a brand name for people to rally around, and a community of contributors and users brings its own centre of gravity, which becomes difficult to shift.
The number of contributors required for an OAF is probably considerably smaller than for a project like Wikipedia or OpenStreetMap though. It’s more likely to be an agglomeration of datasets from hundreds of providers rather than millions of users, but it could be worth “designing-in” some decentralised thinking into the OAF project. I have no doubt that the ODI will be planning to run this data project on a git hosted kind of basis (they love that stuff), but it might also be interesting to declare a self-destructing mission statement from the outset. I’m thinking of a statement like this: “If and when Royal Mail open license their database, and accept open contributions, OAF will no longer need to exist and will be discontinued”. This might seem sort of deferential to them as an authority, but is actually the best kind of “threat of openness”. It also shows why a truly open dataset isn’t really a “second product” in the marketplace.