A week ago I saw this tweet
Mapping how poverty in London changed from 2008 to 2013 ••• pic.twitter.com/Xr4mxzWUZy
— Barry Quirk (@BarryQuirk1) July 18, 2016
I couldn’t get any sense of what change had taken place or how the 2 maps compared, so I replied
— (((Steven Feldman))) (@StevenFeldman) July 19, 2016
Then I thought to myself, why don’t I try to make a more useful map? What follows is a summary of what I did, what worked, what didn’t. Hopefully it will be useful to someone and possibly a better map maker may even come up with a better way of doing this.
I couldn’t find the CASE UMBR data set that Barry had used. That was probably a good thing as Lower Super Output Areas (even just for London) is quite a large data set.
But thanks to The London DataStore it wasn’t hard to find some data on poverty in London. I downloaded:
There are 2 sets of boundary files because at some stage the boundaries changed (this was a source of confusion, data wrangling and I’m not sure if I have got it completely right!)
The starting point was to have a look at the data
- There is a tab for each year (plus one for metadata)
- The ward classifications changed in 2008 and then there seem to have been some further changes in boundaries after 2011. I decided that I would focus on the changes in Child Poverty for under 16 year olds between 2008 and 2013.
- There is some regional and borough data at the top of each tab which I deleted (and saved as a separate spreadsheet, just in case)
- I deleted the columns that I didn’t need
- I cleaned up the header row and created shorter field names
Sounds easy? Yes, but QGIS doesn’t handle spreadsheet imports as elegantly as one might hope 🙁 The problem was that the percentage of children in poverty in each ward came through as a string rather than a number, which meant that I couldn’t do any calculations on it. I tried everything I could think of (xls, xlsx, changing cell formats) and nothing worked. I then installed the Spreadsheet Layers plugin for QGIS, which made things better; the results were still rather flaky, but they sort of worked.
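One way to sidestep the string problem is to clean the percentage column before it ever reaches QGIS. The following is a minimal Python sketch, not what I actually did at the time; the column names `ward_code` and `pov_2008` and the sample values are made up for illustration.

```python
import csv
import io


def parse_percent(text):
    """Turn a cell such as '23.4%' or ' 23.4 ' into a float, or None if it is not numeric."""
    if text is None:
        return None
    cleaned = str(text).strip().rstrip("%").strip()
    if not cleaned:
        return None
    try:
        return float(cleaned)
    except ValueError:
        # Placeholders such as 'n/a' or '-' end up here
        return None


# Hypothetical two-row extract standing in for one tab saved as csv
sample = io.StringIO("ward_code,pov_2008\nW1,23.4%\nW2,n/a\n")
rows = [
    {"ward_code": r["ward_code"], "pov_2008": parse_percent(r["pov_2008"])}
    for r in csv.DictReader(sample)
]
```

Writing the cleaned rows back out as a csv with plain numeric values should give QGIS a better chance of recognising the column as a number.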
Finally I created an extra tab to calculate the difference between child poverty rates per ward from 2008 to 2013. This was not quite as simple as I had expected because the wards changed a bit over the period, but with a bit of wrangling I managed to come up with something that sort of worked. I wanted to understand where poverty rates had reduced (or increased), so I decided to show the change in rate as a percentage (percentage changes in percentages may not be the best statistical tool, but it was the best I could come up with and it magnifies the changes; anyone got a better suggestion?). This time round I saved the data as a csv (only one tab from the spreadsheet), which made it easier to import into QGIS.
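For anyone who wants to reproduce the calculation outside a spreadsheet, here is a rough Python sketch. It assumes the change is expressed relative to the 2008 rate and that only wards present in both years are compared; the ward codes and rates are invented. It also includes the change in percentage points, which is one possible alternative measure.

```python
def percent_change(rate_2008, rate_2013):
    """Relative change in the poverty rate, as a percentage of the 2008 rate (assumed base)."""
    if rate_2008 is None or rate_2013 is None or rate_2008 == 0:
        return None
    return (rate_2013 - rate_2008) / rate_2008 * 100.0


def point_change(rate_2008, rate_2013):
    """Absolute change in percentage points."""
    if rate_2008 is None or rate_2013 is None:
        return None
    return rate_2013 - rate_2008


def changes_by_ward(rates_2008, rates_2013):
    """Percentage change for wards that appear in both years; others are skipped."""
    common = sorted(set(rates_2008) & set(rates_2013))
    return {code: percent_change(rates_2008[code], rates_2013[code]) for code in common}


# Invented example: ward C only exists in 2008, ward D only in 2013
result = changes_by_ward(
    {"A": 40.0, "B": 10.0, "C": 20.0},
    {"A": 30.0, "B": 5.0, "D": 15.0},
)
```

In this made-up example ward A falls from 40% to 30% (10 points, a 25% reduction) while ward B falls from 10% to 5% (5 points, a 50% reduction), which shows how the percentage measure magnifies changes in wards that started with low rates.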
Mapping the data
Once the tabular data had been manipulated and formatted it was reasonably easy to open in QGIS and join to the spatial boundaries. I created three layers and saved them as shapefiles:
- Child Poverty 2008
- Child Poverty 2013
- Child Poverty Change
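The join itself is just a lookup on a shared ward code. As a rough illustration of the idea (plain Python rather than QGIS, with a hypothetical `ward_code` key and invented records), a left join that also reports the boundaries with no matching data might look like this:

```python
def join_by_code(boundaries, rates, key="ward_code"):
    """Left-join rates onto boundary records; return (joined, unmatched_codes)."""
    joined = []
    unmatched = []
    for feature in boundaries:
        code = feature[key]
        if code in rates:
            joined.append({**feature, "rate": rates[code]})
        else:
            # Keep the boundary but flag that it has no data
            joined.append({**feature, "rate": None})
            unmatched.append(code)
    return joined, unmatched


# Invented records: ward Z has a boundary but no row in the table
boundaries = [
    {"ward_code": "A", "name": "Ward A"},
    {"ward_code": "Z", "name": "Ward Z"},
]
joined, unmatched = join_by_code(boundaries, {"A": 30.0})
```

Listing the unmatched codes is a quick way to see which wards were affected by boundary changes, since those are the ones that end up with no value on the map.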
Thematic maps are really simple in QGIS
- Select properties for a layer and go to the Style tab
- Choose ‘Graduated’
- Select the column to thematically map (% of children under 16 in poverty for the 2008 and 2013 layers, and the percentage change for the change layer)
- Choosing a percentage to map means that the choropleth is normalised which will keep Ken Field happy (I am sure he will find plenty else that needs improving)
- I chose ‘pretty breaks’ for the classes and then played around with the number of classes and colour ramps for each data set until it felt about right
- I then manually overrode the class breaks to get nice simple ranges for the classification for each layer (e.g. 0% to 10%, 10% to 20% etc)
- Finally I made some manual tweaks to the colour ramp so that the 2008 and 2013 datasets had identical ranges and colours
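The point of the last two steps is that both years are classified against the same fixed breaks, so a colour means the same thing on each map. A small Python sketch of that idea follows; the break values are illustrative rather than the exact ones I used, and QGIS may treat values that sit exactly on a class edge differently from this sketch.

```python
import bisect

# Illustrative upper edges of the classes, shared by the 2008 and 2013 layers
BREAKS = [10, 20, 30, 40, 50]


def class_index(rate, breaks=BREAKS):
    """Index of the class a rate falls in; a value on an edge goes to the upper class."""
    if rate is None:
        return None
    return bisect.bisect_right(breaks, rate)


def class_label(rate, breaks=BREAKS):
    """Human-readable legend label for a rate."""
    i = class_index(rate, breaks)
    if i is None:
        return "no data"
    if i == 0:
        return f"0% to {breaks[0]}%"
    if i == len(breaks):
        return f"over {breaks[-1]}%"
    return f"{breaks[i - 1]}% to {breaks[i]}%"
```

Because `class_index` depends only on the shared list of breaks, the same index can be used to pick the same colour from one ramp for either year.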
To share the results with you I had to have a first try at using the QGIS Print Composer. I definitely need to spend more time learning how to use this and somehow I managed to screw things up and save my project after removing all of the thematic classes (fortunately it only took a few minutes to repeat the steps above).
Here are the maps
You can see there are some white gaps in the 2013 image which are due to the boundary changes (I think I know how to fix that but that is for another day).
So these two maps are not massively dissimilar to the LSOA maps that Barry Quirk produced. The pattern of poverty is somewhat concentrated in the Inner London wards with a sort of ‘cross’ stretching north, south, east and west. Broadly it appears that child poverty rates have reduced over the six-year period (the pinks and reds are a bit lighter, signifying lower poverty rates) but there is a bit more that we can find out. This final map shows the changes in child poverty rates over the period.
The darker blue wards show the largest reductions in poverty rates and the dark red show wards where child poverty rates have increased the most. The good news is that across most of London child poverty rates have reduced, which is a bit surprising to me in the light of the austerity measures of the last 3 years of the period. It looks as if the wards that had the highest poverty rates in 2008 have experienced the greatest reductions. If I was in government I would want to understand what factors might have caused the increases (some of which are large) in a few wards.
Putting the results on the web
In case you are interested in the data and want to explore it yourself, I used the wonderful QGIS2Web plugin to publish the data as a web project.
I had to use the OpenLayers output because Leaflet doesn’t support the thematic shading of the layers. OL isn’t quite as elegant as Leaflet in my opinion, but the map works reasonably well considering the volume of data. You can click on the wards and see the attribute data in the popup (I dropped the hover option that I started with because it doesn’t seem to work on mobile). The layer control allows you to switch on the 2008 or 2013 layers (best to switch off the change layer and only use one layer at a time).
A little discovery: my ward in Haringey has one of the highest increases in child poverty rates within London! Not what I would have expected, a flaw in the data, my methodology or ….? Maybe my local councillors will be interested.
Update 27th July, 2016 – Leaflet
Thanks to the brilliant Tom Chadwin I now know that the Leaflet option was throwing a wobbly because one of my field names started with a ‘%’ followed by a space (dumb on my part). I changed the field name and now you can see why I prefer the Leaflet option, it DOES display the thematics and builds a thematic legend as well.
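To avoid this kind of problem in future, field names can be tidied up before the data goes anywhere near a web export. This is a hedged Python sketch of one possible clean-up rule, not anything built into QGIS or QGIS2Web; the function name and the example headers are hypothetical, and the default length of 10 reflects the field name limit of the shapefile format.

```python
import re


def safe_field_name(name, max_len=10):
    """Reduce a column header to lowercase letters, digits and underscores."""
    cleaned = name.strip().replace("%", "pct")
    # Collapse anything that is not a letter or digit into a single underscore
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", cleaned).strip("_")
    # Avoid an empty name or one starting with a digit
    if not cleaned or cleaned[0].isdigit():
        cleaned = "f_" + cleaned
    return cleaned[:max_len].rstrip("_").lower()


# Hypothetical headers similar to the one that caused the trouble
print(safe_field_name("% children in poverty"))
print(safe_field_name("% change", max_len=20))
```

Truncating to 10 characters can produce duplicate names when several headers share a long prefix, so it is worth checking the results by eye.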
Update 29th July 2016 – Attribution
A humble apology, I neglected to correctly attribute the sources of the data in these maps (my thanks to Nick Duggan for reminding me)
- Contains National Statistics data © Crown copyright and database right 2012
- Contains Ordnance Survey data © Crown copyright and database right 2012