A couple of days ago a friend asked me if I could help a friend of his make a map showing the locations of US steel plants against the Congressional Districts of the 115th US Congress (2016). I tweeted asking my American geogeek friends if they could help me find the data for this map. Within a few hours I had the Congressional District boundaries and the 2016 election results, but the steel plant locations were a bit more challenging: no one seemed able to point to a dataset. I found a PDF document showing locations for US steel mills, but that would require transcribing all of the plant locations into a spreadsheet and then geocoding them. I’m not sure how open this data is, so apologies if I should not have used it.
Yesterday (4th March 2018) I was watching Arsenal lose their 4th game on the trot, away to Brighton, in a match I thought would be a chance to break our run of bad form. We were 2-0 down after 20-odd minutes and pulled one back but never looked like equalising, so I decided to start wrangling data to make a map.
Joining the District boundaries with the election results was a simple task in QGIS, and then a couple of clicks and there was a classic red and blue election map.
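For anyone who would rather script that step than click through QGIS, here is a minimal sketch of the same idea in plain Python: match each district to its election result on a shared identifier, then pick a fill colour from the winning party. The field names `GEOID` and `party`, the party codes and the sample records are all assumptions made up for illustration, not the real column names or data I used.

```python
# Hypothetical field names: "GEOID" (district identifier) and "party".
PARTY_COLOURS = {"R": "red", "D": "blue"}


def join_results(districts, results, key="GEOID"):
    """Left-join election results onto district records by a shared key."""
    by_id = {row[key]: row for row in results}
    joined = []
    for district in districts:
        merged = dict(district)  # copy so the input records are not modified
        result = by_id.get(district[key])
        merged["party"] = result["party"] if result else None
        # Districts with no matching result fall back to grey.
        merged["fill"] = PARTY_COLOURS.get(merged["party"], "grey")
        joined.append(merged)
    return joined


# Placeholder records, not real election data.
districts = [{"GEOID": "A-01"}, {"GEOID": "A-02"}, {"GEOID": "B-01"}]
results = [{"GEOID": "A-01", "party": "R"}, {"GEOID": "A-02", "party": "D"}]
styled = join_results(districts, results)
```

Keeping unmatched districts (in grey) rather than dropping them makes it easy to spot identifiers that failed to join.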
The data on US steel plants is at the foot of the PDF in a series of columns. I tried selecting the data and copying and pasting it into a spreadsheet or document, but just got a jumble of text which would have taken as long to straighten out as retyping it. I vaguely remembered that there were services that convert PDF tables into spreadsheets, but the couple I found didn’t seem to be able to recognise the table in the PDF document (I think the map at the top confused them). Then I stumbled on Tabula, a superb open source tool compiled for Mac (thanks) and Windows. Within a couple of minutes I was selecting a column of data and converting it into a .csv file for further wrangling in Excel. 10 minutes later I had all of the data combined into an Excel spreadsheet, with odd characters/symbols cleaned up and subheadings converted into columns. To geocode the spreadsheet, I used this script from Will Geary, which worked a treat 🙂
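I did the subheading clean-up by hand in Excel, but the same step could be scripted. The sketch below assumes that a subheading row is one where only the first cell is filled in, and it carries that subheading down as an extra column on every row beneath it. The function name, that rule and the sample rows are all hypothetical placeholders, not the layout of the actual PDF.

```python
import csv
import io


def subheadings_to_column(rows):
    """Drop subheading rows and append their text to each row below them."""
    current = None
    tidy = []
    for row in rows:
        if not row:
            continue  # skip blank lines
        cells = [cell.strip() for cell in row]
        # Assumption: a subheading has text in the first cell only.
        if cells[0] and not any(cells[1:]):
            current = cells[0]
            continue
        tidy.append(cells + [current])
    return tidy


# Placeholder text standing in for a Tabula export, not the real data.
raw = """Group A,,
Plant One,Town X,ST
Plant Two,Town Y,ST
Group B,,
Plant Three,Town Z,ST
"""
tidy = subheadings_to_column(csv.reader(io.StringIO(raw)))
```

Each output row then carries its own group label, which is the shape a geocoding script or a GIS import generally wants.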
Then I just needed to do a bit of tweaking in QGIS and it was job done. Finally, I thought I would upload the map into Carto so that I could share it with you. It’s a first draft and needs a good bit of work, [
but you can certainly see a pattern: most of the steel plants (and all of the large ones) are in Republican Districts. Hardly a big surprise, but there you go. I guess big tariffs will play well with hardcore Trump supporters.] Deleted 7/3: I jumped to conclusions too quickly, see comment below. I will do some more analysis and update soon.
The map is better viewed “full screen” (the diagonal arrows below the zoom buttons).