Mapping US Steel while Arsenal lose (again!)

A couple of days ago a friend asked me if I could help a friend of his make a map showing the locations of US steel plants against US 115th Congressional Districts (2016). I tweeted out asking my American geogeek friends if they could help me find the data for this map. Within a few hours I had the Congressional District boundaries and the 2016 election results, but the steel plant locations were a little more challenging: no one seemed to be able to point to a dataset. I found a pdf document showing locations for US steel mills, but using it would mean transcribing all of the plant locations into a spreadsheet and then geocoding them. I’m not sure how open this data is, so apologies if I should not have used it.

Yesterday (4th March 2018), while watching Arsenal lose their 4th game on the trot, away to Brighton, I decided to start wrangling data to make a map. I had thought Brighton would be a chance to break our run of bad form, but we were 2-0 down after 20-odd minutes, pulled one back and never looked like equalising.

Joining the District boundaries with the election results was a simple task in QGIS, and then a couple of clicks and there was a classic red and blue election map.
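For anyone who would rather script the join than click through QGIS, the idea can be sketched in a few lines of Python. This is only an illustration, not what was done for this map: the column names (`GEOID`, `winning_party`) and the sample rows are invented, and the real boundary and results files will use their own field names.

```python
import csv
import io

# Hypothetical sample data: invented IDs and column names, for illustration only.
DISTRICTS_CSV = """GEOID,state,district
0101,AA,1
0102,AA,2
1707,BB,7
"""

RESULTS_CSV = """GEOID,winning_party
0101,R
1707,D
"""


def join_results(districts_text, results_text, key="GEOID"):
    """Left-join election results onto district rows using a shared key."""
    results = {row[key]: row for row in csv.DictReader(io.StringIO(results_text))}
    joined = []
    for district in csv.DictReader(io.StringIO(districts_text)):
        match = results.get(district[key], {})
        # Districts with no matching result keep None so the gaps stay visible.
        joined.append({**district, "winning_party": match.get("winning_party")})
    return joined


joined_rows = join_results(DISTRICTS_CSV, RESULTS_CSV)
for row in joined_rows:
    print(row["GEOID"], row["winning_party"])
```

The csv module reads every value as text, so identifiers with leading zeros survive the join. That is worth checking in any tool, because a key stored as a number on one side and as text on the other will silently fail to match.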

The data on US steel plants is at the foot of the pdf in a series of columns. I tried selecting the data and copying and pasting it into a spreadsheet or document but just got a jumble of text, which would have taken as long to straighten out as retyping it. I vaguely remembered that there was a service that converted pdf tables into a spreadsheet, but the couple I found didn’t seem to be able to recognise the table in the pdf document (I think the map at the top confused them). Then I stumbled on Tabula, a superb open source tool compiled for Mac (thanks) and Windows. Within a couple of minutes I was selecting a column of data and converting it into a .csv file for further wrangling in Excel. 10 minutes later I had all of the data combined into an Excel spreadsheet, with odd characters/symbols cleaned up and subheadings converted into columns. Now to geocode the spreadsheet: I used this script from Will Geary, which worked a treat 🙂
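The tidy-up that was done by hand in Excel could also be scripted. The sketch below assumes a layout that may not match the real Tabula output: each extracted column is a CSV in which a subheading row (a name with an empty second cell) is followed by the plants listed under it. The plant, town and state names are invented placeholders, and the two "odd characters" handled (non-breaking spaces and bullet symbols) are just examples.

```python
import csv
import io

# Hypothetical extract: subheading rows have an empty second cell.
# All names are invented placeholders, not real plants.
RAW_CSV = """State One,
\u2022 Example Mill 1,Town A
Example Mill 2\u00a0,Town B
State Two,
Example Mill 3,Town C
"""


def clean(text):
    """Replace non-breaking spaces, drop bullet symbols, collapse whitespace."""
    text = text.replace("\u00a0", " ").replace("\u2022", "")
    return " ".join(text.split())


def flatten(raw_text):
    """Turn subheading rows into a 'state' column on each plant row."""
    rows = []
    current_state = None
    for record in csv.reader(io.StringIO(raw_text)):
        cells = [clean(cell) for cell in record]
        if not any(cells):
            continue  # skip blank lines
        name = cells[0]
        city = cells[1] if len(cells) > 1 else ""
        if not city:
            current_state = name  # a subheading row: remember it, emit nothing
            continue
        rows.append({"plant": name, "city": city, "state": current_state})
    return rows


plants = flatten(RAW_CSV)
for plant in plants:
    # A "city, state" string like this is the sort of input a geocoder expects.
    print(f'{plant["plant"]}: {plant["city"]}, {plant["state"]}')
```

Carrying the subheading down onto every row is what makes the table usable for geocoding, since each row then holds a complete place description on its own.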

Then I just needed to do a bit of tweaking in QGIS and it was job done. Finally I thought I would upload the map into Carto so that I could share it with you. It’s a first draft and needs a good bit of work, [but you can certainly see a pattern – most of the steel plants (and all of the large ones) are in Republican Districts – hardly a big surprise but there you go. I guess big tariffs will play well with hardcore Trump supporters.] Deleted 7/3: I jumped to conclusions too quickly, see comment below. I will do some more analysis and update soon.
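One way to test a claim like this, rather than judging it by eye from a zoomed-out map, is to assign each plant to the district that contains it and count plants by party. The sketch below is a toy version under loud assumptions: the "districts" are invented rectangles, the plant coordinates are made up, and each district is a single ring with no holes, which real district boundaries are not. A spatial join in QGIS or a geospatial library would be the practical route with the real data.

```python
from collections import Counter

# Hypothetical toy data: invented rectangular districts, x = lon, y = lat.
DISTRICTS = [
    {"id": "X-1", "party": "R", "ring": [(0, 0), (10, 0), (10, 10), (0, 10)]},
    {"id": "X-2", "party": "D", "ring": [(10, 0), (12, 0), (12, 2), (10, 2)]},
]

# Invented plant locations; the last one falls outside every district.
PLANTS = [(11, 1), (10.5, 0.5), (5, 5), (20, 20)]


def point_in_ring(x, y, ring):
    """Ray-casting test: count how many edges a ray going right from (x, y) crosses."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def party_for_point(x, y, districts):
    """Return the party of the first district containing the point, else None."""
    for district in districts:
        if point_in_ring(x, y, district["ring"]):
            return district["party"]
    return None


tally = Counter(party_for_point(x, y, DISTRICTS) or "unmatched" for x, y in PLANTS)
print(dict(tally))
```

In this toy set the small district holds more plants than the large one, even though the large one dominates the picture. That is the trap with a map viewed at small scale: area is not the same as count.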

The map is better viewed “full screen” (the diagonal arrows below the zoom buttons).




2 thoughts on “Mapping US Steel while Arsenal lose (again!)”

  • Jonathan

    > most of the steel plants (and all of the large ones) are in Republican Districts

    Are they? When I zoom in to many the plants appear to be in small, blue areas – Democratic; e.g. Chicago, Detroit, Cleveland, Pittsburgh. I attribute that to (an assumed) correlation of steel manufacturing and urban areas: urban => generally more Democratic; rural => more Republican.

    Perhaps you have generalized from the initial, small scale, appearance? If you made your input data (lat/lon of steel plant; district, party of district) your conclusion might be justified but I’m not getting that from the map.

    One might draw some conclusions about those very large capacity steel plants in Mexico and Canada; but that would have a different basis than political favoritism.

    • Steven Post author

      Mea Culpa 🙁

      In my rush to get a blog post out about making the map of US steel plants I ignored my own advice to mapmakers

      “If your map says what you thought the data would show you are probably reinforcing your own bias. Make a cup of coffee and think again”

      I will do a bit more analysis of the data and report back