The Epidemic of Unemployment

Sameer Isaq

Data and Processing:

20 Years of Unemployment

Terms:
state - the name of the state
id - the federal state code used to match each state to the map data
rate - a state's respective unemployment rate
national - the national unemployment rate for a given year
population - the population of a state (2010-2018)

For my primary visualization, all of the unemployment data I aggregated comes from the United States Department of Labor's Bureau of Labor Statistics and can be found here. To let users sift through 20 years of data, I started at January 1999 and took the state unemployment data from January of each year.

After a bit of aggregation, the simplest iteration of the CSV data looked like so:

          state,rate
          Iowa,2.5
          New Hampshire,2.7
          North Dakota,3.4
          ...
        
This was an easy start to the project. Next, I needed geographical data to map the unemployment rates onto. Through Mike Bostock's visualization I found a link to a topological dataset of the United States. The link to this dataset can be found here.

The difficulty I found when attempting to visualize the topological data was that the file identified each state by a state code rather than by name. Therefore, to correctly map data to a state, I needed to add a column containing Federal State Codes to the CSVs. Using this reference, I added the column. My CSVs then looked like so:

          state,id,rate
          Iowa,19,2.5
          New Hampshire,33,2.7
          North Dakota,38,3.4
          ...
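
The id step described above can be sketched in a few lines of Python. This is only an illustration, not the process actually used for the project: the `STATE_CODES` lookup and the `add_state_ids` helper are hypothetical names, and the lookup contains only the three states shown in the sample rather than a full table.

```python
import csv
import io

# Hypothetical excerpt of a name-to-code lookup, taken from the sample rows;
# a real table would need an entry for every state in the file.
STATE_CODES = {"Iowa": "19", "New Hampshire": "33", "North Dakota": "38"}


def add_state_ids(csv_text, codes=STATE_CODES):
    """Return CSV text with an `id` column inserted between `state` and `rate`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["state", "id", "rate"], lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # A KeyError here flags a state name that is missing from the lookup.
        writer.writerow(
            {"state": row["state"], "id": codes[row["state"]], "rate": row["rate"]}
        )
    return out.getvalue()


sample = "state,rate\nIowa,2.5\nNew Hampshire,2.7\nNorth Dakota,3.4\n"
result = add_state_ids(sample)
print(result)
```

Letting an unknown state name raise an error, rather than writing a blank id, makes it obvious when a row would otherwise fail to match anything on the map.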
        
Following this, I wanted users to see an individual state's unemployment rate and also compare it to the national unemployment rate for a given year. This meant adding another column. Thankfully, every unemployment file I was using listed the national unemployment rate as its first piece of data, so I simply copied that value into each row of the CSV.
My CSVs then looked like so:

          state,id,rate,national
          Iowa,19,2.5,4.3
          New Hampshire,33,2.7,4.3
          North Dakota,38,3.4,4.3
          ...
        
This let me map data to states and let users compare each state's rate with the national rate.
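
As a rough sketch of this step, the snippet below copies one national rate into every row and then computes how far a state sits from it. The helper names `add_national_rate` and `gap_to_national` are assumptions made for illustration; the project itself may have done this by hand or in another tool.

```python
import csv
import io


def add_national_rate(csv_text, national_rate):
    """Append a `national` column holding the same value in every row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames) + ["national"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row["national"] = national_rate
        writer.writerow(row)
    return out.getvalue()


def gap_to_national(row):
    """Percentage-point difference between a state's rate and the national rate."""
    return round(float(row["rate"]) - float(row["national"]), 1)


sample = "state,id,rate\nIowa,19,2.5\nNew Hampshire,33,2.7\nNorth Dakota,38,3.4\n"
with_national = add_national_rate(sample, "4.3")
rows = list(csv.DictReader(io.StringIO(with_national)))
gaps = {row["state"]: gap_to_national(row) for row in rows}
print(with_national)
print(gaps)
```

Repeating the national rate in every row is redundant, but it keeps each row self-contained, so a tooltip for a single state can show both numbers without a second lookup.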

Lastly, I wanted to bring attention to my supplementary visualization, which addresses people, so I decided to include population. Not only does this give users a feel for the severity of unemployment within populous states, but it also provides a segue into the latter half of my project.

However, the data file I found from the United States Census Bureau only provided data for 2010-2018, so I wasn't able to include population statistics for every year the slider accounts for. After this final step of aggregation, my CSV files looked like so:

          state,id,rate,national,population
          Iowa,19,2.9,4.1,"3,156,145"
          New Hampshire,33,2.6,4.1,"1,356,458"
          North Dakota,38,2.6,4.1,"760,077"
          ...
        
Each CSV file is ~52 lines of data aggregated from several different sources in order to create my visualization.
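
Because the population values are quoted strings with thousands separators, they need a little care when read back in. The sketch below shows one way to load a file in the final format; `load_rows` is a hypothetical helper, and treating a missing or empty population as `None` is an assumption about how the pre-2010 files might be handled, not something the files are known to do.

```python
import csv
import io


def load_rows(csv_text):
    """Parse one year's CSV into dicts with numeric rate, national, and population."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: years without census data have no population column
        # or an empty cell; both cases become None.
        raw_pop = (row.get("population") or "").strip()
        rows.append(
            {
                "state": row["state"],
                "id": row["id"],
                "rate": float(row["rate"]),
                "national": float(row["national"]),
                # "3,156,145" -> 3156145
                "population": int(raw_pop.replace(",", "")) if raw_pop else None,
            }
        )
    return rows


recent = (
    "state,id,rate,national,population\n"
    'Iowa,19,2.9,4.1,"3,156,145"\n'
    'North Dakota,38,2.6,4.1,"760,077"\n'
)
older = "state,id,rate,national\nIowa,19,2.5,4.3\n"

recent_rows = load_rows(recent)
older_rows = load_rows(older)
print(recent_rows[0]["population"], older_rows[0]["population"])
```

The `csv` module handles the quoting, so the comma inside `"3,156,145"` is not mistaken for a column separator; only the thousands separators need to be stripped before converting to an integer.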