Attribution
The dataset was made by downloading each individual zip file for 2010-2018 from the
United States Environmental Protection Agency's website and then combining each year's information into one spreadsheet.
The files can be found by scrolling down to "Annual Summary Data" and clicking on each link under "AQI by County".
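The combining step could be scripted instead of done by hand. The following is a minimal sketch, assuming each downloaded archive holds one or more CSV files that share the same header row. The function names `read_zipped_csv` and `combine_years` are hypothetical and are not part of this project.

```python
import csv
import io
import zipfile


def read_zipped_csv(zip_source):
    """Return every row of every CSV file inside one zip archive as a dict.

    zip_source may be a file path or a binary file-like object.
    """
    rows = []
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip anything in the archive that is not a CSV
            with archive.open(name) as raw:
                # The archive yields bytes; wrap them so csv can read text.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                rows.extend(csv.DictReader(text))
    return rows


def combine_years(zip_sources):
    """Concatenate the rows of several yearly archives into one list."""
    combined = []
    for source in zip_sources:
        combined.extend(read_zipped_csv(source))
    return combined
```

Calling `combine_years` with the nine downloaded archives would give one list of rows, which can then be written out as a single spreadsheet.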
Licensing
Data and Licensing Information for the EPA can be found
here. The data for this project is in the U.S.
public domain.
Size
The final combined dataset is 50 KB in size, with 951 rows and 13 columns, as of May 2019.
Content
State: The state where the county is located.
County: The location of the monitoring site.
ID: The FIPS county code.
Year: The median Air Quality Index (AQI) of the county for that specific year.
Change: The county's AQI in 2018 minus the county's AQI in 2010.
Important note: The second visualization was made by keeping each year's data in separate files, but the contents used are the same as listed above.
There are a few more columns included in the data; however, those columns were not used in the visualizations.
The extra columns record the number of "Good", "Moderate", "Unhealthy", and "Hazardous" days the county had that year.
Initial processing
Once the data for each year was combined, any counties that did not appear in all 9 years were removed.
The "Change" column was created manually in Excel by subtracting each county's 2010 entry from its 2018 entry and storing the result in a separate column.
The FIPS codes were taken from the United States Department of Agriculture's
County FIPS Codes
and matched to the respective county in our dataset.
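The filtering and "Change" steps described above were done by hand in Excel. As an illustration only, the same logic can be sketched in Python. This assumes the combined rows carry columns named `State`, `County`, `Year`, and `Median AQI`; those column names and the function name `build_change_table` are assumptions, not taken from the project files.

```python
from collections import defaultdict

YEARS = range(2010, 2019)  # 2010 through 2018 inclusive, 9 years


def build_change_table(rows):
    """Keep counties present in every year and add a Change value.

    rows: iterable of dicts with State, County, Year, Median AQI
    (assumed column names).
    """
    # Group the median AQI values by county, keyed by year.
    by_county = defaultdict(dict)
    for row in rows:
        key = (row["State"], row["County"])
        by_county[key][int(row["Year"])] = float(row["Median AQI"])

    table = []
    for (state, county), aqi in sorted(by_county.items()):
        # Drop any county that is missing one or more of the 9 years.
        if not all(year in aqi for year in YEARS):
            continue
        record = {"State": state, "County": county}
        record.update({str(year): aqi[year] for year in YEARS})
        # Change is the 2018 value minus the 2010 value.
        record["Change"] = aqi[2018] - aqi[2010]
        table.append(record)
    return table
```

With this definition, a negative "Change" means the county's median AQI was lower in 2018 than in 2010.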