The data set (4,897,527 rows and 34 columns) is the SF Fire Department Calls for Service database, which records fire unit responses to calls. Each row represents a call and includes fields such as location, call type, priority, and response time, among many others.
We derived our data by analyzing the source data and calculating average values for each zipcode. We first filtered the data by year, then collected all the zipcodes present in that year and calculated the following values:
For visualization 1, we calculated the average response time as the difference, in seconds, between Received DtTm and On Scene DtTm. We then calculated the distance from downtown in miles using the geo coordinates, with the Financial District (lat: 37.7946, lon: -122.3999) as the downtown reference point.
Example Visualization 1 entry for 2010:
{ zipcode: '94102', averageResponseTime: 713.8682680800136, totalResponseTime: 16701662, incidentCount: 23396, neighborhood: 'Tenderloin', distanceFromDowntown: 0.966178923121625 }
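The two per-call calculations for visualization 1 can be sketched roughly as follows. This is a minimal illustration, not our exact code: the function names are hypothetical, the timestamps are assumed to be ISO 8601 strings that `Date` can parse, and the haversine great-circle formula is one common way to get a distance in miles from coordinates.

```javascript
// Downtown reference point (Financial District), as given in the text.
const DOWNTOWN = { lat: 37.7946, lon: -122.3999 };
// Mean Earth radius in miles (approximate).
const EARTH_RADIUS_MILES = 3958.8;

// Seconds between the Received DtTm and On Scene DtTm values.
// Assumption: both are ISO 8601 strings; returns null if either is unparseable.
function responseTimeSeconds(receivedDtTm, onSceneDtTm) {
  const received = new Date(receivedDtTm).getTime();
  const onScene = new Date(onSceneDtTm).getTime();
  if (Number.isNaN(received) || Number.isNaN(onScene)) return null;
  return (onScene - received) / 1000;
}

function toRadians(degrees) {
  return (degrees * Math.PI) / 180;
}

// Great-circle (haversine) distance in miles from the downtown reference point.
// Assumption: haversine is used here for illustration only.
function distanceFromDowntownMiles(lat, lon) {
  const dLat = toRadians(lat - DOWNTOWN.lat);
  const dLon = toRadians(lon - DOWNTOWN.lon);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(DOWNTOWN.lat)) *
      Math.cos(toRadians(lat)) *
      Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_MILES * Math.asin(Math.sqrt(a));
}
```

For example, `responseTimeSeconds('2010-01-01T12:00:00Z', '2010-01-01T12:05:30Z')` gives 330, and the distance from the reference point to itself is 0.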
For visualization 2, we gathered the types of incidents reported, starting from 2014, because the Call Type Group column was not collected before 2014.
Example Visualization 2 entry for 2015:
{ zipcode: '94103', callCount: 37584, neighborhood: 'South of Market', Alarm: 6908, 'Potentially Life-Threatening': 20764, 'Non Life-threatening': 9155, Fire: 757 }
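The per-zipcode tally behind an entry like this can be sketched as below. It is a rough illustration under assumptions: the function name is hypothetical, each row is assumed to be an already-parsed object with a `zipcode` field and a `Call Type Group` field, and the rows are assumed to be pre-filtered to a single year. The neighborhood lookup is left out. Rows without a Call Type Group are skipped, so `callCount` equals the sum of the group counts, as it does in the 2015 example (6908 + 20764 + 9155 + 757 = 37584).

```javascript
// Count calls per Call Type Group for each zipcode.
// Assumption: rows look like { zipcode: '94103', 'Call Type Group': 'Alarm' }.
function countCallTypeGroups(rows) {
  const byZipcode = new Map(); // zipcode -> output entry
  for (const row of rows) {
    const group = row['Call Type Group'];
    // Skip rows missing a zipcode or a Call Type Group value.
    if (!row.zipcode || !group) continue;

    let entry = byZipcode.get(row.zipcode);
    if (!entry) {
      entry = { zipcode: row.zipcode, callCount: 0 };
      byZipcode.set(row.zipcode, entry);
    }
    // One counter per group name, created on first use.
    entry[group] = (entry[group] || 0) + 1;
    entry.callCount += 1;
  }
  return Array.from(byZipcode.values());
}
```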
We produced this data by running a separate Node.js program, which let us process the large data set and write JSON output files for use with D3.
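The year filter and per-zipcode averaging for visualization 1 can be sketched as follows. This is only an outline under assumptions: `summarizeByZipcode` and the row fields `zipcode`, `receivedDtTm`, and `onSceneDtTm` are hypothetical names, timestamps are assumed to be ISO 8601 strings, and reading the source file is left out. The average is the total response time divided by the incident count, which matches the 2010 example entry (16701662 / 23396 ≈ 713.87).

```javascript
// Aggregate response times per zipcode for one year.
// Assumption: rows are parsed objects with hypothetical fields
// `zipcode`, `receivedDtTm`, and `onSceneDtTm` (ISO 8601 strings).
function summarizeByZipcode(rows, year) {
  const totals = new Map(); // zipcode -> { totalResponseTime, incidentCount }
  for (const row of rows) {
    // Skip rows with missing fields before parsing.
    if (!row.zipcode || !row.receivedDtTm || !row.onSceneDtTm) continue;
    const received = new Date(row.receivedDtTm);
    const onScene = new Date(row.onSceneDtTm);
    if (Number.isNaN(received.getTime()) || Number.isNaN(onScene.getTime())) continue;
    // Filter by the year the call was received (UTC, for simplicity).
    if (received.getUTCFullYear() !== year) continue;

    const entry = totals.get(row.zipcode) || { totalResponseTime: 0, incidentCount: 0 };
    entry.totalResponseTime += (onScene.getTime() - received.getTime()) / 1000;
    entry.incidentCount += 1;
    totals.set(row.zipcode, entry);
  }
  // One output entry per zipcode, shaped like the visualization 1 example.
  return Array.from(totals, ([zipcode, t]) => ({
    zipcode,
    averageResponseTime: t.totalResponseTime / t.incidentCount,
    totalResponseTime: t.totalResponseTime,
    incidentCount: t.incidentCount,
  }));
}

// Serialize the entries for D3 to load.
function toJson(entries) {
  return JSON.stringify(entries, null, 2);
}
```

In a Node.js script, the string returned by `toJson` could then be written to disk with `fs.writeFileSync` and loaded in the browser with D3.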