SF Fire Department service calls: Data analysis for the year 2018

The Rad Grads

Data and Processing

About the Original Dataset

The original dataset is from the following source: SF Fire Department Calls for Service

We are using the dataset under the terms of the dataset's ODC PDDL license, which is "intended to allow you to freely share, modify, and use this work for any purpose and without any restrictions."

The dataset contains data about calls the San Francisco Fire Department has received; our analysis focuses on the year 2018. The full dataset has 34 columns and currently about 4.89M rows. Each row corresponds to a call the SF Fire Department has responded to.

Data Processing and Discussion

The original dataset was filtered to keep only data from the year 2018 (~300k rows). For Roger's first prototype, we applied a sum filter of 1700 to remove the less frequent call types. For his second prototype, he removed all Medical Incident call types.
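The filtering steps above can be sketched in pandas. This is a minimal illustration, not the team's actual script; the column names ("Call Date", "Call Type") and the interpretation of the sum filter as a minimum call-type count are assumptions based on the dataset's schema and the description above.

```python
import pandas as pd

# Toy stand-in for the full dataset; real column names may differ.
df = pd.DataFrame({
    "Call Date": ["01/15/2018", "03/02/2017", "07/09/2018", "11/30/2018"],
    "Call Type": ["Medical Incident", "Alarms",
                  "Structure Fire", "Medical Incident"],
})

# Keep only rows from 2018.
df["Call Date"] = pd.to_datetime(df["Call Date"], format="%m/%d/%Y")
df_2018 = df[df["Call Date"].dt.year == 2018]

# "Sum filter of 1700": drop call types with fewer than 1700 calls.
counts = df_2018["Call Type"].value_counts()
frequent = counts[counts >= 1700].index
df_filtered = df_2018[df_2018["Call Type"].isin(frequent)]

# Second prototype: remove all Medical Incident call types instead.
df_no_medical = df_2018[df_2018["Call Type"] != "Medical Incident"]
```

With the real ~300k-row 2018 subset, the 1700 threshold keeps only the common call types; on this toy frame every type falls below it.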

Divya's first prototype grouped all the data by neighborhood, with no filter beyond the year. For her second prototype, she filtered the data for each neighborhood to the call type group 'Potentially Life Threatening' using Excel. This required creating a new field with Tableau's "Create Calculated Field" option to compute the average travel time for an area, with the resultant value being the difference between the on-scene time and the dispatch time.
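The travel-time field and per-neighborhood average described above can be sketched in pandas. The timestamp column names ("Dispatch DtTm", "On Scene DtTm") are assumptions modeled on the dataset's fields, not the exact names Divya used in Tableau.

```python
import pandas as pd

# Toy records with assumed column names for the two timestamps.
df = pd.DataFrame({
    "Neighborhood": ["Mission", "Mission", "Sunset"],
    "Dispatch DtTm": pd.to_datetime(
        ["2018-01-01 10:00:00", "2018-01-01 11:00:00", "2018-01-02 09:30:00"]),
    "On Scene DtTm": pd.to_datetime(
        ["2018-01-01 10:06:00", "2018-01-01 11:10:00", "2018-01-02 09:35:00"]),
})

# Travel time = on-scene time minus dispatch time, in minutes.
df["Travel Min"] = (
    (df["On Scene DtTm"] - df["Dispatch DtTm"]).dt.total_seconds() / 60)

# Average travel time per neighborhood (the calculated field's role).
avg_travel = df.groupby("Neighborhood")["Travel Min"].mean()
```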

Tracy's first prototype applies no filter beyond the year and a sum filter of 1700 on call type. Her second prototype uses a random subset of the whole 2018 dataset (2500 entries) and adds two calculated columns, response time and hospital transport time. Response time was calculated as ABS(DATEDIFF('second',[Response DtTm],[On Scene DtTm]))/60, and hospital transport time as ABS(DATEDIFF('second',[Transport DtTm],[Hospital DtTm]))/60. She computed DATEDIFF in seconds rather than minutes (dividing by 60 afterward) in order to preserve the resolution of the line density. Click here to look at the script, and click here to see the resulting dataset that is being used.
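The two Tableau calculated fields and the random subset can be mirrored in pandas. This is a sketch, not Tracy's script: the column names follow the Tableau expressions quoted above, and the toy sample size stands in for the real 2500 entries.

```python
import pandas as pd

# Toy records using the column names from the Tableau expressions.
df = pd.DataFrame({
    "Response DtTm":  pd.to_datetime(
        ["2018-05-01 12:00:00", "2018-05-01 13:00:00"]),
    "On Scene DtTm":  pd.to_datetime(
        ["2018-05-01 12:04:30", "2018-05-01 13:09:00"]),
    "Transport DtTm": pd.to_datetime(
        ["2018-05-01 12:20:00", "2018-05-01 13:30:00"]),
    "Hospital DtTm":  pd.to_datetime(
        ["2018-05-01 12:35:00", "2018-05-01 13:48:00"]),
})

# ABS(DATEDIFF('second', [Response DtTm], [On Scene DtTm])) / 60
df["Response Min"] = (
    (df["On Scene DtTm"] - df["Response DtTm"]).dt.total_seconds().abs() / 60)

# ABS(DATEDIFF('second', [Transport DtTm], [Hospital DtTm])) / 60
df["Transport Min"] = (
    (df["Hospital DtTm"] - df["Transport DtTm"]).dt.total_seconds().abs() / 60)

# Random subset (n=1 here; 2500 in the actual prototype).
sample = df.sample(n=1, random_state=0)
```

Working in seconds before dividing by 60 keeps sub-minute differences (e.g. 4.5 minutes) that a minute-granularity DATEDIFF would round away.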