Final Project

David Mendez

Data and Processing

The data I used was graciously collected and made available by DataSF, under the San Francisco government. A link to the original database can be found here.
DataSF licenses all of its data under the “Public Domain Dedication and License” (PDDL). A summary of the license can be found here, and the full license can be seen here.
The original .csv file I downloaded is 2.5 MB, with 13 columns and 11,461 rows. I will be using all of the columns except the "Incident number" and "PdId" columns.
For my initial processing, I filtered the data by date. The dataset includes all records from 2003 to May 2018, so I first filtered it down to the records dated after January 1, 2018. Next, I applied filters to keep only the 16 most common incident categories.
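The two filtering steps described above could be reproduced with a short script. The following is only a minimal sketch, not the exact procedure used for this project. It assumes the column names "Date", "Category", "IncidntNum" and "PdId", a month/day/year date format, and that the 16 most common categories are counted after the date filter; any of these may need adjusting to match the actual file. The function names filter_incidents and load_rows are made up for this illustration.

```python
import csv
from collections import Counter
from datetime import date, datetime


def load_rows(path):
    """Read a CSV file into a list of dicts, one per row (hypothetical helper)."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def filter_incidents(rows, cutoff=date(2018, 1, 1), top_n=16,
                     date_col="Date", category_col="Category",
                     date_format="%m/%d/%Y",
                     drop_cols=("IncidntNum", "PdId")):
    """Keep rows dated after `cutoff` that fall in the `top_n` most common categories.

    Column names and the date format are assumptions about the file layout.
    """
    # Step 1: keep only records dated strictly after the cutoff.
    recent = [
        r for r in rows
        if datetime.strptime(r[date_col], date_format).date() > cutoff
    ]

    # Step 2: count categories among the remaining records and pick the most common.
    counts = Counter(r[category_col] for r in recent)
    top_categories = {cat for cat, _ in counts.most_common(top_n)}

    # Step 3: keep rows in those categories and drop the unused columns.
    return [
        {k: v for k, v in r.items() if k not in drop_cols}
        for r in recent
        if r[category_col] in top_categories
    ]
```

Calling filter_incidents(load_rows("incidents.csv")) would return the filtered rows, where "incidents.csv" is a placeholder file name. Counting categories after the date filter is a design choice here. Counting over the full dataset first could give a different set of 16 categories.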