The data I used was graciously collected and made available by DataSF, under the San Francisco government. A link to the original database can be found
here.
DataSF licenses all of its data under the “Public Domain Dedication and License” (PDDL). A summary of the license can be
found here
and the full license can be seen here.
The original .csv file I downloaded is 2.5 MB, with 13 columns and 11,461 rows. I will be using all of the columns except
"Incident number" and "PdId".
For my initial processing, I filtered the data by date. The dataset includes all records from 2003 to May 2018, so
I first narrowed it down to records dated after January 1, 2018. I then applied multiple filters to keep only the
16 most common incident categories.
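
As an illustration, the two filtering steps described above could be reproduced in Python roughly as follows. This is a sketch under assumptions and not necessarily how the filtering was actually done: the function name `filter_incidents`, the column names "Date" and "Category", and the MM/DD/YYYY date format are all hypothetical, and the sketch assumes the categories are ranked after the date filter has been applied.

```python
from collections import Counter
from datetime import datetime


def filter_incidents(rows, cutoff=datetime(2018, 1, 1), top_n=16,
                     date_key="Date", category_key="Category",
                     date_format="%m/%d/%Y"):
    """Keep rows dated after `cutoff` whose category is among the `top_n` most common.

    The key names and date format are assumptions; adjust them to the real file.
    """
    # Step 1: keep only records dated strictly after the cutoff.
    recent = [
        row for row in rows
        if datetime.strptime(row[date_key], date_format) > cutoff
    ]
    # Step 2: count categories in the date-filtered rows and keep the top N.
    counts = Counter(row[category_key] for row in recent)
    top_categories = {name for name, _ in counts.most_common(top_n)}
    return [row for row in recent if row[category_key] in top_categories]


# Made-up rows, only to show the call; these are not real DataSF records.
sample = [
    {"Date": "12/31/2017", "Category": "ASSAULT"},
    {"Date": "01/02/2018", "Category": "LARCENY/THEFT"},
    {"Date": "01/03/2018", "Category": "LARCENY/THEFT"},
    {"Date": "02/10/2018", "Category": "ASSAULT"},
]
filtered = filter_incidents(sample, top_n=1)
```

Rows read with `csv.DictReader` from the standard library have the same dictionary shape as `sample`, so they could be passed to the function directly once the key names match the file's header.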