Final Project

Tracy Cheng

Data and Processing

About the Original Dataset

The original dataset is from the following source: Police Department Incident Reports: Historical 2003 to May 2018

The GeoJSON dataset is from the same site: Geojson data from data.sfgov.org

I am using both datasets under the terms of the Open Data Commons Public Domain Dedication and License (ODC PDDL), which is "intended to allow you to freely share, modify, and use this work for any purpose and without any restrictions."

The dataset contains the incident reports that the San Francisco Police Department (SFPD) uploaded from 2003 until May 2018. It has 13 columns, 2.21 million rows, and a size of 28.2 MB. Each row corresponds to one incident report filed by the SFPD. The columns are as follows: Incident Number, Category, Description, Day of the Week (Monday, Tuesday, etc.), Date (MM/DD/YYYY), Time (00:00 - 23:59), Police District, Resolution (Arrested or None), Address, X-coordinate (Longitude), Y-coordinate (Latitude), Location, Police District ID.
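
As a rough illustration of how the Date and Time columns described above could be combined into a single timestamp, here is a minimal Python sketch. The function name `parse_incident_timestamp` is hypothetical, and the formats are assumed from the column descriptions (MM/DD/YYYY and 24-hour HH:MM); the actual export may differ.

```python
from datetime import datetime


def parse_incident_timestamp(date_str, time_str):
    """Combine a MM/DD/YYYY date and a 24-hour HH:MM time into one datetime.

    Assumes the formats given in the column descriptions; raises ValueError
    if either string does not match.
    """
    return datetime.strptime(f"{date_str} {time_str}", "%m/%d/%Y %H:%M")


# Made-up example values, not taken from the real dataset.
timestamp = parse_incident_timestamp("05/15/2018", "23:59")
print(timestamp.isoformat())  # 2018-05-15T23:59:00
```

Keeping the date and time together in one value makes it easier to sort incidents or group them by hour later on.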

Initial Data Processing

Because the full dataset is so large, I decided to download only the data for 2017, which is the most recent complete year of data.
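
For anyone starting from the full export instead, the same restriction to 2017 could be applied locally. The following is a minimal sketch, not the exact procedure I used: the helper `filter_year`, the header names (`Incident Number`, `Category`, `Date`), and the inline sample rows are all assumptions made for illustration.

```python
import csv
import io
from datetime import datetime


def filter_year(rows, year, date_field="Date"):
    """Yield only the rows whose MM/DD/YYYY date falls in the given year."""
    for row in rows:
        if datetime.strptime(row[date_field], "%m/%d/%Y").year == year:
            yield row


# Tiny inline sample standing in for the real CSV export (values are made up).
sample_csv = io.StringIO(
    "Incident Number,Category,Date\n"
    "1,EXAMPLE A,12/31/2016\n"
    "2,EXAMPLE B,01/01/2017\n"
    "3,EXAMPLE C,12/31/2017\n"
    "4,EXAMPLE D,01/01/2018\n"
)

rows_2017 = list(filter_year(csv.DictReader(sample_csv), 2017))
print([row["Incident Number"] for row in rows_2017])  # ['2', '3']
```

Because `filter_year` is a generator, it can also be pointed at a `csv.DictReader` over the full file and will process the rows one at a time without loading everything into memory.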