About the Data

Data and Processing

The data was acquired from Zillow Research on April 12, 2019. According to Zillow: "All data accessed and downloaded from this page is free for public use by consumers, media, analysts, academics etc., consistent with our published Terms of Use. Proper and clear attribution of all data to Zillow is required." Aggregated data on this page is made freely available by Zillow for non-commercial use. For my project, I will follow and analyze trends in California over the past 10-20 years. I chose to download two related datasets for my data exploration:


1. Home Listings and Sales
On the Zillow Research page, under Home Listings and Sales, I set 'Data Type' to "Monthly Home Sales (Number, Raw)" and 'Geography' to "State". The data ranges from March 2008 to February 2019. When I build my visualizations, I will only use data from California for the years 2013 through 2018, a 6-year span. The original dataset has 138 columns and 51 rows. My final dataset will contain 72 columns and 51 rows. To build my visualization, I calculated the percentage of homes sold per year: I summed the monthly values for a specific year for a given state and multiplied the sum by the number of months.
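As a rough sketch of the column selection and yearly summing described above, the Python below keeps only the months from 2013 through 2018 and adds up the monthly sales for each year. It assumes each state's row is a dictionary whose keys are month labels in "YYYY-MM" form and that identifier columns have already been set aside. The function names and sample numbers are hypothetical, and the final scaling step is left out.

```python
from collections import defaultdict


def select_years(row, first_year, last_year):
    """Keep only the monthly columns whose year is in [first_year, last_year].

    Assumes every key looks like "YYYY-MM" (an assumption about the file layout).
    """
    return {
        month: value
        for month, value in row.items()
        if first_year <= int(month[:4]) <= last_year
    }


def yearly_totals(row):
    """Sum the monthly values of one state into one total per year."""
    totals = defaultdict(float)
    for month, value in row.items():
        if value is not None:  # skip months with no reported value
            totals[int(month[:4])] += value
    return dict(totals)


# Hypothetical numbers for a single state, not real Zillow values.
sample_row = {"2012-12": 100, "2013-01": 10, "2013-02": 20, "2014-01": 30, "2019-01": 40}
kept = select_years(sample_row, 2013, 2018)
print(yearly_totals(kept))
```

Selecting 2013 through 2018 from a full set of monthly columns leaves 6 x 12 = 72 columns, which matches the size of the final dataset.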


2. Home Values
On the Zillow Research page, under Home Values, I set 'Data Type' to "Decreasing Values (%)" and 'Geography' to "State". The data ranges from February 1997 to March 2019. When I build my visualizations, I will only use data for all the states for the years 2013 through 2018, a 6-year span. The original dataset has 268 columns and 51 rows. My final dataset will contain 72 columns and 51 rows. To build my visualization, I calculated the percentage of homes decreasing in value for a given state in a specific year. To obtain this number, I added up the monthly decreasing-value percentages for the year and divided the sum by the number of months.
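The yearly figure described above is an average of the monthly percentages, which can be sketched in Python as follows. As an assumption, each state's row is a dictionary keyed by month labels in "YYYY-MM" form. The function name and sample values are hypothetical. The divisor is the number of months that actually have a value, which is 12 for a complete year.

```python
from collections import defaultdict


def yearly_mean_percent(row):
    """Average the monthly percentages of one state into one value per year."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for month, pct in row.items():
        if pct is None:  # skip months with no reported value
            continue
        year = int(month[:4])  # assumes keys look like "YYYY-MM"
        sums[year] += pct
        counts[year] += 1
    return {year: sums[year] / counts[year] for year in sums}


# Hypothetical percentages for a single state, not real Zillow values.
sample_row = {"2013-01": 10.0, "2013-02": 20.0, "2014-01": 40.0}
print(yearly_mean_percent(sample_row))
```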