A Look at The Olympics Throughout The Years

Tony Jimenez

Project Theme

My project is about the Olympics throughout the years. The narrative I am trying to show is how the Olympics have changed over time and what exactly is different. Overall, I want to look at how the Olympics have changed through history and how each country is unique and has its own strengths.

I will also be looking at the medals earned by each country.

Visualizations


Data encodings

Since there are three different visualizations embedded here, I will talk about each one. The first visualization is the choropleth map. The color of each country encodes the best medal its team earned in the selected year. To get this, I processed the data by country and year: for each country, I looked at all of its athletes in a specific year and returned the highest medal any of them earned. This visualization tells you the best medal earned by each country in a specific year.
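The "best medal per country and year" step can be sketched roughly as follows. This is only an illustration, not the project's actual code: the function name, the ranking table, and the "Team|Year" key format are assumptions, and rows are assumed to be objects with the Team, Year and Medal columns from the data set.

```javascript
// Rank medals so they can be compared; "NA" (no medal) ranks lowest.
const MEDAL_RANK = new Map([
  ["NA", 0],
  ["Bronze", 1],
  ["Silver", 2],
  ["Gold", 3],
]);

// Hypothetical helper: returns a Map keyed by "Team|Year" whose value is
// the best medal found among all records for that team in that year.
function bestMedalByTeamAndYear(rows) {
  const best = new Map();
  for (const row of rows) {
    const key = `${row.Team}|${row.Year}`;
    // Treat anything that is not a known medal (null, undefined, "") as "NA".
    const medal = MEDAL_RANK.has(row.Medal) ? row.Medal : "NA";
    const current = best.get(key);
    if (current === undefined || MEDAL_RANK.get(medal) > MEDAL_RANK.get(current)) {
      best.set(key, medal);
    }
  }
  return best;
}
```

A country that competed but won nothing ends up with "NA", which can then be drawn with the "no medal" color on the map.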

The second visualization is a bar chart. The x axis encodes the year and the y axis encodes the number of records. The data is filtered to show only the gold medal counts for the years 1996-2016. I looked at each country and then each athlete for specific years and returned the number of gold medals.
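As a rough sketch of the counting behind the bar chart, assuming rows are objects with Team, Year and Medal fields (the function name and default year range are illustrative, not taken from the project code):

```javascript
// Hypothetical helper: count Gold-medal records per year for one team,
// limited to an inclusive year range.
// Note: this counts rows, so a team event adds one record per athlete.
function goldCountsByYear(rows, team, firstYear = 1996, lastYear = 2016) {
  const counts = new Map();
  for (const row of rows) {
    const year = Number(row.Year);
    if (row.Team !== team || row.Medal !== "Gold") continue;
    if (year < firstYear || year > lastYear) continue;
    counts.set(year, (counts.get(year) || 0) + 1);
  }
  // Sort by year so the result can feed the x axis directly.
  return [...counts.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([year, count]) => ({ year, count }));
}
```

The returned array of `{ year, count }` objects has one entry per bar.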

The third visualization is a pie chart. Each slice encodes the number of records for one type of medal. It does not work properly yet and only shows the data for the U.S. right now. It also covers only the years 1996-2016 and is built in much the same way as the bar chart, except that it returns the number of each medal.
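One possible way to tally the slices for any country, rather than a single hard-coded one, is to pass the team name in as a parameter. This is a sketch under assumptions (function name, field names Team, Year and Medal, and default year range), not a diagnosis of the existing pie chart code:

```javascript
// Hypothetical helper: count records of each medal type for one team
// within an inclusive year range. Non-medal records are skipped.
function medalTally(rows, team, firstYear = 1996, lastYear = 2016) {
  const tally = { Gold: 0, Silver: 0, Bronze: 0 };
  for (const row of rows) {
    const year = Number(row.Year);
    if (row.Team !== team || year < firstYear || year > lastYear) continue;
    if (Object.prototype.hasOwnProperty.call(tally, row.Medal)) {
      tally[row.Medal] += 1;
    }
  }
  return tally;
}
```

Calling `medalTally(rows, clickedTeam)` with whichever country was clicked would give the three slice sizes for that country.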

Interaction

You can hover over each country to see its outline and name. You can also use the slider at the bottom of the page to change the year you are looking at. If you click on a country, it will show you some visualizations for it. I implemented a text area for each country that shows some stats for its entire Olympic history, to give perspective to the other data. I also created a bar chart for the 10 countries with the most data. These countries are the U.S., Great Britain, France, Germany, Italy, China, Australia, Sweden, Russia, and Canada. A pie chart is also created, and you can hover over it to get an outline of the slice and to see the exact number of records.

Findings

My biggest finding was the difference in medals and countries between the Winter and Summer Olympics. In the bar chart, you can see dips every 2 years, which is because fewer athletes participate in the Winter Olympics. When you change the year with the slider, you can also see that many more countries have no medals in Winter years. This happens because some countries do not send athletes to the Winter Olympics.
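A comparison like this could be checked with a small summary per Games. The sketch below is illustrative only: the function name is made up, and it assumes rows are objects with the ID, Team, Year, Season and Medal columns.

```javascript
// Hypothetical helper: for each Games (year + season), count distinct
// athletes and distinct teams that won at least one medal.
function summarizeGames(rows) {
  const byGames = new Map();
  for (const row of rows) {
    const key = `${row.Year} ${row.Season}`;
    if (!byGames.has(key)) {
      byGames.set(key, { athletes: new Set(), medalTeams: new Set() });
    }
    const entry = byGames.get(key);
    entry.athletes.add(row.ID);
    if (row.Medal === "Gold" || row.Medal === "Silver" || row.Medal === "Bronze") {
      entry.medalTeams.add(row.Team);
    }
  }
  // Four-digit years sort correctly as strings, e.g. "2014 Winter" < "2016 Summer".
  return [...byGames.entries()]
    .map(([games, e]) => ({
      games,
      athletes: e.athletes.size,
      medalTeams: e.medalTeams.size,
    }))
    .sort((a, b) => a.games.localeCompare(b.games));
}
```

Listing the results side by side makes it easy to see whether Winter Games have fewer athletes and fewer medal-winning teams than the neighboring Summer Games.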

Feedback

I have included the original prototypes and the original goals I had to give perspective on my original ideas.

Prototype 1: Static Image

Goals

The goal for this prototype is to implement it in D3. Right now I am using Datawrapper to create the prototype, but it is a bit restrictive in some areas, and implementing it in D3 will make it a lot better. I want people to see the countries that have Olympic medals. The overall narrative that I am trying to present at this point is the difference between countries throughout the Olympics. I will be going through different measures like the number of medals earned, the number of people sent every year, and even age. I will try to connect all of this with the narrative of countries.

Planned Interactivity

This prototype already has some interactivity like zooming/panning, tooltips, and hovering. I will try to add all of this to my visualization and expand on it quite a bit. The tooltips right now only show the name of the country, but I want to add how many medals the country has, the medal count of each tier (Gold, Silver, Bronze), and the win percentage. I think this is a good amount of interactivity. I might add filtering of countries based on medal tiers, but I'm not sure it adds much.

Data encodings

This is a choropleth map of the world, where the color and category of each country encode the highest medal that country earned from 1960-2018. A country receives "None" if it has not won a medal during this period.

Prototype 2: Beta Release

Prototype 2: Beta Release (Bigger)

Goals

The goal for this prototype is to implement it in D3. This one is pretty simple, and getting interactivity working should not be too difficult. I created the prototype with Tableau, but the legend did not come out very nicely and it was a bit difficult to get certain age groups to come together. It also just doesn't look good, and I would like to change the legend in D3. I want people to see the age differences in certain sports. I will try to show which ages are more common and how often they show up in each sport.

Planned Interactivity

The planned interactivity for this is hovering to get details on demand, such as the number of records in that stacked bar. I would also like the ability to sort by color and other aspects. Filtering out some bars and showing only a specific color would be good too.

Data encodings

This is a stacked bar chart that plots the number of records for each sport. The color represents a specific age range. The data is from 1960-2018. I decided to put everyone aged 40 and over into a single 40+ group, as there are not very many athletes in that range, which you can tell from the graph.
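The grouping behind a chart like this can be sketched as below. Only the "40+" bucket comes from the description above; the 10-year bin width, the labels, the "Unknown" bucket, and the function names are assumptions for illustration.

```javascript
// Hypothetical helper: map an age to an age-range label.
// Bin width of 10 is an assumption; only "40+" comes from the write-up.
function ageGroup(age) {
  if (age === null || age === undefined || age === "") return "Unknown";
  const n = Number(age);
  if (Number.isNaN(n)) return "Unknown"; // e.g. the string "NA"
  if (n >= 40) return "40+";
  const lower = Math.floor(n / 10) * 10;
  return `${lower}-${lower + 9}`;
}

// Hypothetical helper: count records per sport and age group.
// Returns Map(sport -> Map(ageGroup -> count)), one inner map per stacked bar.
function countBySportAndAgeGroup(rows) {
  const counts = new Map();
  for (const row of rows) {
    const group = ageGroup(row.Age);
    if (!counts.has(row.Sport)) counts.set(row.Sport, new Map());
    const perSport = counts.get(row.Sport);
    perSport.set(group, (perSport.get(group) || 0) + 1);
  }
  return counts;
}
```

Each inner map holds the segment sizes for one sport's bar, keyed by age range.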

Feedback Received

The most important piece of feedback I received for the alpha was that the lie factor was large (I scored low on this in the feedback exercise) and that I was not processing my data correctly. In the picture, you can see that some countries are classified as not having any medals, but when you search for those countries in the data, they do indeed have medals. This was because the country names in the data did not match the names used by the map. I went in and fixed a lot of that so that every country has the right name and the data gets bound correctly. So, the lie factor seems to be gone now. I was also told that filtering by year could be good, so I implemented that. The feedback for the beta prototype was that I should be able to filter bars and fix the issues with size.
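A name mismatch like this can be caught before binding by listing which data-set teams have no matching map feature. The sketch below is only illustrative: the alias entries are hypothetical examples, and the function names are not from the project.

```javascript
// Hypothetical alias table: data-set team name -> name used by the map file.
// These two entries are examples only, not the project's actual fixes.
const NAME_ALIASES = new Map([
  ["United States", "United States of America"],
  ["Great Britain", "United Kingdom"],
]);

// Compare names without regard to case or surrounding whitespace.
function normalizeName(name) {
  return name.trim().toLowerCase();
}

// Returns the sorted list of team names that still have no map feature
// after applying the alias table.
function findUnmatchedTeams(teamNames, mapNames, aliases = NAME_ALIASES) {
  const known = new Set(mapNames.map(normalizeName));
  const unmatched = new Set();
  for (const team of teamNames) {
    const mapped = aliases.get(team) || team;
    if (!known.has(normalizeName(mapped))) unmatched.add(team);
  }
  return [...unmatched].sort();
}
```

An empty result means every team in the data can be bound to a country on the map; anything left over needs a new alias or a decision about how to display it.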

Feedback changes

First off, the beta prototype was not implemented. I decided on this because after the feedback exercise, I thought that it would be too awkward to have my entire project focus on medals earned, but then have a visualization that looks at the ages of the athletes.

I did implement the changes for my alpha, however. I made sure the countries were being bound correctly and I added filtering. I also added visualizations that pop up when you interact with the map. I did this because of the professor's advice and feedback.

About Me

Tony Jimenez
Hello, my name is Tony Jimenez. I am a Junior in my 6th semester here at the University of San Francisco. I am majoring in Computer Science and expect to graduate in the Spring of 2020. Computers and Video Games are some of my favorite things so one of my goals is to work in that field, but realistically, working in any field would be amazing as I am already doing something I love.
GitHub
Contact Info:
• Email: tjimenez3@dons.usfca.edu

Why this data set?

I chose this data set because it seemed really interesting and it is something that everyone knows and has heard of in some capacity. I wanted to do something more personal to me, like basketball, but I felt that it would not connect with a lot of people and that it would be better to do something that I don't know much about but would be really interesting to learn more about. Plus, basketball is a sport in the Olympics, so even if the project is not fully about something I am truly passionate about, I am really interested in it and it still connects to me in some way.

Data Set

This data set is attributed to Kaggle user rgriffin. The original link for the data set can be found here

The original data set has 271,219 rows and 15 columns. The raw file size is 40,529 KB. The data covers the Olympics and every person who has competed in an event. It has various information on the athletes such as Age, Height, Team, Event, Sport and more.

License

On the data set page linked above, you can click on "CC0: Public Domain". This redirects you to the license page, which says that this data set has "No Copyright" and that you can copy, modify, distribute and perform the work without asking for permission.

Data Content

The description of each column is pretty straightforward:

  • ID: Unique ID given to each athlete. An athlete keeps the same ID across every Olympics
    Range is from 1 to 135,571
  • Name: Name of Athlete
  • Sex: Sex of Athlete
  • Age: Age of Athlete
    Range is from 1 to 97
  • Height: Height of Athlete measured in centimeters
    Range is from 127cm to 226cm
  • Weight: Weight of Athlete measured in kilograms
    Range is from 25kg to 214kg
  • Team: Team Name, usually the Country the athlete is competing for
  • NOC: National Olympic Committee 3-letter code
  • Games: Year and Season in String format
  • Year: Integer Year of competition
  • Season: Summer or Winter
  • Host City: City that is Hosting the Olympics
  • Sport: Sport Athlete is competing in
  • Event: Event Athlete is competing in
  • Medal: Medal Earned
    Either Gold, Silver, Bronze or NA
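When the file is loaded, every value arrives as a string, so the numeric columns and the "NA" placeholder need converting before they can be compared or counted. A minimal sketch, assuming each row is an object keyed by the column names listed above (the function name and the choice of `null` for missing values are illustrative):

```javascript
// Columns that hold numbers according to the list above.
const NUMERIC_COLUMNS = ["ID", "Age", "Height", "Weight", "Year"];

// Hypothetical helper: raw is one CSV row as an object of strings.
// Returns a copy with numeric columns parsed and "NA" or "" turned into null.
function parseRow(raw) {
  const row = {};
  for (const [key, value] of Object.entries(raw)) {
    row[key] = value === "NA" || value === "" ? null : value;
  }
  for (const column of NUMERIC_COLUMNS) {
    if (row[column] !== null && row[column] !== undefined) {
      row[column] = Number(row[column]);
    }
  }
  return row;
}
```

A function with this shape can be passed as the row conversion callback when loading a CSV, so that later code sees numbers and `null` instead of strings.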

Data Processing

I did not do any data processing when downloading the data. However, some visualizations use the full data and others use a filtered view, so I used two sets: the full data with no processing done on it, and a second set filtered to 1996-2016.
