Class Exercise – Difference between Infographic and Data Visualization

I was thinking about the NM3229 course and realized that it primarily taught us how to design data visualizations and infographics. Before this course, I was under the impression that the two were one and the same. Through this module, however, I have realized that there is a fundamental difference between them. I remembered the class exercise where we had to discuss the differences between the two concepts and decided to blog about it, since one of my primary takeaways from the course has been learning the difference between infographic design and data visualization design.

Data visualizations, as the name suggests, are simply visual representations of data. They include graphs, charts, maps, pictures etc. A visualization simply depicts the data graphically and in an understandable fashion. The task of deriving relevant information and conclusions from it is left to the person looking at the visualization.

Infographics, on the other hand, combine multiple data visualizations, text etc. to convey specific information to a reader. Infographics tell a story. By looking at an infographic, a reader can draw very clear conclusions about the data.

 

 

Class Exercise – Flickr Thought Experiment

This week’s NM3229 lecture was pretty action-packed as we had a guest lecture by Mr David Ayman Shamma, a senior research scientist at Yahoo! Research in the USA (his bio can be found at Bio). Mr Shamma gave a very interesting lecture on the methods used to analyze and visualize photographs uploaded to social media websites such as Flickr. He demonstrated some innovative techniques for analyzing the photos uploaded by various communities on Flickr and showed how the data related to the photos could be visualized.

The class exercise for the lecture was to think of a novel way to visualize the data associated with 1.2 million photos taken in Singapore. For each photo, the following information was available:

  1. Location
  2. Time at which photo is taken
  3. Time at which photo is uploaded to Flickr
  4. User who uploaded the photo

The class split into groups, each group having around 3-4 members. All groups were given 20 minutes to think of a motivating question they would like to answer with their visualization and also to make a rough sketch of the visualization.

My group observed that taking pictures of food is very popular in Singapore. Moreover, knowing where food pictures are taken and at what time of day could show which eating places in Singapore are the most popular and at what times of the day they are most crowded. The motivating question we decided to answer with our visualization was: What are the food habits of Singaporeans?

Our rough sketch of the visualization looked as follows:

20140328_134141

We conveyed the following information with this interactive visualization:

  1. Locations in Singapore where food photos have been taken. These are represented using a marker in a map of Singapore.
  2. Slider to depict time of day – You can slide to a particular time of day and the photos taken at that time will appear at their corresponding locations. Colour codes on markers indicate different times of day. For example, a black marker indicates midnight.
  3. User track – Clicking on an individual marker draws a connecting line between that photo and other photos uploaded by the same user, helping you track where in Singapore a user is eating. It could provide interesting information about the eating habits of Singaporeans.
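The user track idea can be sketched roughly in code. This is only an illustration of the grouping step, not something we built in class: the function name `user_tracks` and the record layout `(user, taken_at, latitude, longitude)` are assumptions based on the four fields listed earlier.

```python
from collections import defaultdict
from datetime import datetime


def user_tracks(photos):
    """Group photos by user and order each user's locations by time taken.

    `photos` is an iterable of (user, taken_at, latitude, longitude) tuples.
    This record layout is hypothetical; it only mirrors the fields in the exercise.
    """
    by_user = defaultdict(list)
    for user, taken_at, lat, lon in photos:
        by_user[user].append((taken_at, lat, lon))

    # Sorting the tuples orders each user's photos by the time they were taken.
    return {
        user: [(lat, lon) for _, lat, lon in sorted(points)]
        for user, points in by_user.items()
    }
```

Each list of coordinates is the line that would be drawn on the map when a marker belonging to that user is clicked.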

Overall, Mr Shamma and the class were appreciative of our efforts. Mr Shamma mentioned that our visualization was comprehensive and conveyed sufficient meaningful information. His only recommendation was to limit the time slider to blocks of hours in a day (e.g. breakfast time from 7 am to 11 am, lunch time from 11.30 am to 2 pm etc.) instead of one step for every hour in the day.
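Grouping photo times into blocks like this is easy to sketch. The snippet below is a minimal illustration, assuming only the two blocks Mr Shamma gave as examples; the names `MEAL_BLOCKS`, `meal_block` and `count_by_block`, and the catch-all "other" label, are my own placeholders.

```python
from collections import Counter
from datetime import datetime

# (label, start hour, end hour), with the end hour exclusive.
# Only these two blocks come from the feedback; a real slider would need more.
MEAL_BLOCKS = [
    ("breakfast", 7.0, 11.0),   # 7 am to 11 am
    ("lunch", 11.5, 14.0),      # 11.30 am to 2 pm
]


def meal_block(taken_at: datetime) -> str:
    """Return the block a photo's time falls into, or "other" if none match."""
    hour = taken_at.hour + taken_at.minute / 60
    for label, start, end in MEAL_BLOCKS:
        if start <= hour < end:
            return label
    return "other"


def count_by_block(timestamps):
    """Count how many photos fall into each block."""
    return Counter(meal_block(t) for t in timestamps)
```

The slider would then step through the block labels rather than through 24 separate hours.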

The other groups in class also made interesting visualizations. Here is a list of their motivating questions + visualizations:

1. What are the popular hotspots visited in Singapore from 2004 – 2014?

20140328_134011

2. What is the difference between time a photo is taken and time it is uploaded to Flickr?

20140328_134022

3. What are good places in Singapore to take light photos?

20140328_134038

4. What are the most popular colours photographed in Singapore?

20140328_134054

5. What do Singaporeans do over the weekend?

20140328_134109

Overall, it was a really fun lecture and I was able to learn more about the analysis and visualization of data from photographs through the lecture and class exercise.

 

 

 

 

 

 

Class Exercise – Singapore General Elections Tracker

It's my first ever blog post on my first ever blog, and this post is all about the first class of my first ever data visualization course at NUS. Talk about a whole lot of firsts! While first attempts can be extremely daunting, they are also exciting, enriching and enjoyable, which is why I am super excited about this blog and my data visualization class. Hopefully, by the end of the course, my blogging and data visualization skills will have improved by leaps and bounds.

As a student, I inevitably use infographics, pie charts, bar graphs etc. to represent useful data in project reports. But the visuals I create are extremely simple and quite amateurish. There are so many incredibly talented people in the world who represent data in such creative ways. The visuals they create are imaginative, elegant and, most importantly, extremely understandable. This course should definitely help me enhance my data visualization skills. Perhaps by the end of the course, my handiwork will no longer be considered amateurish.

The first class of the course saw us discussing terms such as data, visualization and infographics. The highlight of the class for me was the TED talk we watched on a father trying to understand his son’s speech patterns. The most fascinating part of this talk was the different ways in which data was visualized and represented. Trails depicting household activities and graphs indicating where certain words were used most often showed how a 2-year-old boy learned to speak English. The video was an insight into how huge volumes of data should be analyzed, categorized and represented in an understandable format.

Check the video out at: The birth of a word

Our first ever class assignment was to examine the Singapore 2011 general elections tracker and try to infer useful information from it. The task seemed trivial but turned out to be complex, as the tracker was rather difficult to comprehend initially.

Here is what I inferred from the tracker:

1. The tracker represents the terms searched for most often by the Singapore public in the days leading up to the 2011 general election.

2. Popularity of a term over time is depicted using color coded graphs.

3. Relationships between popular terms and related articles and tweets are depicted using color coded graphs.

There were many elements of the tracker that I was impressed with and a few elements that I did not like.

The positives:

1. I was impressed by the graphs used to depict trends in popularity.

popularity

For example, in the image above, it is evident that the term PAP is very popular and has remained popular over the timeline of 5th May to 8th May. The usage of graphs made it very intuitive to understand how the popularity of a term had changed over time.

2. Headings such as ‘Running’, ‘Key Terms’ and ‘Latest’ gave a clear picture of the most recent terms.

3. Relationships between terms and related articles or tweets were very well defined.

4. The interactive user interface made it easy to click on a term and view its popularity + related tweets and articles.

video

For example, the image above very clearly shows the change in popularity of the word ‘video’ on the left hand side and also depicts articles and tweets with the word ‘video’ on the right hand side.

5. The tracker represents a large amount of data with minimum graphics. It does not add any unnecessary elements just to look attractive.

The negatives:

1. The numbers put next to the key terms are not easily understandable.

numbers

For example, in the image above, the meaning of the numbers 11, 2, 10 and 3 is not easily understandable. Only after a little exploration of the visualization does it become evident that the numbers depict changes in popularity rank over time. For example, on 7th May, WP was the 11th most popular term, but it rose to the 2nd most popular term on 8th May.

2. While the tracker claims to fairly depict the most shared content on social media, Twitter appears to be the only source that is well represented. If Twitter is the only social media source utilized, the tracker may not be a very fair representation of the truth.

3. The colors used for the graphs are confusing. The purpose of the color codes used for the various graphs is not evident.
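To make the rank numbers from the first point above more concrete, here is a rough sketch of how a day-to-day rank change like "11th to 2nd" could be computed. I do not know how the tracker actually calculates its numbers, so the functions `rank_terms` and `rank_changes` and the per-day count dictionaries are purely hypothetical.

```python
def rank_terms(counts):
    """Map each term to its popularity rank (1 = most popular).

    `counts` maps term -> number of mentions on one day. Ties are broken
    alphabetically so that the result is deterministic.
    """
    ordered = sorted(counts, key=lambda term: (-counts[term], term))
    return {term: position for position, term in enumerate(ordered, start=1)}


def rank_changes(day_before, day_after):
    """Return term -> (old rank, new rank) for terms present on both days."""
    old_ranks = rank_terms(day_before)
    new_ranks = rank_terms(day_after)
    return {
        term: (old_ranks[term], new_ranks[term])
        for term in new_ranks
        if term in old_ranks
    }
```

Showing the pair as "11th → 2nd" with a short label next to each key term would probably have made the bare numbers much easier to read.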

My suggestions for improvement:

1. An easier color coding on the graphs. For example, green to indicate increase in popularity and red to indicate drop in popularity of a term.

2. More social media sources such as Facebook, Quora, Pinterest and Google+ for the data used in the visualization.

3. Use of horizontal bar graphs to indicate more clearly the popularity of a term, or how many times the term has been shared or searched for.

bar graph

If you would like to check out the tracker, the link is: Singapore General Elections 2011

On the whole, the first class of data visualization was interesting and I am excited about what is in store for the rest of the semester!