
Information Visualization: Interpretations and the Stories Around Them

Nine shared this great presentation from Gurman titled:

When Statistics become stories

It was part of her talk at DesignUp 2019. In one slide she talked about the irregular spikes that age distributions show around multiples of 10.

I am thinking of creating an exercise around this for the AI-ML workshop to be conducted later this month at NID Gandhinagar for New Media Design Students.

At this stage of the workshop, we would have covered the basic concepts of programming and Jupyter notebooks.

Section One - Introducing Pandas

I got the French population and age distribution data from here, and we have cleaned it into the following structure:

Out[115]: 
   year   males  females   total  age
0  2018  364155   347749  711904    0
1  2017  370453   355472  725925    1
2  2016  378518   363162  741680    2
3  2015  387906   372402  760308    3
4  2014  399232   387042  786274    4

We would start by loading this data and introduce the concepts of:

  1. Reading the data (in this case from a CSV file, using read_csv).
  2. Exploring the structure of the data (a DataFrame) and accessing it by rows and columns.
  3. Trying basic operations on the data to answer questions such as: for which ages is the male population larger than the female population, and vice versa?
  4. Using ? to pull up the documentation of a method or attribute.
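Here is a minimal sketch of what those first steps could look like in a notebook cell. It only uses the five rows shown above, embedded as CSV text so it runs without the real file; the variable names (CSV_TEXT, more_males, more_females, totals_match) are my own placeholders, and in the workshop read_csv would point at the actual cleaned file instead.

```python
import io

import pandas as pd

# The five rows shown above, embedded as text so this runs without the
# real file. In the workshop, pass the path of the cleaned CSV instead.
CSV_TEXT = """year,males,females,total,age
2018,364155,347749,711904,0
2017,370453,355472,725925,1
2016,378518,363162,741680,2
2015,387906,372402,760308,3
2014,399232,387042,786274,4
"""

# 1. Reading the data
df = pd.read_csv(io.StringIO(CSV_TEXT))

# 2. Exploring the structure: size, column types, one row, one column
print(df.shape)
print(df.dtypes)
print(df.loc[0])      # first row, selected by index label
print(df["males"])    # a single column

# 3. Basic operations: ages where one group outnumbers the other
more_males = df[df["males"] > df["females"]]
more_females = df[df["females"] > df["males"]]
print("More males at ages:", list(more_males["age"]))
print("More females at ages:", list(more_females["age"]))

# Sanity check that the cleaned columns add up
totals_match = (df["males"] + df["females"] == df["total"]).all()
print("males + females == total for every row:", totals_match)
```

In a Jupyter cell, `pd.read_csv?` shows the docstring for the method; outside IPython, `help(pd.read_csv)` does the same job.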

Section Two - Plotting the data

After playing around with the data and different methods, we would shift to plotting it and see whether the plots can answer the questions we explored in the previous section.

I am thinking of introducing them to pie charts, bar graphs, and line plots. The age distribution of a country is generally represented as a population pyramid, so here we would try to plot that pyramid for the French population.
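A population pyramid is essentially two horizontal bar charts sharing an age axis, one extending left and one extending right. The sketch below is one possible way to draw it with matplotlib, assuming the column names from the cleaned table and using only its five sample rows; the colours, labels, and the choice of matplotlib itself are my assumptions, not something fixed for the workshop.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; not needed inside Jupyter

import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import FuncFormatter

# Sample rows from the cleaned French data (ages 0 to 4 only)
df = pd.DataFrame({
    "age": [0, 1, 2, 3, 4],
    "males": [364155, 370453, 378518, 387906, 399232],
    "females": [347749, 355472, 363162, 372402, 387042],
})

fig, ax = plt.subplots(figsize=(6, 4))

# Males go to the left (negated values), females to the right
male_bars = ax.barh(df["age"], -df["males"], color="steelblue", label="Males")
female_bars = ax.barh(df["age"], df["females"], color="salmon", label="Females")

ax.axvline(0, color="black", linewidth=0.8)
ax.set_xlabel("Population")
ax.set_ylabel("Age")
ax.set_title("Population pyramid (sample rows)")

# Show magnitudes on the x axis instead of negative numbers on the male side
ax.xaxis.set_major_formatter(FuncFormatter(lambda x, pos: f"{abs(x):,.0f}"))
ax.legend()
fig.tight_layout()
```

In a notebook the figure renders inline; elsewhere, `fig.savefig(...)` writes it to a file. With the full table the same code produces the familiar pyramid shape, one bar per age.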

Section Three - Exercise for students

A similar age distribution for the UK population is available here. We would apply what we learned in the two sections above and ask the students to plot the population pyramid for the UK.

Section Four - Census and age distribution of the Indian population

Akash Gutha has a repository and an IPython notebook that:

  1. Fetches the relevant data (an Excel sheet) from the Indian Census site.
  2. Cleans up the data, assigns names to the columns, and draws related plots.

We would work on top of those steps to:

  1. Cover how the Census releases data, along with the accompanying guide that helps people make sense of it.
  2. Plot the population pyramid for India.
  3. Observe the differences between the population distributions of India and the UK/France.
  4. Have an open discussion about the spikes at certain ages.
  5. Share the screenshots from Gurman's presentation that explain the spikes.
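To seed the discussion about the spikes, one simple check is to compare each age's count with the average of its two neighbouring ages. The sketch below does that on made-up numbers (not census data) with spikes planted at 20 and 30; the 1.2 threshold and the variable names are arbitrary choices of mine, meant only as a starting point for students to experiment with.

```python
import pandas as pd

# Made-up counts for ages 18..32, purely illustrative (NOT census data).
# Ages 20 and 30 are deliberately inflated to mimic a spike.
ages = list(range(18, 33))
counts = [100, 98, 140, 95, 94, 92, 91, 90, 89, 88, 87, 86, 150, 84, 83]
pop = pd.Series(counts, index=ages, name="total")

# Average of the counts at age-1 and age+1 (NaN at both ends of the range)
neighbour_mean = (pop.shift(1) + pop.shift(-1)) / 2

# Ratio near 1 means the age fits its neighbours; well above 1 is a spike
ratio = pop / neighbour_mean

# Arbitrary threshold: more than 20% above the neighbours' average
spikes = ratio[ratio > 1.2]
print(spikes.round(2))
```

On the real Indian data, the `pop` series would come from the total column indexed by age, and students could try different thresholds to see which ages stand out.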

At this point we would conclude the session on handling data and information visualization. Possibly we will follow it with more hands-on exercises for the students.