Information Visualization: Interpretations and Stories around them.


Nine shared this great presentation from Gurman titled:

When Statistics become stories

It was part of her talk given at DesignUp 2019. On one slide she talked about the irregular spikes we see in the age distribution around multiples of 10.

I am thinking of creating an exercise around this for the AI-ML workshop to be conducted later this month at NID Gandhinagar for New Media Design Students.

By this stage of the workshop, we would have covered the basic concepts of programming and Jupyter notebooks.

Section One - Introducing Pandas

I got the French population and age distribution data from here, and we have cleaned it into the following structure:

Out[115]: 
   year   males  females   total  age
0  2018  364155   347749  711904    0
1  2017  370453   355472  725925    1
2  2016  378518   363162  741680    2
3  2015  387906   372402  760308    3
4  2014  399232   387042  786274    4

We would start by loading this data and introduce the concepts of:

  1. Reading the data (in this case from a CSV file, using read_csv).
  2. Exploring the structure of the data (DataFrame) and accessing it by rows and columns.
  3. Trying basic operations on the data to answer some questions, such as: for which ages is the male population larger than the female population, and vice versa?
  4. Using ? to get to the documentation of a method or attribute.
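
The four steps above can be sketched roughly as follows. This is a minimal sketch, not the actual workshop notebook: it inlines the five sample rows shown earlier so that it runs without the CSV file, and the names CSV_TEXT, more_males and more_females are mine, chosen for illustration.

```python
import io

import pandas as pd

# The five sample rows shown above, inlined so the sketch runs on its own.
# In the workshop this would be a real file passed to pd.read_csv(...).
CSV_TEXT = """year,males,females,total,age
2018,364155,347749,711904,0
2017,370453,355472,725925,1
2016,378518,363162,741680,2
2015,387906,372402,760308,3
2014,399232,387042,786274,4
"""

# 1. Read the data into a DataFrame.
df = pd.read_csv(io.StringIO(CSV_TEXT))

# 2. Explore the structure: shape, column names, one row, one column.
print(df.shape)
print(df.columns.tolist())
print(df.loc[0])      # first row, selected by index label
print(df["total"])    # a single column

# 3. A basic question: at which ages do males outnumber females, and vice versa?
more_males = df[df["males"] > df["females"]]
more_females = df[df["females"] > df["males"]]
print("More males at ages:", more_males["age"].tolist())
print("More females at ages:", more_females["age"].tolist())

# 4. Documentation: in a Jupyter/IPython cell, type `pd.read_csv?`.
#    In plain Python the equivalent is help(pd.read_csv).
```

In these five sample rows males outnumber females at every age, so the second list comes out empty; the full data set is where the question gets interesting.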

Section Two - Plotting the data

After playing around with the data and the different methods, we would shift to plotting it and see whether the plots can answer the questions we explored in the previous section.

I am thinking of introducing them to plotting pie charts, bar graphs and line plots. The age distribution of a country is generally represented as a Population Pyramid, so here we would try to plot that pyramid for the French population.
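
A Population Pyramid is essentially two horizontal bar charts sharing an age axis, with males drawn to the left and females to the right. Here is a minimal sketch with matplotlib, assuming the column names from the cleaned table above; the function name plot_pyramid, the colours and the inlined sample rows are my own choices for illustration.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; not needed in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows from the cleaned French data, inlined so the sketch runs alone.
CSV_TEXT = """year,males,females,total,age
2018,364155,347749,711904,0
2017,370453,355472,725925,1
2016,378518,363162,741680,2
2015,387906,372402,760308,3
2014,399232,387042,786274,4
"""


def plot_pyramid(df, title):
    """Draw males to the left (negative widths) and females to the right."""
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.barh(df["age"], -df["males"], color="tab:blue", label="Males")
    ax.barh(df["age"], df["females"], color="tab:orange", label="Females")
    ax.set_xlabel("Population")
    ax.set_ylabel("Age")
    ax.set_title(title)
    ax.legend()
    return ax


df = pd.read_csv(io.StringIO(CSV_TEXT))
ax = plot_pyramid(df, "France, ages 0 to 4 (sample rows only)")
```

The male bars are drawn with negative widths, so the tick labels on the left half of the x-axis show negative numbers; cleaning those up could be a small follow-up exercise for the students.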

Section Three - Exercise for students

A similar age distribution for the UK population is available here. We would apply what we learned in the two sections above and ask the students to plot the Population Pyramid for the UK.

Section Four - Census and Age distribution of Indian population

Akash Gutha has a repository and an IPython notebook that:

  1. Fetches the relevant data (an Excel sheet) from the Indian Census site.
  2. Cleans up the data, assigns names to the columns, and produces related plots.

We would build on those steps to:

  1. Cover how the Census releases data, along with the accompanying guide that helps people make sense of it.
  2. Plot the Population Pyramid for India.
  3. Observe the differences between the population distributions of India and the UK/France.
  4. Have an open discussion around the spikes at certain ages.
  5. Share the screenshots from Gurman's presentation that explain the spikes.
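
To make the spikes concrete before the open discussion, one option is to compare each age's count with the average of its two neighbouring ages. The sketch below is only an illustration: the counts are made up (a flat 1000 per age with an artificial bump at multiples of 10), and the function name spike_ratios and the 1.2 threshold are arbitrary choices of mine, not something taken from the Census data or the presentation.

```python
def spike_ratios(counts):
    """Map each interior age to count[age] / mean(count[age-1], count[age+1])."""
    ratios = {}
    for age in range(1, len(counts) - 1):
        neighbours = (counts[age - 1] + counts[age + 1]) / 2
        ratios[age] = counts[age] / neighbours
    return ratios


# MADE-UP counts for ages 0..60: flat at 1000, bumped at 10, 20, ..., 50.
counts = [1000] * 61
for age in range(10, 60, 10):
    counts[age] = 1500

ratios = spike_ratios(counts)

# Ages whose count is well above that of their neighbours.
spikes = [age for age, ratio in ratios.items() if ratio > 1.2]
print(spikes)  # [10, 20, 30, 40, 50]
```

With the real census columns, the students could run the same comparison on the total column and see which ages stand out.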

At this point we would conclude the session on handling data and information visualization. Possibly we will follow it with more hands-on exercises for the students.

Setting up an environment for a workshop based on Python.


I distinctly remember that while working at FOSSEE back in 2009-10, when we conducted hands-on workshops in the labs of various institutes, we would factor in significant time to reach early and set up all the dependencies on the lab computers. Back then we used Enthought's binaries to install everything on Windows systems. If we were lucky, we would also find Linux machines in the lab, which helped a lot as we were really comfortable installing the requirements from a CLI.

Recently we scheduled an AI/ML workshop for New Media Design students at NID Gandhinagar. While preparing for it I was looking for resources. I knew about Project Jupyter and IPython notebooks but my understanding of them was very limited.

I found that JupyterHub is a brilliant project for setting up the complete environment and sharing the resources with all the students. Their offering, the-littlest-jupyterhub, which is targeted at 1-100 users hosted on a single server, is perfect. However, it does need sudo and root privileges to segregate user environments. If we get access to a server on the NID campus, I will try to set it up.

Otherwise, I also came across Colab from Google, which comes with all the dependencies and libraries installed, ready to be used and shared with the students. It looks really promising. I will try to put together some notebooks and exercises around the concepts we will be covering and see how both these solutions fare.

But compared to the manual setup we used to do back then, this looks like a cakewalk.

Communications


I recently got into a tense conversation with a friend. We were talking about education and I was briefing him about some popular steps a particular government was taking. During that conversation, I think my friend was trying to make the case that the things I was mentioning weren't directly related to improving the quality of education, or to the students and teachers. He was right. But at that time, I didn't realize that and got defensive in a way that derailed the whole conversation.

Lately, I have noticed that many times I don't completely understand what's being said and I end up interrupting the conversation. Things escalate from there. It is uncomfortable, tense, exhausting, tiresome and, worst of all, the topic of conversation gets sidelined. Furthermore, even from my side, when I am trying to express myself, I often use the wrong word. I think my communication skills need a lot more work and practice, and I have to be more mindful about it.

This is one reason I like these writing-club sessions. Writing is a good exercise: it clears out the noise and makes you more focused. I have been slacking on these sessions lately, but I will try to improve on that front too.

systemd Dependency Tree


At Senic, we have shifted to systemd for managing the many independent applications we have running on the Hub. Earlier we were using supervisord, and we moved for a bunch of reasons (fewer dependencies, a system-supported solution, etc.). systemd provides many strong features, things like:

uses socket and D-Bus activation for starting services, offers on-demand starting of daemons, keeps track of processes using Linux control groups, maintains mount and automount points, and implements an elaborate transactional dependency-based service control logic

We have put together different service files that start applications as the Hub boots. Some of these services have hard dependencies on others, meaning that if the parent service is not running, the child service won't start or run. For example, if we have an application that makes network requests, in some scenarios it helps if that service depends on the NetworkManager service, which manages network interfaces (or on another native service that handles network connections).

This dependency tree has both benefits and issues. For us, some of the services (parent services) initialize DBus objects, and child services connect or subscribe to these objects, which enables DBus communication between separate applications. Now if a parent service dies (SIGTERM), the child services can't continue and need to stop. Here the systemd dependency tree takes care of this for us: it stops all dependent services if the parent stops.

But in the situation where the parent service restarts, I would say my understanding of systemd fails me. systemd correctly stops all the child services, but it doesn't restart them once the parent service starts again. I am not sure which dependency construct (Before, After, etc.) to use to make sure that once the parent service restarts, all the child services restart as well.

All the services have a Restart clause to make sure that the service restarts. But a restart only happens in certain scenarios. If a service is stopped using the command systemctl stop service-name.service, systemd won't start the service again. And I think this is how the child services get stopped when the parent service restarts, and hence they don't restart. Maybe.

Working in someone else's kitchen


Yesterday I was pairing remotely with one of my colleagues. He hosted a tmate session for me on his system. His editor of choice is Vim and I use Emacs. We were discussing some ideas about functions and what they would do, and taking turns writing the code. I know a little bit of Vim, but my muscle memory is not tuned for Vim as much as it is for Emacs. So it took me a while, and I asked some silly questions about how he was doing certain things. It was nice to see how comfortably he used the interface.

This morning, as I was preparing breakfast and looking for the tools in the kitchen, it reminded me of yesterday's pairing session. In the kitchen it's the food; at work it's the code. It's just that the tools are kept in different places and there are other ways of preparing things.

Both these exercises bring you out of your comfort zone. In the editor, the keybindings for saving, editing and navigating are different. In the kitchen, the spices are in a different box, the box itself is kept in a different place, and they grate the ginger instead of crushing it. It makes you more alert and self-aware.