2022....


Here is a wishlist for 2022. I have made similar lists in the past, but privately, and I didn't do well at following up on them. I want to change that. I will keep the list small and try to make each item specific.

  • Read More

    I didn't read much in 2021. I have three books with me right now: The Selfish Gene by Richard Dawkins, Snow Crash by Neal Stephenson, and Release It! by Michael T. Nygard. I want to finish them, and possibly add a couple more to this list. Last year I got good feedback on how to improve and grow as a programmer. Keeping that in mind, I will pick up a technical book or two. I have never tried that before. Let's see how it goes.

  • Monitoring air quality data of Bikaner

    When I was in Delhi back in 2014, I was really miffed by the deteriorating air quality. There was a number attached to it, the AQI, registered by sensors at different locations. I felt it didn't reflect the real air quality for commuters like me. I would commute from the office on foot every day, and I would see a lot of other folks stuck in traffic, breathing in the same pollution. I tried to do something back then, but didn't get much momentum and stopped.

    Recently, with help from Shiv, I have been getting mentally ready to pick it up again. In Bikaner we don't have an official, public AQI number (1, 2, 3). I would like that to change. I want to start small, with a couple of boxes that can start collecting data, and then add more features and grow the fleet to cover more areas of the city.

  • Write More:

    I was really bad at writing things in 2021. Fix it. I want to write 4 to 5 good, thought-out, well-written posts. I also have a lot of drafts that I want to finish. I think being more consistent with the writing club on Wednesdays would help.

  • Personal Archiving:

    Again, a long-simmering project. I want to put together scaffolding that can hold this. I recently read this post, where the author talks about the different services he runs on his server. I am getting a bit familiar with containers, docker-compose and Ansible, and the post has given me some new inspiration for taking small steps. I think the targets for this project are still vague and not specific. I want some room here: to experiment, get comfortable, and try existing tools.

Review of AI-ML Workshop


In this post I am reflecting on an Artificial Intelligence and Machine Learning workshop we conducted at NMD: what worked, what didn't, and how to prepare better. We (Nandeep and I) have been visiting NID for the past few years. We are called in when students from NMD are doing their Diploma projects and need technical guidance with their project ideas. We noticed that, because of time constraints, students often didn't understand the core concepts of the tools (algorithm, library, software) they would use. Many students didn't have a programming background, but their projects needed basic, if not advanced, skills to get a simple demo working. As we wrapped up each visit we would reflect on the work done by the students and how they fared, and wish we had more time to dig deeper. Eventually that happened: we got a chance to conduct a two-week workshop on Artificial Intelligence and Machine Learning in 2021.

What did we plan:

We knew that students would come from a broad spectrum of backgrounds: media (journalism, digital, print), industrial design, architecture, fashion and engineering. All of them had their own systems (laptops), running Windows (7 or 10) or macOS. We were initially told that it would be a five-day workshop. We managed to spread it across two weeks, with half a day of workshop every day. We planned the following rough structure for the workshop:

  1. Brief introduction to the subject and tools we would like to use.
  2. Basic concepts of programming, Jupyter Notebooks, their features and limitations.
  3. Handling Data: reading, changing, visualizing, standard operations.
  4. Introduction to concepts of Machine learning. Algorithms, data-sets, models, training, identifying features.
  5. Working with different examples and applications, face recognition, speech to text, etc.

How did the workshop go, observations:

After I reached the campus I got more information on the logistics of the workshop. We were to run the workshop for two weeks, for the complete day. We scrambled to make adjustments. We were happy to have more time in hand, but with more time came more expectations, and we were underprepared. We decided to add assignments, readings (papers, articles) and possibly a small project by every student to the workshop schedule.

On the first day, after the introduction, Nandeep started with Blockly and the concepts of programming in Python. In the second half of the day we did a session on students' expectations from the workshop. We ended the day with a short introduction to data visualization<link to gurman's post on Indian census and observation on spikes on 5 and 10 year age groups>. For the assignment we asked everyone to document a good information visualization that they had liked and how it helped improve their understanding of the problem.

On the second day we covered the basics of Python programming. I was hosting JupyterHub for the class on my system. The session was hands-on, and all the students were asked to follow what we were doing, experiment and ask questions. It was a slow start: it is hard to introduce abstract programming concepts and relate them to applications in the AI/ML domain. In the second half we read a chapter from Society of Mind<add-link-here> and followed it with a group discussion. We didn't follow up on the first day's assignment, which we should have done.

On the third day we tried to pick up the pace and get students closer to applications of AI/ML. We started with lists, arrays and array slicing, leading up to how an image is represented as an array. By lunch time we were able to walk everyone through cropping a face out of an image using array slicing. This is a basic feature in every photo editing app, but we felt it was a nice journey for the students to make sense of how it is done behind the scenes. In the afternoon session we continued with more image filters and the algorithms behind them. PROBLEM: We had a hard time explaining why imshow would, by default, show gray images in color. We finished the day by assigning every student a random image processing algorithm from scikit-learn. The task was to try it out and understand how it worked. By this time we had started setting things up on the students' computers so that they could experiment on their own. Nandeep had a busy afternoon setting up a development environment on everyone's laptop.
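The cropping exercise can be sketched without any library. This is only an illustration, not the notebook we used in class: the image here is a made-up grid of numbers stored as a list of rows, and the crop function and its parameter names are hypothetical. With NumPy the same crop is a single slice of the form image[top:bottom, left:right].

```python
# A tiny 6x8 grayscale "image": a list of rows, each pixel an intensity value.
# The values are made up for illustration.
image = [[row * 10 + col for col in range(8)] for row in range(6)]


def crop(img, top, bottom, left, right):
    """Return the rectangle of rows top..bottom-1 and columns left..right-1."""
    return [row[left:right] for row in img[top:bottom]]


# Pretend the "face" sits in rows 1-3 and columns 2-5.
face = crop(image, top=1, bottom=4, left=2, right=6)
print(len(face), len(face[0]))  # height and width of the cropped region
```

On the imshow question: as far as I understand it, matplotlib's imshow maps a 2D array of numbers through its default colormap, so a single-channel image comes out tinted unless cmap="gray" is passed explicitly.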

The next day, the fourth, students spent the first half talking about the algorithm they were assigned, demoing it and explaining it to the other students. The idea was to get everyone comfortable with reading code and to understand what is needed for more complex tasks like face detection, face recognition and object detection. In the afternoon session we picked up audio. Nandeep introduced them to LibROSA. He walked them through playing a beat<monotone?> on their systems, and showed how they could load any audio file, mix files together, create effects and so on. At this point some students were still finishing the third day's assignment while others were experimenting with the audio library. Things got fragmented. In parallel we kept answering questions from students and resolving dependencies for their development setups. For the assignment we gave every student a list of musical instruments and asked them to analyse their sounds, identify their unique features, and work out how to compare the recordings to each other.
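To give a feel for what "a sound is just an array of numbers" means, here is a rough sketch that uses only the Python standard library instead of LibROSA, which is what we used in the workshop. The sample rate, the frequencies and the function names are all assumptions picked for illustration. It generates two sine tones, mixes them by averaging, and writes the result to a WAV file in the system's temporary directory.

```python
import math
import os
import struct
import tempfile
import wave

SAMPLE_RATE = 8000  # samples per second (assumed; kept low so the file stays small)


def sine_tone(freq_hz, seconds, volume=0.5):
    """Return a list of 16-bit integer samples for a pure sine tone."""
    n = int(SAMPLE_RATE * seconds)
    return [int(volume * 32767 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
            for i in range(n)]


def mix(a, b):
    """Mix two signals by averaging sample pairs, which keeps them in range."""
    return [(x + y) // 2 for x, y in zip(a, b)]


def write_wav(path, samples):
    """Write mono 16-bit samples to a WAV file."""
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(SAMPLE_RATE)
        f.writeframes(struct.pack("<%dh" % len(samples), *samples))


tone = mix(sine_tone(440, 0.5), sine_tone(660, 0.5))
path = os.path.join(tempfile.gettempdir(), "tone.wav")
write_wav(path, tone)
print("wrote", path)
```

Once a sound is a list of samples, effects such as changing the volume or reversing the clip become ordinary list operations.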

On the fifth day we picked up text and how to make a computer understand it. We introduced concepts like features and classification, using the NLTK library. We showed them how to build a simple Naive Bayes text classifier: we created a small dataset, labelled it, and built a data pipeline to process the data, clean it up, extract features and "train" the classifier. We were covering the fundamentals of machine learning. For the weekend we gave them an assignment on text summarization. We gave them pointers to existing libraries and how they work. There are different algorithms; the task was to experiment with them and find their limitations. Could they think of something that would improve them? Could they try to implement their ideas?
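The steps of that pipeline (label, extract features, "train", classify) can be sketched in plain Python. This is not the NLTK code from the workshop; it is a hand-rolled Naive Bayes with add-one smoothing, and the toy dataset, labels and function names are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Feature extraction: lowercase words (a deliberately crude choice)."""
    return text.lower().split()


def train(labelled_texts):
    """Count words per label and documents per label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in labelled_texts:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Pick the label with the highest log-probability, using add-one smoothing."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up toy dataset, labelled by hand.
data = [
    ("the match was a great win", "sports"),
    ("the team lost the final match", "sports"),
    ("new phone has a great camera", "tech"),
    ("the laptop battery is great", "tech"),
]
model = train(data)
print(classify("the team won the match", *model))  # prints: sports
```

With only four training sentences the classifier is easy to fool, which is a useful talking point in itself: the quality of the labels and features matters more than the algorithm.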

WEEK 1 ENDED HERE

We were not keen on mandatory student attendance and participation. This open structure didn't give us good control. Students would be discussing things, sharing their ideas, helping each other with debugging. We wanted that to happen but we were not able to balance student collaboration, peer to peer learning and introducing new and more complicated concepts/examples.

Over the weekend I chose a library that could help us introduce the basic concepts of computer vision, face detection and face recognition. BUT I didn't factor in how to set it up on Windows. The library depended on DLib. In the morning session we introduced the concept of the Haar cascade (I wanted to include a reading session around the paper) and showed a demo of how it worked. In the afternoon students were given time to try things themselves and ask questions. Nandeep in particular had a hard time setting up the library on the students' systems. A couple of students followed up on the weekend project. They had fixed a bug in a library to make it work with Hindi.

On Tuesday we introduced them to speech recognition and explained some core concepts. We set up a demo around Mozilla DeepSpeech. The web microphone demo doesn't work that well in an open conversation scenario: there was a lot of cross-talk, and my accent didn't help either. The example we showed was web-based, so we also talked about web applications, cloud computing and the client-server model. The afternoon was again an open conversation on the topic, and students were encouraged to try things by themselves.

On Wednesday we covered the different AI/ML components that power modern smart home devices like Alexa, Google Home, Siri. We broke down what it takes for Alexa to tell a joke when asked to: which parts run on the device and which in the cloud. The cycle starts with the mics on the device, which are always listening for voice activity. Once voice activity is detected, the device records audio and streams it to the cloud to get text from the speech. Intent classification is then done using NLU, an answer is looked up, and finally we, the consumers, get the result. We showed them different libraries, programs and third-party solutions that can be used to create something similar on their own.
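The cycle described above can be laid out as a chain of small functions. This is a toy sketch, not how any real assistant is built: every function name is hypothetical, the speech-to-text step is a stub that always returns the same sentence, the energy threshold is an assumed value, and keyword matching stands in for a trained NLU model.

```python
def detect_voice_activity(audio_chunk, threshold=0.1):
    """On-device step: treat a chunk as speech if its average energy is high."""
    energy = sum(sample * sample for sample in audio_chunk) / len(audio_chunk)
    return energy > threshold


def speech_to_text(audio_chunk):
    """Cloud step (stubbed): a real system would run a speech recognition model."""
    return "tell me a joke"


def classify_intent(text):
    """NLU step: keyword matching standing in for a trained intent classifier."""
    keywords = {"joke": "tell_joke", "weather": "get_weather"}
    for word, intent in keywords.items():
        if word in text.lower():
            return intent
    return "unknown"


def answer(intent):
    """Look up a response for the recognised intent."""
    responses = {
        "tell_joke": "Why do programmers prefer dark mode? Because light attracts bugs.",
        "get_weather": "I cannot check the weather in this sketch.",
    }
    return responses.get(intent, "Sorry, I did not understand that.")


def handle(audio_chunk):
    """Run the whole cycle for one chunk of audio."""
    if not detect_voice_activity(audio_chunk):
        return None  # stay idle; nothing leaves the device
    return answer(classify_intent(speech_to_text(audio_chunk)))


loud = [0.5, -0.6, 0.4, -0.5]     # made-up samples above the threshold
quiet = [0.01, -0.02, 0.01, 0.0]  # made-up background noise
print(handle(loud))
print(handle(quiet))
```

Each stub marks a place where one of the libraries or third-party services we showed could be plugged in.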

We continued the discussion the next day with how to run these programs on their own. We stepped away from Jupyter and showed how to run Python scripts. Based on the earlier lesson on face recognition, some students were trying to understand how to detect emotions from a face. This was a nice project. We walked the students through searching for existing projects and papers on the topic. We found a well-maintained GitHub project and followed its README; the maintainer already had a trained model. I felt this was a great exercise: we were able to move quickly, get it working and build on top of existing examples. In the afternoon we did a reading of the Language and Machines section of this blog:

Let's not forget that what has allowed us to create the simultaneously beloved and hated artificial intelligence systems during the last decade, have been advancements in computing, programming languages and mathematics, all of those branches of thought derived from conscious manipulation of language. Even if they may present scary and dystopic consequences, intelligent artificial systems will very rapidly make the quality of our lives better, just as the invention of the wheel, iron tools, written language, antibiotics, factories, vehicles, airplanes and computers. As technology evolves, our conceptions of languages and their meanings should evolve with it.

On the last day we reviewed some of the things we had covered and got feedback from the students. We talked about how we had improvised the workshop based on inputs from students and Jignesh. We needed to prepare better. Students wished they had been more regular and had more time to learn. I don't think we will ever have enough time; this will always be hard. Some students still wanted us to cover more things. Someone asked us to follow up on handling data and information visualization, which we had talked about briefly on day one. So we resumed with that and walked them through an exercise of fetching raw data, cleaning it, plotting it and finding the stories hidden in it.

Resolve inconsistent behaviour of a Relay with an ESP32


I have worked with different popular IoT boards: Arduino, ESP32, Edison<links>, Raspberry Pi. Sometimes trying things myself, other times helping others. I can figure things out on the code side of a project, but I often get stuck debugging the hardware. This has especially blocked me from doing anything beyond basic hello world examples.

A couple of months ago, I picked up an ESP32 again. I was able to source some components from a local shop: an ESP32, a battery, a charging IC, a relay, LEDs, different resistors and jumper cables. I started off with the simple LED blinking example and got that working fairly quickly. Using examples, I was able to connect the ESP32 to WiFi and control the LED via a static web page<link-to-example>. Everything was working: documentation, hardware, examples. This was promising and I was getting excited.

Next, a friend of mine, Shiv, who is good at putting together electrical components, brought an LED light bulb, and we thought: let's control it remotely with the ESP32. We looked up the relay switch connections, connected the jumper cables, and confirmed that the relay would flip the LED light bulb when we controlled the light over WiFi. It was not working consistently, but it was working to a certain level. Shiv quickly bundled everything inside the bulb. He connected the power supply to the charging IC and powered the ESP32. We connected the relay properly. Everything was neat, clean, packed and ready to go. We plugged in the bulb and waited for the ESP32 to connect to WiFi. It was on, and I was able to refresh the web page that controlled the light. So far so good. We tried switching on the LED bulb: nothing. We tried a couple more times: nothing. On the web page I could see that the state of the light was toggling. I didn't have access to the serial monitor, so I could not tell whether everything on the ESP32 was working. And I thought to myself, sigh, here we go again.

We disassembled everything and laid all the components flat on the table. I connected the ESP32 to my system with a USB cable. Shiv got a multimeter. We confirmed that the pins we were connecting to were going HIGH and LOW. There was an LED on the relay, and it was flipping correctly. We also heard a click in the relay when we toggled the state. And still the behaviour of the LED light was not consistent. Either it wouldn't turn on, or if it turned on it wouldn't turn off. Rebooting the ESP32 would fix things briefly, and after a couple of iterations it would be completely bricked. In the logs everything was good and consistent, the way it should be. But it was not. I gave up and left.

Shiv, on the other hand, kept trying different things. He noticed that although the pin we connected to would in theory go HIGH and LOW, LOW didn't mean 0. He was still getting some reading even when the pin was LOW. He added resistors between the ESP32 pin and the relay input. That still didn't bring LOW down to zero. He read a bit more. AND he added an LED between the ESP32 and the relay: Voilà.

The LED was perfect. It behaved as a switch. It takes the 3.3V and uses it. Anything less, which is what we had when we put the ESP32 pin to LOW, the LED would eat up and not let anything pass through. And connected on the other end, the relay started blinking and clicking happily. What a nice find. Shiv packed everything together again. When we met the next day he showed me the bulb and said, "want to try?". I was skeptical. I opened the web page, clicked "ON", and the bulb turned on. Off, and the bulb was off. I clicked hurriedly to break it. It didn't break. It kept working, consistently, every single time.
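One way to picture why the series LED helped is an idealised diode model: the LED passes nothing until the voltage across it exceeds its forward voltage, and it drops roughly that much once it conducts. The numbers below are assumptions for illustration only (a forward voltage of 1.8 V and a residual LOW reading of 0.6 V); we did not note down the exact values we measured, and a real relay module may behave differently.

```python
LED_FORWARD_VOLTAGE = 1.8  # volts; assumed value, it varies with LED colour and type


def voltage_after_led(pin_voltage, forward_voltage=LED_FORWARD_VOLTAGE):
    """Idealised series LED: blocks below its forward voltage, drops it above."""
    return max(0.0, pin_voltage - forward_voltage)


# 3.3 V is the ESP32 logic HIGH; the LOW value is an assumed residual reading.
for label, volts in [("HIGH", 3.3), ("LOW (assumed residual)", 0.6)]:
    print(label, "->", round(voltage_after_led(volts), 2), "V at the relay input")
```

In this simplified picture the residual voltage on LOW never reaches the relay input, so the two states become clearly distinguishable again.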

Challenges involved in archiving a webpage


Last year, as I was picking up ideas around building a personal archival system, I put together a small utility that would download and archive a webpage. As I kept thinking about the subject I realized it has very significant shortcomings:

  1. In the utility I parse the content of the page and find the different kinds of URLs (img, css, js) to recursively fetch the static resources, so that in the end I have an archive of the page. But there is more to how a page gets rendered. Browsers parse the HTML and all the resources to finally render the page. The program we write has to be equally capable, or else the archive won't be complete.
  2. Whenever I have been behind a proxy on college campuses, I have noticed reCAPTCHA popping up, saying something along the lines of suspicious activity being noticed from my connection. With this automated archival system, how do I avoid it? I have a feeling that if automated browsing of a page triggers reCAPTCHA, the system will be locked out and won't get any relevant content of the page. My concern is that I don't know enough about how and when a captcha triggers, or how to avoid or handle it and have guaranteed access to the content of the page.
  3. Handling paywalls, limits on the number of articles accessible in a certain time, or banners that obscure the content with a login screen.
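The first point, finding the static resources of a page, can be sketched with the standard library alone. This is not the utility itself, just an illustration of the approach and of its blind spot; the class name, the sample HTML and the example.com URL are made up.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collect absolute URLs of static resources referenced by a page."""

    # tag -> attribute that holds the resource URL
    RESOURCE_ATTRS = {"img": "src", "script": "src", "link": "href"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        wanted = self.RESOURCE_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative paths against the page URL
                self.resources.append(urljoin(self.base_url, value))


page = """
<html><head>
  <link rel="stylesheet" href="/static/site.css">
  <script src="app.js"></script>
</head><body>
  <img src="images/photo.png">
</body></html>
"""
collector = ResourceCollector("https://example.com/blog/post.html")
collector.feed(page)
for url in collector.resources:
    print(url)
```

A parser like this only sees what is written in the HTML. Anything that app.js loads at runtime, or that site.css pulls in through url(...), stays invisible, which is exactly the gap between parsing a page and rendering it.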

I am really slow in executing and experimenting around these questions. And I feel unless I start working on it, I will keep collecting these questions and add more inertia to taking a crack at the problem.

Clojure: Apply new learning to do things better


Context: I was learning Clojure while actively using it to create a CLI tool at work. In the past I have worked a lot with Python. In this post I am documenting the evolution of certain code as I learn new concepts, improve readability, refactor the solution and run into new doubts. I am also trying to relate this learning process to other content I have come across on the web.

Problem statement:

I have flat CSV data. Some of the rows are related through common values (order_id, product_id, order_date).

Task: Consolidate (reduce) multiple rows that share the same order_id into a single order with a restructured shape.

group-by would take me almost there. But I need a different data format: a single entry per order_id, with all products belonging to it under an items key.

order_id;order_date;firstname;surname;zipcode;city;countrycode;quantity;product_name;product_id
3489066;20200523;Guy;Threepwood;10997;Berlin;DE;2;Product 1 - black;400412441
3489066;20200523;Guy;Threepwood;10997;Berlin;DE;1;Product 2 - orange;400412445
3481021;20200526;Murray;The skull;70971;Amsterdam;NL;1;Product - blue;400412305
3481139;20200526;Haggis;MacMutton;80912;Hague;NL;5;Product 1 - black;400412441

First attempt:

After reading the first few chapters of Brave and True, and with a lot of trial and error, I got the following code to give me the results I wanted:

(defn read-csv
  ""
  [filename]
  (with-open [reader (io/reader filename)]
    (into [] (csv/read-csv reader))))

(defn get-processed-data
  "Given filename of CSV data, returns vector of consolidated maps over common order-id"
  [filename]
  ;; key for order
  (defrecord order-keys [order-id order-date first-name second-name
			 zipcode city country-code quantity product-name product-id])
  (def raw-orders (read-csv filename))
  ;; Drop header row
  (def data-after-removing-header (drop 1 raw-orders))
  ;; Split each row over ; and create vector from result
  (def order-vectors (map #(apply vector (.split (first %) ";")) data-after-removing-header))
  ;; Convert each row vector into order-map
  (def order-maps (map #(apply ->order-keys %) order-vectors))
  ;; Keys that are specific to product item.
  (def product-keys [:product-id :quantity :product-name])
  ;; Keys including name, address etc, they are specific to user and order
  (def user-keys (clojure.set/difference (set (keys (last order-maps))) (set product-keys)))
  ;; Bundle product items belonging to same order. Result is hash-map {"order-id" [{product-item}]
  (def order-items (reduce (fn [result order] (assoc result (:order-id order) (conj (get result (:order-id order) []) (select-keys order product-keys)))) {} order-maps))
  ;; Based on bundled products, create a new consolidated order vector
  (reduce (fn [result [order-id item]] (conj result (assoc (select-keys (some #(if (= (:order-id %) order-id) %) order-maps) user-keys) :items item))) [] order-items))

I am already getting anxious from this code. Firstly, the number of variables is completely out of hand. Only the last expression is an exception, because it returns the result. Secondly, if I tried to club some of the steps together, like dropping the first row, creating a vector and then creating a hash-map, it looked like:

(def order-maps (map #(apply ->order-keys (apply vector (.split (first %) ";"))) (drop 1 raw-orders)))

The code was becoming more unreadable. I tried to compensate with elaborate doc-strings, but they aren't that helpful either.

In Python, when I quickly tried to write an equivalent:

order_keys = ['order-id', 'order-date', 'first-name', 'second-name',
              'zipcode', 'city', 'country-code', 'quantity', 'product-name', 'product-id']
# csv_data is assumed to hold the raw CSV text shown above; skip the header and blank lines
rows = [line.split(';') for line in csv_data.splitlines() if line and 'order' not in line]
order_dict = {}
product_keys = ['quantity', 'product-name', 'product-id']
for row in rows:
    order_id = row[0]
    # The last three columns describe the product item
    item = dict(zip(product_keys, row[-3:]))
    try:
        order_dict[order_id]['items'].append(item)
    except KeyError:
        order_dict[order_id] = {'items': [item],
                                'order-details': dict(zip(order_keys[1:-3], row[1:-3]))}
order_dict.values()

Not the cleanest implementation, but by the end of it I will have consolidated product-items per order in a list with all other details.

And I think this is part of the problem. I was still not fully adapted to the ways of Clojure. I was forcing Python's way of thinking into Clojure. It was time to refactor, learn more and clean up the code.

Threading Macros - Revisiting the problem:

I was lost. My Google queries became vague ("avoid creating variables in clojure"), and I paired with colleagues to get a second opinion. Meanwhile I thought of documenting this process in #Writing-club. As we were discussing what I would be writing, Punchagan introduced me to the concept of threading macros. I was not able to understand or use them right away. It took me time to warm up to their brilliance. I started refactoring the above code into something like:

(ns project-clj.csv-ops
  (:require [clojure.data.csv :as csv]
	    [clojure.java.io :as io]
	    [clojure.set]
	    [clojure.string :as string]))

(defn remove-header
  "Checks first row. If it contains string order, drops it, otherwise returns everything.
  Returns vector of vector"
  [csv-rows]
  (if (some #(string/includes? % "order") (first csv-rows))
    (drop 1 csv-rows)
    csv-rows))

(defn read-csv
  "Given a filename, parse the content and return vector of vector"
  [filename]
  (with-open [reader (io/reader filename)]
    (remove-header (into [] (csv/read-csv reader :separator \;)))))

(defn get-items
  "Given vector of vector of order hash-maps with common id:
   [{:order-id \"3489066\" :first-name \"Guy\":quantity \"2\" :product-name \"Product 1 - black\"  :product-id \"400412441\" ... other-keys}
    {:order-id \"3489066\" :first-name \"Guy\" :quantity \"1\" :product-name \"Product 2 - orange\" :product-id \"400412445\"}]

   Returns:
   {:order-id \"3489066\"
   :items [{:product-id \"400412441\" ...}
	   {:product-id \"400412445\" ...}]}"
  [orders]
  (hash-map
     :order-id (:order-id (first orders))
     :items (vec (for [item orders]
			  (select-keys item [:product-id :quantity :product-name])))))

(defn order-items-map
  "Given Vector of hash-maps with multiple rows for same :order-id(s)
   Returns Vector of hash-maps with single entry per :order-id"
  [orders]
  (->> (vals orders)
       (map #(get-items %) ,,,)
       merge))

(defn user-details-map
  "Given Vector of hash-maps with orders
   Returns address detail per :order-id"
  [orders]
  (->> (vals orders)
       (map #(reduce merge %) ,,,)
       (map #(dissoc % :product-id :quantity :product-name))))

(defn consolidate-orders
  "Given a vector of orders consolidate orders
  Returns vector of hash-maps with :items key value pair with all products belonging to same :order-id"
  [orders]
  (->> (user-details-map orders)
       (clojure.set/join (order-items-map orders) ,,,)
       vector))

(defn format-data
  [filename]
  (defrecord order-keys [order-id order-date first-name second-name
			 zipcode city country-code quantity product-name product-id])
  (->> (read-csv filename)
       (map #(apply ->order-keys %) ,,,)
       (group-by :order-id ,,,)
       consolidate-orders ,,,))

Doubts/Observations

Although it is an improvement over my first implementation, the refactored code raises its own set of new doubts and concerns. Punchagan :

  • How to handle errors?
  • How to short-circuit execution when one of the functions/macros fails (using some->>)?
  • Some lines are still very dense.
  • The Clojure code has gotten bigger.
  • Should this threading be part of a function, and should I write tests for that function?

As the readings and anecdotes shared by people in the articles referred to above suggest, I need to read and write more, continue on the path of Brave and True, and not get stuck in the loop of the advanced beginner.

Second Brain - Archiving: Keeping resources handy


Problem Statement

We are suffering from information overload, especially from the content behind the walled gardens of Social Media Platforms. The interfaces are designed to keep us engaged by presenting us with the latest, most popular and most riveting content. It is almost impossible to revisit the source or refer to such a piece of content sometime later. There are many products that offer exactly that, “Read it later”: moving content off these feeds and providing a focused interface to absorb things. I think the following are some crucial flaws in the model of learning and consuming content from Social Media Platforms:

  1. Consent: Non consensual customization aka optimization of the feed.
  2. Access: There is no offline mode, content is locked in.
  3. Intent: Designed to trigger a response(like, share, comment) from a user.

In the rest of the post I would like to make the case that solving for “Access” would also solve the remaining two problems. When we take the content out of the platform, we have the scope to rewrite the rules of engagement.

Knowledge management

As a user I am stuck in a catch 22 situation. Traditional media channels are still catching up, for any developing story their content is outdated. Social media is non stop, buzzing 24x7. How to get just the right amount of exposure and not get burned? How to regain the right of Choice? How to return to what we have read and established it as a fact in our heads? Can we reminisce our truths that are rooted in these posts?

These feeds are infinite. They are the only source of eclectic content, voices, stories, opinions, hot takes. As long as the service is up, the content would exist. We won’t get an offline experience from these services. Memory is getting cheaper everyday, but so is Internet. Social media companies won’t bother with an offline service because they are in complete control of the online experience, they have us hooked. Most importantly, offline experience doesn’t align with their business goals.

I try to keep a local reference of links and quotes from the content I read on the internet in org files. It is quite an effort to manage that and to maintain the habit. I have tried to automate the process by downloading or making a note of the links I share with other people or come across (1, 2). I will take another shot at it, and I am thinking more about the problem to narrow down the scope of the project. There are many tools, products and practices for organizing knowledge in digital format. They have varying interfaces: annotating web pages, papers and books, storing notes and wiki pages, correlating using tags, etc. I strongly feel that there is a need for not just annotating and organizing but also archiving. Archives are essential for organizing anything. And specifically: Archive all your Social Media platforms. Get your own copy of the data: posts, pictures, videos, links. Just have the dump, that way:

  1. No Big Brother watching over your shoulder when you access the content. Index it, make it searchable. Tag posts, highlight them, add notes; index those too, so they can also be searched.
  2. No Censorship: Even if an account you follow gets blocked or deleted, you don’t lose the content.
  3. No link rot: If a link shared in a post is taken down, becomes private or gets blocked, you will have your own copy of it.

This tool, the Archives, should be personal. Store locally or on your own VPS; just enable users to archive the content in the first place. How we process the content is a different problem. It is related to, and part of, the bigger problem of how we consume content. An ecosystem for plugging the archives into existing products can and will evolve.

Features:

In the P.A.R.A method, a system to organize all your digital information, they talk about Archives. It is a passive collection of all the information linked to a project. In our scenario, the archive is a collection of all the information from your social media. In that sense, I think this Archive tool should have the following features:

  • Local archive of all your social media feeds. From everyone you follow, archive what they share:
    • Web-pages, blogs, articles.
    • Images.
    • Audios, podcasts.
    • Videos.
  • Complete social media timelines from all your connections are accessible and available locally. Customize, prioritize, categorize, do whatever you would like to do. Take back control.
  • Indexed and searchable.
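As a rough idea of what "indexed and searchable" could mean at the smallest scale, here is a sketch of an inverted index over archived text. It is an assumption about one possible approach, not a design for the tool; the document ids and texts are invented, and a real archive would more likely lean on an existing search engine or database.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Made-up archived items keyed by a hypothetical id.
archive = {
    "post-1": "Air quality sensors in Bikaner",
    "post-2": "Archiving a webpage with all its resources",
    "post-3": "Personal archiving and air quality notes",
}
index = build_index(archive)
print(sorted(search(index, "air quality")))  # prints: ['post-1', 'post-3']
```

Tags, highlights and notes could be indexed the same way, by treating each of them as another piece of text attached to the same id.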

Existing products/methods/projects:

The list of products is ever growing. Here are a few references that I found most relevant:

Thank you punchagan for your feedback and review of the post.

Striking a balance between Clobbering and Learning


Getting stuck as an "Advanced Beginner" happens, especially when we use a new tool or language to deliver a product or project. I have noticed that I approach things with a narrow mindset: I use the tool or language to deliver what is desired. The result has the expected features, but its implementation isn't ideal. The process of unlearning these habits is long, and often, with a deadline, I end up collecting tech debt. Recently I came across some links that talked about this phenomenon:

Related Conversations on Internet

There was a big thread on HackerNews around better way to learn CSS(https://news.ycombinator.com/item?id=23868355) and I found this comment relevant to my experience:

They always assume every one learned like them, by trying stuff out all of the time, until they got something working. Then they iterate from project to project, until they sorted out the bad ideas and kept the good ones. With that approach, learning CSS would probably have taken me 10 times as long.

Sure this doesn't teach you everything or makes you a pro in a week, but I always have the feeling people just cobble around for too long and should instead take at least a few days for a more structured learning approach.

The last statement of the comment struck a chord: clobbering has its limitations, and it needs to be followed up with reading the fundamental concepts from a book, manual or docs.

Another post that was shared on HackerNews talks about Expert Beginner paradox: https://daedtech.com/how-developers-stop-learning-rise-of-the-expert-beginner/

There’s nothing you can do to improve as long as you keep bowling like that. You’ve maxed out. If you want to get better, you’re going to have to learn to bowl properly. You need a different ball, a different style of throwing it, and you need to put your fingers in it like a big boy. And the worst part is that you’re going to get way worse before you get better, and it will be a good bit of time before you get back to and surpass your current average.

Practices that can help with the process of clobbering and learning:

  1. Tests: unit tests give code a structure. They set basic expectations on how the code should and should not behave. If we keep those expectations consistent throughout the code base, unit tests help maintain a certain uniformity and quality.
  2. Writing documentation: for me this is like rubber duck debugging. It gives active feedback on what the deliverables, supported features, limitations, and upcoming features are.
  3. Pairing with colleagues over the concepts and implementation. Walking through the code and explaining it to colleagues helps me identify sections of code that make me uncomfortable: where I am weak and where I should focus to improve.
  4. Though similar to pairing, Code Reviews have their own importance and value.

These practices won't replace the need to read the docs or a book, but they would certainly give you good quality code and keep your tech debt in check.

Clojure, hash-map, keys, keyword


tl;dr: Plain strings can be used as keys in a hash-map. Either use get to look them up, or convert them into keywords using the keyword function.

The hash-map is one of the essential data structures of Clojure. It supports keywords as keys, an interesting feature that can really improve the lookup experience.

;; A map with keywords as keys
user=> (def languages {:python "Everything is an Object."
  #_=> :clojure "Everything is a Function."
  #_=> :javascript "Whatever you would like it to be."})
;; To lookup in map
user=> (:python languages)
"Everything is an Object."
user=> (get languages :ruby)
nil

The syntax is easy to understand and easy to follow. So far so good. I started using it here and there. At some point I ran into a situation where I had to do a lookup in a map using a variable:

user=> (def brands {:nike "running shoes"
  #_=> :spalding "basketball"
  #_=> :yonex "badminton"
  #_=> :wilson "tennis racquet"
  #_=> :kookaburra "cricket ball"})

(def brand-name "yonex")

Because we have used keywords in the map brands, we can't use the value stored in the variable brand-name directly to do a lookup in the map. I tried silly things like :str(brand-name) (results in an Execution error) or :brand-name (returns nil). I got confused about how to do this. Almost all examples in the docs were using keywords. I tried a few things and understood that we can indeed use strings as keys, and fetch the value with the get function:

user=> (def brands {"nike" "running shoes"
  #_=> "spalding" "basketball"
  #_=> "yonex" "badminton"
  #_=> "wilson" "tennis racquet"
  #_=> "kookaburra" "cricket ball"})
#'user/brands
user=> (get brands brand-name)
"badminton"

While keywords have a simpler syntax, at times, such as when I am using external APIs, it is easier to work with strings or to look up a key in a hash-map using a variable. In Python I do it all the time. Though I am not sure if using strings as keys is the recommended way.

Update <2020-07-08 Wed>

punch and I were discussing this post and he mentioned that in Lisp we can use keyword as a function to convert a string into a keyword. After a quick search, TIL, we can indeed call keyword on a variable. The function converts a string into an equivalent keyword, which can then be used to look up a value in a map that has keywords as keys:

user=> (def brands {:nike "running shoes"
  #_=> :spalding "basketball"
  #_=> :yonex "badminton"
  #_=> :wilson "tennis racquet"
  #_=> :kookaburra "cricket ball"})
#'user/brands
user=> (keyword brand-name)
:yonex
user=> ((keyword brand-name) brands)
"badminton"
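
Both approaches can be put side by side. The following is only a minimal sketch, assuming the data arrives with string keys, as it might from an external API. The names api-response, keyed-brands and sport-for are made up for illustration; keywordize-keys comes from clojure.walk.

```clojure
(require '[clojure.walk :as walk])

;; Hypothetical data, shaped like something an external API might return.
(def api-response {"yonex" "badminton"
                   "wilson" "tennis racquet"})

(def brand-name "yonex")

;; Option 1: keep the string keys and look up with get.
(get api-response brand-name)
;; => "badminton"

;; Option 2: convert the keys to keywords once, then build the
;; keyword from the variable at lookup time.
(def keyed-brands (walk/keywordize-keys api-response))

(defn sport-for
  "Looks up a brand given as a string in a map with keyword keys.
  Returns \"unknown\" when the brand is not in the map."
  [m brand]
  (get m (keyword brand) "unknown"))

(sport-for keyed-brands brand-name)
;; => "badminton"
(sport-for keyed-brands "puma")
;; => "unknown"
```

The third argument to get is a default value, which avoids a bare nil when the key is missing.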

Clojure Command Line Arguments II


I wrote a small blog post on parsing command line arguments earlier this week. The only comment I got on it from punch was:

I'd like to learn from this post why command-line-args didn't work. What is it, actually, the command-line-args thingy, etc.?

Those are good questions, and I didn't know the answers. As I said in that post, I wanted a simple solution for parsing args, and another confusing experience brought me back to the same question: why do args behave the way they do, and what is *command-line-args*? I still don't have the answers. In this post I am documenting two things for my own better understanding: first, reproducing the issue of the jar not working with *command-line-args*; second, the limited set of sequence features that args supports.

Reproducing behaviour of jar (non)handling of *command-line-args*

We create a new project using lein:

$ lein new app cli-args
Generating a project called cli-args based on the 'app' template.
$ cd cli-args/
$ lein run
Hello, World!

We edit src/cli_args/core.clj to print args and *command-line-args*:

cat <<EOF > src/cli_args/core.clj
(ns cli-args.core
  (:gen-class))

(defn -main
  "I don't do a whole lot ... yet."
  [& args]
  (println "Printing args.." args)
  (println "Printing *command-line-args*" *command-line-args*))
EOF

$ lein run optional arguments

Now we create a jar using lein uberjar:

$ lein uberjar
Compiling cli-args.core
Created target/uberjar/cli-args-0.1.0-SNAPSHOT.jar
Created target/uberjar/cli-args-0.1.0-SNAPSHOT-standalone.jar
$ cd target/uberjar/
$ java -jar cli-args-0.1.0-SNAPSHOT-standalone.jar testing more optional arguments
Printing args.. (testing more optional arguments)
Printing *command-line-args* nil

When run through lein, *command-line-args* is populated, but when the jar is run directly with java it is nil. That narrows down the problem and can possibly lead to an explanation of why it is happening (maybe in another post).
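
A possible workaround, sketched under the assumption that the var is simply left unset when java calls -main directly (not verified here): *command-line-args* is a dynamic var, so it can be rebound from args inside the entry point. The names current-task and run-with-args are made up for illustration, and run-with-args stands in for -main.

```clojure
;; Hypothetical helper that reads the dynamic var instead of taking args.
(defn current-task []
  (first *command-line-args*))

;; Stand-in for -main: rebind *command-line-args* to whatever was passed in,
;; so code that depends on the var sees the same values under lein and java.
(defn run-with-args [& args]
  (binding [*command-line-args* args]
    (current-task)))

(run-with-args "testing" "more" "optional" "arguments")
;; => "testing"
```

Code called inside the binding form sees the rebound value; outside of it the var keeps whatever value it had before.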

Sequence features supported by args

I noticed another anomaly with args. I was passing a couple of arguments and I noticed that get doesn't work on it.

cat <<EOF > src/cli_args/core.clj
(ns cli-args.core
  (:gen-class))

(defn -main
  "I don't do a whole lot ... yet."
  [& args]
  (println "first argument" (first args))
  (println "second argument" (second args))
  (println "third argument" (get args 2)))
EOF

This is what I noticed as I tried different inputs:

$ lein run
first argument nil
second argument nil
third argument nil
$ lein run hello world
first argument hello
second argument world
third argument nil
$ lein run hello world 3rd argument
first argument hello
second argument world
third argument nil

get doesn't work on args. I printed the type of args and it is clojure.lang.ArraySeq. In my case, I "managed" by using last, and that gave me what I wanted. Still, I am running out of options and I will have to either dig deeper to understand args or fall back to using a library (tools.cli).
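
A likely explanation, offered as a sketch rather than a full answer: get is meant for associative collections such as maps and vectors, and it returns nil for a plain seq, while nth does a positional lookup on a seq. Positions are zero-based, so the third argument sits at index 2. The function names below are made up for illustration.

```clojure
;; Rest arguments arrive as a seq, just like in -main.
(defn third-with-get [& args]
  ;; get on a seq finds nothing and returns nil.
  (get args 2))

(defn third-with-nth [& args]
  ;; nth does a positional lookup; the final nil is the value returned
  ;; when fewer than three arguments were passed.
  (nth args 2 nil))

(defn third-with-vec [& args]
  ;; A vector is associative by index, so get works after conversion.
  (get (vec args) 2))

(third-with-get "hello" "world" "3rd" "argument")
;; => nil
(third-with-nth "hello" "world" "3rd" "argument")
;; => "3rd"
(third-with-vec "hello" "world" "3rd" "argument")
;; => "3rd"
(third-with-nth "hello" "world")
;; => nil
```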

Command line arguments with Clojure


I am new to Clojure land and I am working on a command line tool using it. I found the tools.cli library for processing command line arguments. From the documentation it looked like it has a lot of features, but I got overwhelmed by it. I wanted something simpler. For me, simpler meant something that would let me get more comfortable with Clojure syntax. My requirements were straightforward: the first argument would be the name of the task and the rest of the arguments would belong to that task.

While searching, ClojureDocs showed *command-line-args*:

A sequence of the supplied command line arguments, or nil if none were supplied

It looked good. The syntax was easy. I could do (first *command-line-args*) to get the first argument, (second ..) would give me the second, and so on. I tested it with lein run arg1 arg2 and it worked as expected. Fine, done.

Later, I created a standalone jar of the tool (using lein uberjar). Strangely, when I passed arguments with java -jar cli-tool.jar arg1 arg2, my command line arguments were not picked up. It seems *command-line-args* didn't work with java (?). I checked that the main function takes & args as its argument, and that it is a sequence. From the book Clojure for the Brave and True:

The term sequence here refers to a collection of elements organized in linear order

And

Lists, vectors, sets, and maps all implement the sequence abstraction.

So ideally I should be able to do (first args), and that should work just like it did for *command-line-args*. I quickly tried that: I replaced every *command-line-args* with args. lein run worked as expected, and the standalone jar was also able to process my command line arguments. Cheers for the abstraction :)
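
The requirement described above, first argument as the task name and the rest as that task's arguments, could be sketched with destructuring on args. This is only an illustration, not the code of the actual tool: the task names "greet" and "count" and the function run-task are made up.

```clojure
(require '[clojure.string :as string])

(defn run-task
  "Dispatches on the first argument; the remaining ones belong to the task."
  [& args]
  (let [[task & task-args] args]
    (if (nil? task)
      "No task given"
      (case task
        "greet" (str "Hello, " (string/join " " task-args))
        "count" (count task-args)
        (str "Unknown task: " task)))))

;; Entry point: works the same under lein run and java -jar,
;; because it only relies on args.
(defn -main [& args]
  (println (apply run-task args)))

(run-task "greet" "arg1" "arg2")
;; => "Hello, arg1 arg2"
(run-task "count" "a" "b")
;; => 2
```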