Using setTimeout in JavaScript

tl;dr: ALWAYS take at least a brief look at the official developer documentation of the functions you use.

I was trying to rush a release over the weekend and had a requirement to make repeated API calls to track the status of a task. Without that, any new user would be looking at a "dead page" with no info on what is going on or how to proceed. The pseudo code was something like:

function get_task_status(task_id) {
  $.get("/get_task_status/", {'task_id': task_id})
    .done(function(data) {
      // Update status div
      // Wait for x seconds and repeat this function
    });
}

As usual, I hurried to Google for template/pointer code. StackOverflow didn't disappoint and I landed on this discussion. It had decent insight on using a callback function with setTimeout, and I cooked up my own version of it:

function get_task_status(task_id) {
  $.get("/get_task_status/", {'task_id': task_id})
    .done(function(data) {
      if (data['task'] === 'PROGRESS') {
        Materialize.toast(data['info'], 1000);
        setTimeout(get_task_status(task_id), 2000);
      }
    });
}

Looks innocent, right? Well, that's what kept me stumped for almost 3-4 hours. I tried this and my JavaScript happily ignored setTimeout and its delay, and kept making continuous GET requests. I tried some variants of the above code but nothing worked. Eventually I came across this post on SO, tried the code and it worked! I was convinced that there was some issue with how older versions handled setTimeout and that the 2016 update was what I needed.

Today, as I sat down to put together a brief note on this experience, I tested setTimeout code on node, in the browser console, and inside an HTML template, and somehow each time the delay worked just fine:

> function get_task_status(task_id) {
...   console.log(Date());
...   // Recursive call to this function itself after 2 seconds of delay
...   setTimeout(get_task_status, 2000);
... }
> get_task_status('something');
Tue Oct 25 2016 15:55:59 GMT+0530 (IST)
> Tue Oct 25 2016 15:56:01 GMT+0530 (IST)
Tue Oct 25 2016 15:56:03 GMT+0530 (IST)
Tue Oct 25 2016 15:56:05 GMT+0530 (IST)
Tue Oct 25 2016 15:56:07 GMT+0530 (IST)
Tue Oct 25 2016 15:56:09 GMT+0530 (IST)
(To exit, press ^C again or type .exit)

Again, this bummed me out; I thought I had "established" that setTimeout was broken and that promises were what I should be looking into for a better understanding. While trying to work out what was wrong, I checked the MDN documentation of the function and finally realized my real bug. The syntax of the function is: var timeoutID = window.setTimeout(func[, delay, param1, param2, ...]);

And this is what I was doing: setTimeout(get_task_status(task_id), 2000);

Notice that in the syntax, params come after the delay argument, while I had passed them directly: setTimeout(get_task_status(task_id), 2000) calls get_task_status immediately and schedules its return value, so the delay never applied. That was the small gotcha. I was talking to Syed ji about this experience and he pointed me to the You Don't Know JS series for a better understanding of JavaScript concepts and nuances. I learned my lesson: properly RTFM. As for promises, I will return to learn more about them later; at the moment my code is working.
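
For the record, the working version simply passes the function reference and moves task_id after the delay, per the MDN syntax above (the anonymous-function variant in the comment is equivalent):

function get_task_status(task_id) {
  $.get("/get_task_status/", {'task_id': task_id})
    .done(function(data) {
      if (data['task'] === 'PROGRESS') {
        Materialize.toast(data['info'], 1000);
        // Pass the function itself (no parentheses); task_id goes
        // after the delay argument.
        setTimeout(get_task_status, 2000, task_id);
        // Equivalent: setTimeout(function() { get_task_status(task_id); }, 2000);
      }
    });
}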


PEBKAC

Yesterday punchagan introduced me to PEBKAC - Problem Exists Between Keyboard And Chair - as a Things I Learned (TIL). I had experienced this many times before, only I didn't know that there was a term for it.

Starting in April, at TaxSpanner I was given the task of integrating the ITD Webservices APIs for return filing and other features with our existing stack. The procedure included quite a few things alien to me. The sample code provided by ITD was in Java, using something called the Spring framework; our requests had to be routed via a specific proxy approved by ITD; and furthermore, we physically needed a USB DSC key registered with ITD to encrypt the communication.

As I was trying to get the first successful run of the API working from my system, I was particularly stuck on accessing the DSC key from my Java code. It needed the drivers available here and the Java security file (under /etc/java-7-openjdk/security/) to be edited properly to use the correct drivers. After doing these things, the first thing I tried was to list the certificates on the USB token using keytool. And on the first run, it worked fine. I was ecstatic: one fewer unknown from the pile of unknowns, right? Wrong. As soon as I tried to run Java programs using the DSC, it threw up lines and lines of errors which went something like:

org.springframework.beans.factory.parsing.BeanDefinitionParsingException: Configuration problem: Unable to locate Spring NamespaceHandler for XML schema namespace []
Offending resource: class path resource [ClientConfig.xml]

	at org.springframework.beans.factory.parsing.FailFastProblemReporter.error(
	at org.springframework.beans.factory.parsing.ReaderContext.error(
	at org.springframework.beans.factory.parsing.ReaderContext.error(
	at org.springframework.beans.factory.xml.BeanDefinitionParserDelegate.error(
	at org.springframework.beans.factory.xml.BeanDefinitionParserDelegate.parseCustomElement(
	at org.springframework.beans.factory.xml.BeanDefinitionParserDelegate.parseCustomElement(
	at itd_webs.core.main(Unknown Source)
2016-05-11 12:03:48,114 [main] WARN  org.apache.cxf.bus.spring.SpringBusFactory -  Failed to create application context.
org.springframework.beans.factory.parsing.BeanDefinitionParsingException: Configuration problem: Unable to locate Spring NamespaceHandler for XML schema namespace []
Offending resource: class path resource [ClientConfig.xml]

Getting the configs in place so that the Spring framework could load the proper credentials from the DSC was another task where inputs from Nandeep proved very crucial. Thankfully we had one more system where this setup with the DSC worked, so it was clear that this particular DSC-recognition error was just on my system. After a lot of head scratching, comparing the two systems to see if something was amiss, and trying strace, nothing helped. After scrolling through a lot of Java-related StackOverflow conversations, I was playing around with keytool and jdb. As I tried jdb, I noticed the DSC blinking, and I thought that maybe the default java was using different configs. I checked my /usr/lib/jvm and indeed there were 4 different versions of Java. `java -version` pointed to java version "1.8.0_65", so instead I compiled and ran using /usr/lib/jvm/java-1.7.0-openjdk-amd64/bin/java, and the USB blinked happily ever after. While we kept developing and making the system stable on the computer where things were working, it took almost two weeks to narrow down the exact issue on my system. And now I have a name for all the time spent: PEBKAC. Thank you punchagan.
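
In hindsight, the whole hunt boiled down to something like this (a sketch; the class name is taken from the stack trace above, and the exact run invocation is assumed):

# List the JVMs installed on the system; there were 4 of them
$ ls /usr/lib/jvm

# The default java on the PATH was 1.8, which was not the JVM
# whose security config had been set up for the DSC drivers
$ java -version

# Explicitly using the configured 1.7 JVM made the token work
$ /usr/lib/jvm/java-1.7.0-openjdk-amd64/bin/java itd_webs.core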

Using Hekad to parse logs for relevant parts

We at TaxSpanner have been looking at different options for an analytics pipeline and setting things up to capture relevant information. Vivek had noticed Hekad and was wondering if we could use the syslogs which are already being generated by the app. The idea was to look for specific log lines in a format that could contain information like app name, model, UUID, and the operation being performed.

We followed the basic setup guide to get a feel of it, and it was processing nginx logs in the millions very quickly. Apart from the official documentation, this post talks about how to set up a quick filter around the Hekad processing pipeline. We experimented with a client-server setup where a client running on the app server tails the Django log file, filters the relevant log messages, and pushes them to a server hekad instance that aggregates logs from all app servers.

This was the client-side hekad config TOML file:

# NOTE: the [section] headers below were lost in the original paste;
# the names are assumptions, chosen so that the input/filter names
# line up with the Logger fields matched further down.
[hekad]
maxprocs = 1
base_dir = "."
share_dir = "hekad-location/share/heka/"

# Input is the django log file
[django_logs]
type = "LogstreamerInput"
splitter = "TokenSplitter"
log_directory = "logs/"
file_match = 'django\.log'

# Filter to parse logs and extract the relevant messages
[DjangoLogDecoder]
type = "SandboxFilter"
message_matcher = "Logger == 'django_logs'"
filename = "lua_decoders/django_logs.lua"

# Encoder for output streams
[PayloadEncoder]

# We channel output generated from DjangoLogDecoder to a certain UDP port
[DjangoLogOutput]
type = "UdpOutput"
message_matcher = "Logger == 'DjangoLogDecoder'"
address = ":34567"
encoder = "PayloadEncoder"

The Lua script to filter the relevant log lines is pretty small:

local string = require "string"
local table = require "table"

-- This structure could be used in a better way
-- (Type here is a placeholder string; the original snippet used an
-- undefined msg_type variable)
local msg = {
    Timestamp   = nil,
    Type        = "django_log",
    Payload     = nil,
    Fields      = nil
}

function process_message ()
    local log = read_message("Payload")
    if log == nil then
      return 0
    end
    -- split the log line on whitespace
    local log_blocks = {}
    for i in string.gmatch(log, "%S+") do
      table.insert(log_blocks, i)
    end
    -- the third block is the log level; forward only CRITICAL lines
    -- (inject_message was missing from the truncated original)
    if table.getn(log_blocks) >= 4 then
      if log_blocks[3] == "CRITICAL" then
        msg.Payload = log
        inject_message(msg)
      end
    end
    return 0
end

With the client instance in place, we now get our listener config sorted out:

# NOTE: as above, the [section] headers and the output's type were
# lost in the paste; the names here are assumptions.
[hekad]
maxprocs = 4
base_dir = "."
share_dir = "hekad-location/share/heka/"

# Input listening on the UDP port the clients push to
[app_logs]
type = "UdpInput"
address = ":34567"

[PayloadEncoder]

# Output matches the messages received and just prints them
[AppLogOutput]
type = "LogOutput"
message_matcher = "Logger == 'app_logs'"
encoder = "PayloadEncoder"

And that's it; this puts a basic hekad-based pipeline in place which can pick the relevant information out of the Django logs.
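
Each instance is started by pointing hekad at its config file (the file names here are my own; use whatever you saved the configs as):

# On each app server: the client that tails and filters the django log
$ hekad -config=client.toml

# On the aggregating server: the UDP listener
$ hekad -config=server.toml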

Issues with indexing while using Cassandra

We have a single-machine Cassandra setup on which we are trying different things for analytics. One of the column families we have goes with this description:

CREATE TABLE playground.event_user_table (
    event_date date,
    event_time timestamp,
    author text,
    content_id text,
    content_model text,
    event_id text,
    event_type text,
    PRIMARY KEY (event_date, event_time)
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': ''}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
CREATE INDEX author ON playground.event_user_table (author);

With this table in place, we populated it with data for different apps/models. Now when we query the system with something like:

cqlsh:playground> select * from event_user_table where event_date = '2013-06-02' ;
 event_date | event_time               | author                                 | content_id                                      | content_model | event_id | event_type
 2013-06-02 | 2013-06-02 00:00:00+0000 |        |        |           A   |          |  submitted
 2013-06-02 | 2013-06-02 01:28:13+0000 |      |                                      1000910424 |           B   |          |     closed
 2013-06-02 | 2013-06-02 01:59:31+0000 |         |         |           A   |          |    created
 2013-06-02 | 2013-06-02 02:00:44+0000 |            |            |           A   |          |    created
 2013-06-02 | 2013-06-02 02:02:16+0000 |       |       |           A   |          |    created

The result looks good and as expected. But when I query the system on the secondary index on author, I get empty or partial results:

cqlsh:playground> select * from event_user_table where author = '' ;
 event_date | event_time | author | content_id | content_model | event_id | event_type

(0 rows)

cqlsh:playground> select * from event_user_table where author = '' ;
 event_date | event_time               | author                             | content_id | content_model | event_id | event_type
 2014-01-18 | 2014-01-18 09:01:52+0000 | | 1001068325 |           SRF |          |     closed

(1 rows)

And I have tried this combination for the PRIMARY KEY too: ((event_date, event_time), author), but with the same results. There are known issues with secondary indexes and scaling1, but do they affect single-node systems too? I am not sure about it. Time to confirm things.

Update1 <2016-02-10 Wed 15:45>: As mentioned here2, Cassandra has "'lazy' updating to secondary indexes. When you change an indexed value, you need to remove the old value from the index." Could that be the reason?
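
One way to test that theory (a sketch, not yet verified) would be to force a rebuild of the index from the current data and re-run the query. nodetool's rebuild_index takes the keyspace, table, and index name; note that on some Cassandra versions the index name needs the table prefix, i.e. event_user_table.author:

$ nodetool rebuild_index playground event_user_table author
$ cqlsh -e "select * from playground.event_user_table where author = '...';"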

Extending Zulip to provide a Chat-With-Us helpdesk interface

Here is the github repo:

We at TaxSpanner had been using Olark to help our customers over a chat interface. While their interface is very mature and provides all the features they mention, we were not able to manage how a customer was reaching out to us via different mediums (email, chat, phone). As Zulip got released, and we were giving it a try for our internal team communication (both the tech and sales teams), the idea was floated: could we extend the same interface to our customers too?

We set the target of getting a prototype in place which could replace Olark; if, during trials, the feature requests turned out reasonable and manageable, we would take a call. So the idea was to expose a limited view to customers while keeping the full-featured interface for our support team. We created two stripped-down, simple HTML templates which would be exposed to customers, and got a view in place along the lines of Zulip's home view, with additional logic to route customers to the sales team: notifications for offline and online sales team members, and additional small checks in the main Zulip views to make sure the full views are exposed only to the internal team.

For example, in the function get_status_dict in zerver/libs/ there is a hook for MIT users,

# Return no status info for regular users
if requesting_user_profile.realm.domain != settings.ADMIN_DOMAIN:
    return defaultdict(dict)

and we added an additional check to the home view:

from zerver.forms import has_valid_realm
from django.contrib.auth import logout

if not has_valid_realm(request.user.email):  # argument assumed; snippet was truncated
    # making sure the user is logged out from the session
    logout(request)
    return redirect('/support')


For adding this interface on the landing page, we added the following HTML:

<div id="zulip" style="bottom: 0px; right: 0px; position: fixed; z-index: 9999;" >
    <div style="width: 200px; height: 32px;" id="zulip-chat">
	    <span class="icon--speech-bubble"></span>
	    Chat with Us!
      <iframe id="zulip-iframe" height="350px" width="450px" style="border:1px solid gray; display: none;" src=""></iframe>

and this JavaScript code to handle user clicks:

$('#zulip-chat').on('click', function(e) {
  if ($('#zulip-iframe').is(':hidden')) {
    // load the Zulip instance lazily, on first click
    $('#zulip-iframe').attr('src', 'https://url-to-zulip-instance');
    $('#zulip-iframe').show();  // assumed: the original snippet was truncated here
  }
});


At the moment we auto-login the customer after asking for their email id. For security purposes, we create a new private stream for the customer and unsubscribe them from previously existing streams. Ideally there should be a way (OAuth or server-to-server authentication) to make sure the user is logged in on the main site and then enable the history of chats they have had before.
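
The per-customer stream setup could be scripted against the Zulip API; here is a minimal sketch using the Python bindings, assuming the add_subscriptions/remove_subscriptions calls with principals and invite_only arguments (the bot account, stream naming, and old_streams list are all hypothetical):

import zulip

# Hypothetical support-bot credentials and Zulip instance URL
client = zulip.Client(email="support-bot@example.com",
                      api_key="...",
                      site="https://url-to-zulip-instance")

def setup_customer_stream(customer_email, old_streams):
    # Create an invite-only (private) stream for this customer and
    # subscribe the customer plus the support bot to it.
    client.add_subscriptions(
        streams=[{"name": "customer-%s" % customer_email}],
        principals=[customer_email, "support-bot@example.com"],
        invite_only=True,
    )
    # Unsubscribe the customer from any previously existing streams.
    client.remove_subscriptions(old_streams, principals=[customer_email])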

I am not exactly sure about this (ab)use-case of Zulip, and quite possibly there is something I am overlooking or missing. But Zulip has a really strong chat interface, and if we can integrate our APIs around it, it will give us a lot of control. Adding bots, notifications, and some simple intelligence, and developing things on top of it, could enable and extend the existing web application in a lot of ways.

Setting up Zulip with Docker

UPDATE <2015-12-20 Sun>: Zulip repo itself now has docker support which is much cleaner than what I had done.

UPDATE <2015-11-15 Sun>: webpack with this config is not working neatly, so while running the docker image I am checking out revision ae04744.

After attending RC, punchagan had mentioned Zulip quite a few times on the google-talk based bot which we use for communication. There were a few discussions in the group about trying something else, like Slack or another tool, which could cater to a group spread across different channels (WhatsApp, Hangouts, legacy google-talk), along with rich features like sharing photos, docs, etc. As Zulip got released some time back, we were excited to give it a try.

It got released on a Friday, so we thought of giving it a shot over the weekend and trying to get something we could play around with. As we looked at the initial documentation, the setup involved a lot of sudos. While we had access to a server, we didn't want to experiment at the system level, so we thought of trying Docker. With a sample ubuntu image we quickly got all the dependencies installed and were able to get it running.

To get a "standard" setup we thought we will try to run different services on different containers like instructed here. But soon we ran into issue of manually editing code in zulip setup to initialize different services(DB, rabbitmq etc). And I think I was not doing it the right way, I was installing most of packages in all containers. So we reverted to having a Dockerfile with one container having complete setup.

Now there is a PR related to Docker setup for Zulip too. But being a starter with Docker, I wasn't quite sure of all the steps being done there. We ended up with this setup: a Dockerfile, a shell script, and two custom scripts (in one, there is a check in a subprocess command to make sure process_fts_updates works without a password, but that wasn't working from the shell script; the second just comments out a part where certain checks are made during registration).

Build the docker image:

$ sudo docker build -t zulip-instance .

And finally to get the instance up:

$ sudo docker run --name zulip -i -t -p 9991:9991 zulip-instance

There are still issues. For example, some bots need to be present initially to be able to create users and realms, and the management commands to do the same (creating a bot user) seem to have a circular dependency with the "notify_created_user" step. Or maybe I have overlooked something during this step. Also, these are development settings; for production, things would be different.

FOSSEE Project

I got a mail from Prabhu (PR) a few days back about the extension of the FOSSEE project and that they are looking for people for different roles. I had joined the project fairly early (July 2010), when it was getting started and we were trying different ways to show the power of Python as an alternative to the popular proprietary tools used in academia, in terms of writing code, calculations, speed, everything.

We ran workshops in different colleges, recorded screencasts, organized events, wrote documentation, and even built a full-fledged course, SEES, around Python and various tools which could improve students' skills in handling their courses and code with good practices.

Though this sounds like a lot of documentation work, and not exactly like a "developer profile", I think that for starters, reaching the level where you yourself understand these concepts, algorithms, and good practices is the biggest thing on offer. And this time, as PR mentioned in the mail, they are also looking for developers to contribute, develop, and push things with some mainstream Open Source projects. Personally, I met wonderful people there: Prabhu, Asokan, punchagan, madhu, vattam, dusual, and many more from an active and thriving community. There is always good scope and room to grow.

When I had joined the project initially, I was excited about contributing, being able to sit through classes, etc. The mission of the project is high-minded, and this kind of adaptation and acceptance is never overnight, always gradual. I remember being frustrated about the lack of enthusiasm from the participants and about not seeing results. But that's the thing: you get to take a chance, try things, see if they work; if not, adapt, try more, but try.

Note: I worked with the team only in the very beginning and only for a year, so I am not aware of how things currently work at FOSSEE; I think contacting them directly for the latest update would be best.