Can you avoid unleashing Zalgo in NodeJS?

Yes, you probably can, but first you need to know what unleashing Zalgo even means.

By the way, the term Zalgo comes from a meme. Basically Zalgo happens to be an Internet legend about an ominous entity believed to cause insanity, death and destruction of the world. Zalgo is often associated with scrambled text on webpages and photos of people whose eyes and mouth have been covered in black.

The concept of Zalgo in NodeJS was first talked about by Isaac Z. Schlueter (one of the core developers on the NodeJS project) in one of his blog posts. Isaac’s post was in turn inspired by a post about callbacks on Havoc’s blog.

Developers often make mistakes while building large applications. However, some mistakes can be highly troublesome to find and fix. Unleashing Zalgo is one of those types of mistakes.

In this post, we will learn how to identify a Zalgo-like situation in our code and how we can avoid falling into this trap.

1 – Synchronous or Asynchronous Function = Unpredictable

If the Reactor Pattern is the engine of NodeJS, the callback pattern is the fuel on which it runs.

The callback pattern is what makes it possible for JavaScript to handle concurrency despite being (technically) single-threaded.

However, callbacks in NodeJS can be asynchronous as well as synchronous. In the post about NodeJS Callback Pattern, we saw how the order of instructions can change radically depending on whether a function is synchronous or asynchronous.
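
To make the difference concrete, here is a minimal sketch of the two orderings. The helper names runSync() and runAsync() are made up for this illustration and are not part of NodeJS; the order array is only there to record what happened when.

```javascript
// Hypothetical helpers for illustration only.
const order = [];

function runSync(callback) {
    callback(); // invoked before runSync() returns
}

function runAsync(callback) {
    setTimeout(callback, 0); // invoked on a later turn of the event loop
}

order.push("before");
runSync(() => order.push("sync callback"));
runAsync(() => order.push("async callback"));
order.push("after");

// The async callback has not run yet at this point.
console.log(order.join(" -> ")); // prints "before -> sync callback -> after"
```

The synchronous callback runs in the middle of the surrounding code, whereas the asynchronous one runs only after the current code has finished.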

As expected, this has significant consequences for the correctness and efficiency of our programs. Of course, it is not that hard to handle a function that clearly states whether it is synchronous or asynchronous.

The problem arises when a function behaves inconsistently.

In other words, what if a function can run synchronously in certain conditions and asynchronously in some other conditions?

Basically, such a function is an unpredictable function.

Unpredictable functions and the APIs they expose can lead to several problems that are hard to detect and reproduce. In other words, by building such functions in your own code, you are unleashing Zalgo.

2 – Unleashing Zalgo in NodeJS

Let us understand the concept of unleashing Zalgo in NodeJS using the example of an unpredictable function.

Check out the below code:

let cache = {};

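function getStringLength(text, callback) {
    if (cache[text]) {
        // Cache hit: the callback runs synchronously, before this function returns
        callback(cache[text]);
    } else {
        // Cache miss: the callback runs asynchronously, after the timer fires
        setTimeout(function() {
            cache[text] = text.length;
            callback(text.length);
        }, 1000);
    }
}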

The getStringLength() function is inherently evil. Though it is used to simply calculate the length of a string, it has two faces.

If the string and its length are available in the cache, the function behaves synchronously and invokes the callback immediately with the cached value.

Otherwise, it calculates the length of the string and stores the result in the cache before triggering the callback. However, all of this is done asynchronously using a setTimeout().

Do note that the use of setTimeout() is to force an asynchronous behaviour. You can replace it with any other asynchronous activity such as reading a file or making an API call. The idea is to demonstrate that a function can have different behaviour for different situations.

“But how does it unleash Zalgo?” you may ask.

Demo

Let us write some more logic to actually use this unpredictable function. See the completed code below:

function sleep(milliseconds) {
    return new Promise(resolve => setTimeout(resolve, milliseconds));
}

let cache = {};

function getStringLength(text, callback) {
    if (cache[text]) {
        callback(cache[text]);
    } else {
        setTimeout(function() {
            cache[text] = text.length;
            callback(text.length);
        }, 1000);
    }
}

function determineStringLength(text) {
    let listeners = []
    getStringLength(text, function(value) {
        listeners.forEach(function(listener) {
            listener(value)
        })
    })
    return {
        onDataReady: function(listener) {
            listeners.push(listener)
        }
    }
}

async function testLogic() {
    let text1 = determineStringLength("hello");
    text1.onDataReady(function(data) {
        console.log("Text1 Length of string: " + data)
    })
    
    await sleep(2000); 
    
    let text2 = determineStringLength("hello");
    text2.onDataReady(function(data) {
        console.log("Text2 Length of string: " + data)
    })
}

testLogic();

Pay special attention to the function determineStringLength(). It is a sort of wrapper around the getStringLength() function.

Basically, the determineStringLength() function creates a new object that acts as a notifier for the string length calculation. When the string length is determined by the getStringLength() function, the listener functions registered in determineStringLength() are invoked.

To test this, we have the testLogic() function at the end. It calls the determineStringLength() function twice for the same string “hello”. Between the two calls, we pause execution for 2 seconds using the sleep() function. This is longer than the 1 second delay inside getStringLength(), so the cache is already populated by the time the second call is made.

Running the program provides the below result:

Text1 Length of string: 5

As you can see, the callback for the second operation was never invoked.

  • For text1, the getStringLength() function behaves asynchronously since the data is not available in the cache. Therefore, we were able to register our listener function properly and hence, the output was printed.
  • Next, we have text2, which is created in a later event loop cycle, when the data is already in the cache. This time getStringLength() behaves synchronously, so the callback passed to it is invoked immediately. This in turn invokes all the registered listeners synchronously. However, our listener is registered only after determineStringLength() returns, by which point the notification has already happened, and hence it is never invoked.
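
The second case can be traced in isolation with a small sketch. Here notifyNow() is a made-up stand-in for the cache-hit path of getStringLength(), and the events array is an assumption added only to record the order of operations.

```javascript
// Hypothetical reproduction of the synchronous (cache hit) path.
const events = [];

function notifyNow(callback) {
    callback(5); // stands in for the cached length of "hello"
}

function determine() {
    const listeners = [];
    notifyNow(function (value) {
        // Runs before determine() has returned, so nobody has registered yet
        events.push("notify: " + listeners.length + " listener(s)");
        listeners.forEach((listener) => listener(value));
    });
    return {
        onDataReady(listener) {
            listeners.push(listener);
            events.push("listener registered");
        },
    };
}

determine().onDataReady(function (data) {
    events.push("listener received " + data);
});

// Contains "notify: 0 listener(s)" followed by "listener registered";
// "listener received 5" never appears.
console.log(events);
```

The notification is recorded with zero listeners, and the registration arrives only afterwards, which is exactly why the Text2 line is missing from the output.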

As you can see, the root of this problem is the unpredictable nature of the getStringLength() function. Instead of providing consistency, it makes our program harder to reason about.

It goes without saying that such bugs can become extremely complicated to identify and reproduce in a real application. By all means, this can cause nasty defects and unleash Zalgo in our NodeJS application.

3 – Avoid Zalgo in NodeJS using Deferred Execution

So, how do we actually avoid Zalgo in NodeJS?

It’s actually quite simple. We make sure our functions behave consistently in terms of synchronous vs asynchronous behaviour.

In our somewhat contrived example, we can fix the issue by making the getStringLength() function purely asynchronous for all scenarios.

See below:

function getStringLength(text, callback) {
    if (cache[text]) {
        // Cache hit: defer the callback so that this path is asynchronous too
        process.nextTick(function() {
            callback(cache[text]);
        });
    } else {
        setTimeout(function() {
            cache[text] = text.length;
            callback(text.length);
        }, 1000);
    }
}

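Instead of triggering the callback directly, we wrap it in process.nextTick(). This defers the callback until the current operation has run to completion, before the event loop moves on to its next phase. As a result, the caller always gets a chance to register its listeners first.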
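
As a rough illustration of that timing, the sketch below compares process.nextTick() with setImmediate(). The setImmediate() call is not part of the fix; it is used here only as a reference point for a callback that runs in a later event loop phase. The order array is an assumption added for recording.

```javascript
const order = [];

// Runs as soon as the current synchronous code has finished
process.nextTick(() => order.push("nextTick"));

// Runs later, in the check phase of the event loop
setImmediate(() => order.push("setImmediate"));

order.push("synchronous code");

// Scheduled after the first setImmediate(), so it runs after it
setImmediate(() => console.log(order.join(" -> ")));
// prints "synchronous code -> nextTick -> setImmediate"
```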

If you are confused by the execution of process.nextTick(), I recommend going through this comprehensive post on event loop phases in NodeJS. The post will make things absolutely clear about when a certain callback is executed by the event loop.
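
To check the fix end to end, here is a condensed sketch of the earlier demo with the deferred callback in place. A few things differ from the listings above and are assumptions of this sketch: the cache is a Map (so that strings such as "constructor" are not mistaken for cache hits), the delays are shortened, plain timers replace the sleep() helper, and results are collected in a received array.

```javascript
// Sketch of the fixed flow; cache type, delays and `received` are assumptions.
const cache = new Map();
const received = [];

function getStringLength(text, callback) {
    if (cache.has(text)) {
        // Cache hit: still asynchronous, thanks to process.nextTick()
        process.nextTick(() => callback(cache.get(text)));
    } else {
        setTimeout(() => {
            cache.set(text, text.length);
            callback(text.length);
        }, 100);
    }
}

function determineStringLength(text) {
    const listeners = [];
    getStringLength(text, (value) => {
        listeners.forEach((listener) => listener(value));
    });
    return {
        onDataReady(listener) {
            listeners.push(listener);
        },
    };
}

determineStringLength("hello").onDataReady((data) => {
    received.push("Text1: " + data);
});

// Wait until the first call has filled the cache, then ask again.
setTimeout(() => {
    determineStringLength("hello").onDataReady((data) => {
        received.push("Text2: " + data);
    });
}, 300);

// Report once both calls have had time to finish.
setTimeout(() => console.log(received), 500);
```

With the callback deferred, both listeners should be notified, because registration now always happens before the notification.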

Conclusion

Many times, subtle reasons can cause nasty bugs. Unleashing Zalgo is one of them and hence, the interesting name given to this situation.

As I mentioned earlier, the term was first used in the context of NodeJS by Isaac Z. Schlueter, whose post was in turn inspired by a post on Havoc’s blog.

You can check out those posts to get more background. I hope the example in this post was useful in understanding the issue on a more practical level.

Do share your views in the comments section below.
