NeverSawUs

Not Your Father's Javascript

How to use V8 to write cleaner JS

JavaScript has a reputation; while said reputation has vastly improved over the last year or so because of the introduction of Node.js, it's still not as sterling as it ought to be.

Once you leap the various hurdles (the automatic binding of this and the way the prototype system works being the two major ones), you feel ready to sit down and write some neat JavaScript. Only there's a bit of a problem (especially coming from the golden fields of Python-land): best practices aren't easy to come by. Since there are so many ways of skinning a cat, all equally valid, newcomers often settle for the tried-and-true but still verbose-and-fragile ways of attacking problems. This is exacerbated by the huge amount of technical writing about JavaScript on the internet: most of it is about the DOM, some of it attempts to apply to all implementations of JavaScript, and, if you're writing for a Node.js project, very little of it applies to what you are trying to do.

So, I'm going to attempt to introduce some better approaches to writing JavaScript for recent engines, with a focus on best practices for Node.js. Each section will cover an individual pain point, from low-level to higher-level problems, and may not be entirely limited to Node.js; I'll note when they can be safely applied (or how one can use them) in other environments.


The Basics

Bare minimums of style

A quick aside, first — there is, generally, a style emerging across Node.js packages; I'll keep this as quick as possible:

  • Variable declarations should be grouped at the top of their containing function.
  • Two-space soft tabs.
  • camelCase should be used in naming methods, variables, etc.
  • CapitalCase is preferred for functions that should be invoked as object constructors (e.g., new Foo()).

One exception to the two-space tab rule is in the variable declaration portion (feel free to disagree with me here):

// this reads better
var someVar,
    someOtherVar;

// than this
var someVar,
  someOtherVar;

And yes that's super picky of me.

Functions

When declaring functions:

// either use:
var fn = function() {

};

var Dog = function() {

};

// or:
function fn() {

}

function Dog() {

}

But the important thing is please do not mix them. It screws with readability.

Packages

Packages should look like this:

<name_of_library>/
    README
    package.json
    lib/
        index.js
        <the rest of your files>
    doc/
        <you have documentation, right?>
    test/
        <you have tests, right?>
    bin/
        <if your library provides CLI programs, they should live here.>
    src/
        <if your project contains C++ code, it should live here.>

Generally, you should be able to get away with just having package.json, a README, and lib/ and test/ folders.

The file index.js should export the bulk of your public API. Under the CommonJS packaging rules, when a package directory provides an index.js, clients may import any of the files within that directory explicitly. When importing that directory, the exports object from index.js is used. For example:

// mylib/index.js
var myapi = require('mylib/api');
exports.api = myapi;

// mylib/api.js
exports.someAPIFunction = function() {
  return "hello world";
};

It's encouraged that you refer to other files within your package by their fully qualified path:

// inside the project "plate", under the file "lib/asdf.js":
// use this:
var libraries = require('plate/libraries');

// not this
var libraries = require('./libraries');

It's a lot easier to keep track of what's going on, and it encourages you to expose your library files using index.js — "We're all consenting adults here" (thanks Python).

When it comes to testing, you should pick your poison. Vows.js is particularly popular at the moment, and if that doesn't float your boat, you can always use the plain-jane assert module — var assert = require('assert') — and write your tests that way. The important thing is that they're there.
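As a rough sketch of what a plain assert-based test might look like, here is one for the someAPIFunction example from earlier. The function is inlined as a stand-in so the snippet runs on its own, and runTests is a made-up name; in a real package the file would live under test/ and require the library instead.

```javascript
// Stand-in for the library under test; in a real package this would be
// required from the package instead (names reused from the earlier example).
var api = {
  someAPIFunction: function() {
    return "hello world";
  }
};

// Hypothetical test runner: each assert call throws on failure,
// so reaching the end means every check passed.
function runTests() {
  var assert = require('assert'),
      passed = 0;

  assert.strictEqual(typeof api.someAPIFunction, 'function');
  ++passed;

  assert.strictEqual(api.someAPIFunction(), 'hello world');
  ++passed;

  return passed;
}

console.log(runTests() + ' assertions passed');
```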

There hasn't been much in the way of consensus vis-a-vis how to create your docs, but I'd suggest using Sphinx (yes, even though it's Python) since you get free hosting by way of Read the Docs. Again, even if it's just text files, it's nice for them to be there.


Iterating

What is a good for?

Iteration in JavaScript comes baked into the language in two forms:

for(var key in obj) {}
for(var i = 0; i < len; ++i) {}

These are workable approaches to iteration; they have their place, certainly, and no one should feel bad about using them. However, there are ways to write the above that feel a little less verbose, and perhaps a little less brittle. JavaScript provides a lot of methods for traversing arrays. Let's take a quick look:

var arr = [1, 2, 3, 4];

// this:
for(var i = 0, len = arr.length; i < len; ++i) {
  doSomething(arr[i]);
}

// is equivalent to this.
arr.forEach(function(item, ind) {
  doSomething(item);
});

// which can be further boiled down to this:
arr.forEach(doSomething);

It should feel a little cleaner — if you want (or need) to move your loop logic someplace else where it can, perhaps, be reused, Array.prototype.forEach is awesome for that. Here's another advantage:

// so we write this in one place:
var arr = [1,2,3,4],
    out = [];
for(var i = 0, len = arr.length; i < len; ++i) {
  out.push(function() {
    console.log(i);
  });
}

// and somewhere else in our code we do this:
for(var j = 0, len = out.length; j < len; ++j) {
  out[j]();
}

// it outputs "4 4 4 4"! oh no!

This is that famously sticky problem where functions close over variable references, not values. You could fix this by doing something ugly, like:

for(var i = 0, len = arr.length; i < len; ++i) {
  out.push((function(val) {
    return function() {
      console.log(val);
    };
  })(i));
}

But that doesn't look nice, and you're introducing another function call just to grab the appropriate scope. Let's take a look at using forEach:

var arr = [1,2,3,4],
    out = [];

arr.forEach(function(item, ind) {
  out.push(function() {
    console.log(ind);
  });
});

out.forEach(function(item) {
  item();
});
// and it works as expected: 0 1 2 3

But while we're at it, let's use another Array builtin to really simplify this:

arr.map(function(item, ind) {
  return function() {
    console.log(ind);
  };
}).forEach(function(item) {
  item();
});

// or even better:
arr.map(function(item, ind) {
  return console.log.bind(console, ind);
}).forEach(function(item) {
  item();
});

Array.prototype.map returns a new Array with the result of the callback applied to each element. Another cool thing is that since it returns an array, we can simply chain a call to the result's forEach method to execute each item. In the second example, we boil this down even further, by using Function.prototype.bind, which is immensely powerful (and yet somehow, often ignored)!
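One caveat that may be worth a small sketch: map and forEach always hand the callback three arguments (the item, its index, and the array itself), so passing a function straight through only works when the extra arguments are harmless. parseInt is used below purely as an illustration; the variable names are made up.

```javascript
var strings = ['10', '10', '10'];

// parseInt receives (item, index, array), so the index is treated as the radix.
var surprising = strings.map(parseInt);

// Wrapping the call keeps only the argument we care about.
var expected = strings.map(function(item) {
  return parseInt(item, 10);
});

console.log(surprising);  // [ 10, NaN, 2 ]
console.log(expected);    // [ 10, 10, 10 ]

// map returned new arrays; the source is untouched.
console.log(strings);     // [ '10', '10', '10' ]
```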

The other iteration members of the Array family are documented on the MDC Array page. Current versions of V8 support all of them, reduceRight included; IE8 and earlier support none of them.

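The downside of forEach, map, and filter is that they do not support any analogue of the break statement. The only way to stop them mid-array is to throw an error and catch it outside of the call, which is not exactly the best way to go about it! (Array.prototype.some and Array.prototype.every are the exceptions: they stop as soon as the callback returns a truthy or falsy value, respectively.)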
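For the common case of stopping at the first match, a sketch using Array.prototype.some and Array.prototype.every avoids the exception trick. The arrays used to record visits are made up for illustration.

```javascript
var arr = [1, 2, 3, 4],
    visited = [],
    checked = [];

// some() stops as soon as the callback returns a truthy value,
// so `return true` behaves roughly like `break`.
var found = arr.some(function(item) {
  visited.push(item);
  return item === 2;
});

// every() is the mirror image: it stops at the first falsy return.
var allSmall = arr.every(function(item) {
  checked.push(item);
  return item < 3;
});

console.log(found, visited);     // true [ 1, 2 ]
console.log(allSmall, checked);  // false [ 1, 2, 3 ]
```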

However, whenever possible, using these methods should be preferred: They're more terse, lend themselves well to reusing logic, and perhaps most importantly, custom objects can support these methods (unlike the builtin looping).
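As a sketch of that last point, a custom object only needs to expose a forEach of its own for callers to treat it much like an array. The Playlist constructor and its tracks field are hypothetical.

```javascript
var Playlist = function(tracks) {
  this.tracks = tracks;
};

// Delegate to the underlying array, keeping the familiar
// (item, index, collection) callback signature.
Playlist.prototype.forEach = function(callback, context) {
  this.tracks.forEach(function(track, ind) {
    callback.call(context, track, ind, this);
  }, this);
};

var titles = [];
new Playlist(['Intro', 'Outro']).forEach(function(track, ind) {
  titles.push(ind + ': ' + track);
});

console.log(titles);  // [ '0: Intro', '1: Outro' ]
```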


Keys to the kingdom

Tackling the other loop construct

As we've seen, the iteration methods on Array are extremely useful when available. What about the other loop construct? What's the best way to pull the keys off of an object? The default way gets hairy quickly when you're concerned about keys set on an Object's prototype:

var keys = [];
for(var key in obj) if(obj.hasOwnProperty(key)) {
    keys.push(key);
}

Blegh. There's a better way:

var keys = Object.keys(obj);

This is functionally equivalent to the above, and makes copying objects much easier:

var toObject = {};
Object.keys(from).forEach(function(key) {
    this[key] = from[key];
}.bind(toObject));

Again, we use bind to set the value of this within our callback to toObject. It gets better; if you want only the keys that start with key_:

var toObject = {},
    fromObject = {
        some:1,
        key_value:2
    },
    re = /^key_/; 

Object.keys(fromObject).filter(re.test.bind(re)).forEach(function(key) {
    this[key] = fromObject[key];
}.bind(toObject));

Since Object.keys returns an Array object, we can use the filter method to select only the keys that match the regex. To provide our regex, we use re.test.bind(re) — that returns the RegExp.prototype.test function bound to re. Three lines and we've got a nice way to filter properties while copying!
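Wrapped up as a reusable helper, the same idea might look like the sketch below. copyMatching, Base, and source are hypothetical names; the example also shows that Object.keys skips properties inherited from a prototype.

```javascript
// Copy the own properties of `from` whose names match `re` into a new object.
// Assumes `re` has no "g" flag, since test() on a global regex keeps state.
var copyMatching = function(from, re) {
  var to = {};
  Object.keys(from).filter(re.test.bind(re)).forEach(function(key) {
    this[key] = from[key];
  }.bind(to));
  return to;
};

var Base = function() {};
Base.prototype.key_inherited = 'not copied';

var source = new Base();
source.some = 1;
source.key_value = 2;

console.log(Object.keys(source));            // [ 'some', 'key_value' ]
console.log(copyMatching(source, /^key_/));  // { key_value: 2 }
```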

Object.keys is not supported in IE8 or earlier; you can provide it by shoving in your own:

Object.keys = Object.keys ||
  (function(obj) {
    var out = [];
    for(var key in obj) if(obj.hasOwnProperty(key)) out.push(key);
    return out;
  });

The keys property will not show up in other for(var key in obj) loops, since it is attached to the Object constructor directly and not to its prototype.


Binding is Great

Try not to be afraid of commitment

Function.prototype.bind is wonderful — just wonderful. You may have noticed that I've been sprinkling in a bit of usage in the previous examples; that's because I want you to love it as much as I do. It has exceptional power when it comes to making callback-ridden code more readable. Here's the absolute basics of what it does:

something.bind(<context>, <curriedArg0>, <curriedArg1>, <curriedArgN...>); // -> returns a function whose `this` variable is set to `context`.

A bound function never loses its context object — you could rebind it to another object, but that would be silly — the inner context object always wins. One nice thing, though, is the ability to curry arguments into the function.

var add = function(lhs, rhs) {
    return lhs + rhs;
};

var add2 = add.bind({}, 2);     // we set the context to an empty object because we don't really care about it.
add2(5);                        // returns 7.

Yes, yes, currying is nothing new — but in the context of Node.js, where you're constantly writing functions taking callbacks and callbacks taking yet other callbacks, and generally nesting more than the top ten most endangered birds of North America, this ability is priceless.
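The "inner context always wins" rule is easy to check for yourself. In this sketch, whoAmI and the two objects are made-up names:

```javascript
var whoAmI = function(greeting) {
  return greeting + ', ' + this.name;
};

var alice = { name: 'alice' },
    bob = { name: 'bob' };

var asAlice = whoAmI.bind(alice);

// Rebinding a bound function does not change its context...
var stillAlice = asAlice.bind(bob);

// ...but it can still curry in additional arguments.
var helloAlice = asAlice.bind(bob, 'hello');

console.log(stillAlice('hi'));         // "hi, alice"
console.log(helloAlice());             // "hello, alice"
console.log(asAlice.call(bob, 'hey')); // "hey, alice"
```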

Before we delve too deep, too greedily, like some function-binding crazed dwarves of Moria, let's address a common, simple problem that plagues most folk just getting used to JavaScript:

var Book = function() {
  this.bookURL = '/some-url/';
  this.authorURL = '/cormac-mccarthy/';
};

Book.prototype.loadLibraryData = function() {
  $.getJSON(this.bookURL, function(bookData) {
    $.getJSON(this.authorURL, function(authorData) {
      buildBookAndAuthor(bookData, authorData);
    });
  });
};

var b = new Book();
b.loadLibraryData();        // oh no this doesn't work what fools we are

Losing this: it's a common problem. this inside the above callbacks is not the same as the this that Book.prototype.loadLibraryData started with. There are workarounds, such as assigning var self = this and addressing everything as self afterwards, or alternatively executing a function immediately that takes a value of book, providing it this initially:

(function(book) {
    // now we use book the rest of the way down
})(this);

Let's rewrite it using bind.

Book.prototype.loadLibraryData = function() {
  $.getJSON(this.bookURL, function(bookData) {
    $.getJSON(this.authorURL, function(authorData) {
      buildBookAndAuthor(bookData, authorData);
    }.bind(this));
  }.bind(this));
};

Just by adding }.bind(this) at the bottom of each function, we've preserved the value of this throughout each function! Nice! So bind solves one of the major problems inherent in JavaScript — the this variable.
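To watch the difference without jQuery or a network, here is a sketch that swaps $.getJSON for a hypothetical synchronous stand-in called getJSON. The loadUnbound and loadBound method names are also made up for the comparison.

```javascript
// Hypothetical stand-in for $.getJSON: "fetches" synchronously and
// calls back with a fake payload, without any particular `this`.
var getJSON = function(url, callback) {
  callback.call(null, { from: url });
};

var Book = function() {
  this.bookURL = '/some-url/';
  this.authorURL = '/cormac-mccarthy/';
};

// Without bind: `this` inside the callback is not the Book instance.
Book.prototype.loadUnbound = function(done) {
  getJSON(this.bookURL, function(bookData) {
    done(this instanceof Book);
  });
};

// With bind: `this` survives both levels of callback.
Book.prototype.loadBound = function(done) {
  getJSON(this.bookURL, function(bookData) {
    getJSON(this.authorURL, function(authorData) {
      done(this instanceof Book, bookData.from, authorData.from);
    }.bind(this));
  }.bind(this));
};

var results = [],
    book = new Book();

book.loadUnbound(function(isBook) {
  results.push(isBook);
});
book.loadBound(function(isBook, bookFrom, authorFrom) {
  results.push(isBook, bookFrom, authorFrom);
});

console.log(results);  // [ false, true, '/some-url/', '/cormac-mccarthy/' ]
```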

But we can do better! Let's see if it can do anything about the callback soup, above:

Book.prototype.loadLibraryData = function() {
  var curry = function(url, callback) {
    $.getJSON(url, callback.bind(this, arguments[2]));
  };

  $.getJSON(this.bookURL, 
    curry.bind(this, this.authorURL, buildBookAndAuthor)); 
};

We took out that nest quicker than rampant deforestation. What did we do? We created a helper function that would push received data into the target callback as it became available. That way, when the first $.getJSON call came back with the book data, the helper bound it into buildBookAndAuthor as its first argument, giving buildBookAndAuthor(bookData, ...), and when the second callback came back with the author data, it completed our merry journey. As a bonus, it brought the this context from the originating context (Book.prototype.loadLibraryData) into our buildBookAndAuthor function. Wow! Powerful stuff, man.
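The order in which the arguments arrive is the subtle part, so here is a runnable sketch of the same flow. It assumes a hypothetical synchronous getJSON in place of $.getJSON, and it names the helper's third parameter instead of reading arguments[2], which is only a readability choice.

```javascript
// Hypothetical synchronous stand-in for $.getJSON.
var getJSON = function(url, callback) {
  callback({ from: url });
};

var calls = [];
var buildBookAndAuthor = function(bookData, authorData) {
  calls.push([bookData.from, authorData.from, this instanceof Book]);
};

var Book = function() {
  this.bookURL = '/some-url/';
  this.authorURL = '/cormac-mccarthy/';
};

Book.prototype.loadLibraryData = function() {
  // Ends up called as curry(authorURL, buildBookAndAuthor, bookData):
  // the first two arguments are pre-bound, the third arrives from getJSON.
  var curry = function(url, callback, bookData) {
    getJSON(url, callback.bind(this, bookData));
  };

  getJSON(this.bookURL,
    curry.bind(this, this.authorURL, buildBookAndAuthor));
};

new Book().loadLibraryData();

console.log(calls);  // [ [ '/some-url/', '/cormac-mccarthy/', true ] ]
```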

You can take this further. Let's take a look at an example node.js server function:

var handler = function(request, response) {
  fs.readFile('someFile', function(err, data) {
    if(err) return response.end();
    template.render('some_template', function(err, data) {
      if(err) return response.end();
      response.write(data);
    }, data);
  });
};

Oof. Nesting, repeated logic (if(err) return response.end()), and all we want to do, eventually, is write the template response to the response object. We can do better. Let's put on our binding hats:

var endOnError = function(next, err, data) {
    if(err)
      this.end();
    else
      next(data);
};

var handler = function(request, response) {
    fs.readFile('someFile',
        endOnError.bind(response,
            template.render.bind(template, 'some_template',
                endOnError.bind(response,
                    response.write.bind(response)))));
};

The nesting is still evident, but you can read it like a flow of instructions. Read the file, end if there's an error, otherwise render the template, if there's an error here we should end, too, and finally we write the data to the response. It starts to look a little bit like LISP — which can be a good or bad thing, depending on your point of view.
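To trace that flow end to end without a server, here is a sketch with hypothetical stand-ins: readFile mimics the (err, data) callback of fs.readFile, template.render keeps the (name, callback, data) argument order used above, and makeResponse builds a fake response that records what happens to it.

```javascript
// Hypothetical stand-in for fs.readFile; fails for the path 'missing'.
var readFile = function(path, callback) {
  if(path === 'missing') callback(new Error('no such file'));
  else callback(null, 'contents of ' + path);
};

// Hypothetical template engine with a (name, callback, data) signature.
var template = {
  render: function(name, callback, data) {
    callback(null, '<' + name + '>' + data + '</' + name + '>');
  }
};

// The continuation is named `next` because `continue` is a reserved word.
var endOnError = function(next, err, data) {
  if(err)
    this.end();
  else
    next(data);
};

var handle = function(path, response) {
  readFile(path,
    endOnError.bind(response,
      template.render.bind(template, 'some_template',
        endOnError.bind(response,
          response.write.bind(response)))));
};

// A fake response object that logs every call made on it.
var makeResponse = function() {
  return {
    log: [],
    write: function(data) { this.log.push('write: ' + data); },
    end: function() { this.log.push('end'); }
  };
};

var ok = makeResponse(),
    failed = makeResponse();

handle('someFile', ok);
handle('missing', failed);

console.log(ok.log);      // [ 'write: <some_template>contents of someFile</some_template>' ]
console.log(failed.log);  // [ 'end' ]
```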

The takeaway, here, is that you can wield bind to cut down on code repetition, keep track of that nasty this variable, and mow through nested callbacks like so many summer lawns. It can start to look like a different kind of soup if it's overused, so of course, be careful. It's best used in cases where you are redirecting events from one location to another:

var someEventEmitter,
    someTargetEmitter;

someEventEmitter.on('error', someTargetEmitter.emit.bind(someTargetEmitter, 'error'));
someEventEmitter.on('data', someTargetEmitter.emit.bind(someTargetEmitter, 'data'));
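A runnable sketch of this kind of forwarding, using Node's events module. Note that it is the target's emit method that gets bound, with the event name curried in; source, target, and received are made-up names.

```javascript
var EventEmitter = require('events').EventEmitter;

var source = new EventEmitter(),
    target = new EventEmitter(),
    received = [];

target.on('data', function(chunk) {
  received.push(chunk);
});

// emit is bound to the target with the event name pre-filled;
// whatever arguments `source` emits are appended after it.
source.on('data', target.emit.bind(target, 'data'));

source.emit('data', 'hello');
source.emit('data', 'world');

console.log(received);  // [ 'hello', 'world' ]
```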

It's super powerful, and not at all supported on IE6 and 7. To use it on the browser side, simply add this shim:

Function.prototype.bind ||
(Function.prototype.bind = function(to) {
  var slice = Array.prototype.slice,
      args = slice.call(arguments, 1),
      self = this;
  return function() {
    return self.apply(to, args.concat(slice.call(arguments)));
  };
});

And you'll be good to go.


Standards then

may not be so standard now

So as you can see, there are a lot of bits in the JavaScript standard library that really smooth out kinks in the language.

It should also be noted that these aren't immediately obvious; you have to do a lot of delving through others' code on GitHub to see these things in practice. The problem is that the old ways (while old and janky) still work, and there's a lot of verbiage on the internet dedicated to describing them. Finding the new, better ways is hard to do. Once you find them, though, you never want to give them up.

I'll attempt to cover other semi-obscure bits of the language in other posts down the line.