Node and Express Tips

At Segment.io, we’ve been using node.js with the express framework for about 8 months now. We switched over from using the java play! framework and I’ve found express to be much more elegant. It’s simple, doesn’t prescribe too much, and is way less verbose than the same java code.

Over that time, I’ve discovered a few patterns and conventions which make my code significantly cleaner and easier to follow. Here they are.

Code Structure

As a general rule, make everything a module.

Node makes it so easy to bundle files and folders into a single well-defined interface - there’s absolutely no reason not to do it. If you ever find yourself writing a ‘one-off’ file which contains some pieces of code that are separate from everything else in the folder, think carefully about why you shouldn’t make it a module. Chances are good that in the future you’ll want to call that code from somewhere else.

Modules

We commit our node modules to git since we depend on them anyway. Unless you’re developing something for NPM and not custom logic, you’ll want your own custom versions and specific modules. These can be symlinked from other parts of your source tree allowing you to reuse modules wherever you like.

This is handy for several reasons, one being that once a team member has checked out the repo from git they are completely ready to go. If a developer realizes they need a new version of a library, they can test and commit it without worrying about having the other devs update their versions.

Additionally, it greatly simplifies your require statements, because node is now the one looking for modules instead of you. Instead of including modules by complicated pathing like require('../../../shared/utils'), it’s much simpler to require('utils'). Utils can be symlinked from another folder in the source tree so that you can continue to actively develop it and the changes will be reflected in the dependent modules.
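Here’s a minimal sketch of the symlink idea, run against a throwaway temp directory. The folder layout (shared/utils and app/node_modules) and the slug helper are made up for illustration, and Module.createRequire needs a reasonably recent version of node:

```javascript
var fs = require("fs");
var os = require("os");
var path = require("path");
var Module = require("module");

// Build a throwaway source tree in a temp directory (all paths are hypothetical).
var root = fs.mkdtempSync(path.join(os.tmpdir(), "symlink-demo-"));
var sharedUtils = path.join(root, "shared", "utils");
var appModules = path.join(root, "app", "node_modules");

fs.mkdirSync(sharedUtils, { recursive: true });
fs.mkdirSync(appModules, { recursive: true });

// The shared module lives in one place and keeps being developed there.
fs.writeFileSync(
  path.join(sharedUtils, "index.js"),
  "exports.slug = function (s) { return s.toLowerCase().replace(/\\s+/g, '-'); };\n"
);

// Symlink it into the app's node_modules so node can find it by name.
fs.symlinkSync(sharedUtils, path.join(appModules, "utils"), "dir");

// Resolve modules as if we were a file inside app/.
var requireFromApp = Module.createRequire(path.join(root, "app", "main.js"));
var utils = requireFromApp("utils"); // no '../../../shared/utils' needed

var slug = utils.slug("Node and Express");
```

Because the entry in node_modules is only a link, edits to shared/utils show up in every module that requires it.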

Controllers

I find it easiest to have an entire module for controllers, with each file grouping a certain amount of similar controllers. For instance, our controller module looks something like


    / controllers
    |--- auth.js
    |--- registration.js
    |--- settings.js
    | ...

This is a pretty standard setup. It’s close to how both django and play encourage you to set up their respective versions of controllers, so no surprises there.

In writing actual controller methods, you want to have as little logic as possible. Controllers should primarily be used to validate and parse your input. Once they have all the required pieces they call your modules and render a template with the result.
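As a rough sketch of what a thin controller can look like: the users module, the template names, and the validation rule below are all hypothetical stand-ins, not code from our app.

```javascript
// Hypothetical "users" module; in a real app this would be require("users").
var users = {
  create: function (email, callback) {
    callback(null, { email: email });
  }
};

// Controller: validate and parse the input, call the module, render the result.
// In a controller file this would be attached to exports.
function register(req, res, next) {
  var email = String((req.body && req.body.email) || "").trim().toLowerCase();

  // Validation failure: re-render the form, no application logic involved.
  if (email.indexOf("@") === -1) {
    return res.render("register.jade", { error: "Please enter a valid email" });
  }

  // All the real work happens inside the module.
  users.create(email, function (err, user) {
    if (err) return next(err);
    res.render("welcome.jade", { user: user });
  });
}
```

The controller never touches the database itself, so the same users.create call can be reused from a script or a migration.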

If your modules perform the majority of the application code, you can do things like write database migrations directly from node. You can leverage all the code you’ve previously written simply by requiring your modules. Major win!

Middleware and Routing

Another question when structuring controllers is where to put the middleware. I like the django and play! methods of keeping all the different route matchings in a single routes module or file, so I do the same in express. The interesting thing with express routes is that middleware for each route can be passed to the route declaration:

app.get("/", middlewareA, middlewareB, controllerMethod);

The alternative is to have middleware match routes based upon a regex. I don’t like this solution nearly as much because I find the url structure to be less explicit.

The main problem with the first approach is that it’s difficult to determine which middleware has already been run when looking at the controller code. How will we know whether we have access to the User object inside the controller function? To solve this, I like to explicitly declare middleware inside the controller as an export. Each route then directly references the middleware from the controller. Let’s use the example of an activity feed - here’s the controller:

var middleware = require("middleware"), // yeah, middleware's a module too!
  feed = require("feed");

exports.middleware = function (app) {
  // Return an array so the route can hand every middleware to express.
  return [middleware.session(app), middleware.auth(app), middleware.user(app)];
};

exports.get = function (req, res, next) {
  var user = req.session.user;

  feed.load(user, function (err, result) {
    if (err) next(err);
    else res.render("feed.jade", { feed: result });
  });
};

I like this approach because when looking at the controller code, it’s very easy to see what is going on. The middleware is explicitly declared right above, so we have a good idea of what variables we have access to. The routes are simplified as well:

var controllers = require("controllers");

var feed = controllers.feed,
  auth = controllers.auth;

app.get("/feed", feed.middleware(app), feed.get);

Error Handling

As a final piece of middleware and routing, error handling is sometimes a difficult topic. The express documentation recommends that you have a final piece of middleware which handles errors, which is a good idea. I keep this error middleware simple: it gets passed an error and either renders a page or passes back a JSON response.

exports.error = function (err, req, res, next) {
  err.status = err.status || 500;
  err.message = err.message || "Internal server error";
  err.code = err.code || "INTERNAL_ERROR";

  if (req.xhr) {
    // If AJAX, send back json response.
    ajaxHandler(err, req, res, next);
  } else {
    // Else, give appropriate error page.
    pageHandler(err, req, res, next);
  }
};

The nice thing about this is that the middleware itself doesn’t care very much about the error it gets passed. It serves a response depending upon what sort of code and status it gets back.
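The two handlers aren’t shown above. One possible shape for them, assuming Express 4-style res.status() chaining and a made-up error.jade template, might be:

```javascript
// Hypothetical handlers for the error middleware. By the time they run,
// err.status, err.message and err.code have already been given defaults.

function ajaxHandler(err, req, res, next) {
  // AJAX callers get a small JSON body they can branch on.
  res.status(err.status).json({ code: err.code, message: err.message });
}

function pageHandler(err, req, res, next) {
  // Browsers get a rendered page; "error.jade" is an assumed template name.
  res.status(err.status).render("error.jade", {
    status: err.status,
    message: err.message
  });
}
```

Neither handler inspects what kind of error it received. They only read the status, code and message that were set on it.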

The error details come from a custom errors module, which defines the errors I want to pass back. Defining custom errors is pretty simple once you get the hang of it. Better still, you can categorize the status and error codes in your modules and then pass in a custom message depending upon the exact circumstances of the error.
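A custom errors module along these lines could be sketched as follows. The names AppError, NotFound and findUser are invented for the example, and the real module may well look different:

```javascript
var util = require("util");

// Base class: carries the status and code that the error middleware reads.
function AppError(status, code, message) {
  Error.captureStackTrace(this, this.constructor);
  this.name = this.constructor.name;
  this.status = status;
  this.code = code;
  this.message = message;
}
util.inherits(AppError, Error);

// One category of error: the status and code are fixed, the message is not.
function NotFound(message) {
  AppError.call(this, 404, "NOT_FOUND", message || "Not found");
}
util.inherits(NotFound, AppError);

// Hypothetical application code: it knows exactly what went wrong,
// so it picks the error type and writes the message.
var store = { 1: { id: 1, name: "calvin" } };

function findUser(id, callback) {
  var user = store[id];
  if (!user) return callback(new NotFound("No user with id " + id));
  callback(null, user);
}
```

A controller can then hand the error straight to next(err), and the error middleware ends up with a 404 and a useful message without knowing anything about users.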

This has the very nice property that the application code dealing with the error decides how the error is passed back. This is the most logical place to generate the error, since the calling function has the most information about what went wrong. The error gets passed through callbacks and next() to the error middleware, where the middleware has a few methods to deal with it sanely depending on the information found in the error.

Code Style

My guidelines for code style are pretty close to the Zen of Python, with a few major points in particular.

Explicit is better than implicit. Flat is better than nested. Readability counts. —The Zen of Python

In node code, explicitly defining code paths and keeping methods fairly flat are extra important. Otherwise, it’s very easy to end up with tangled code and callbacks all over the place.

Explicit is better than implicit

Perhaps my biggest guiding principle is explicit over implicit. I find it applies in all sorts of places, and I almost never regret code which specifies what it does rather than doing something clever.

  • Don’t add methods via a string
  • Use documented APIs wherever possible. Added functionality must be documented
  • For any exported function, there must be documentation
  • Define exports and variable declarations in a single place

Making methods explicit is actually fairly easy to do, given the right tools. If you want to indicate that a function is public, set it in the exports right at the function signature. Private methods need only use a var statement instead. If you use Sublime Text, the poorly named yet nicely functional DocBlockr will autogenerate block comments for your methods.

/**
 * Create a new user
 * @param {String} email
 * @param {Object} options      (optional)
 *     @option {String} id
 *     @option {String} username
 *     @option {Date}   created
 * @param {Function} callback   (err, user)
 */
var create = exports.create = function (email, options, callback) {

    if (typeof options === 'function') {
        callback = options;
        options = {};
    }
    // ...
};

This function seems relatively straightforward, but the fact that the options are explicitly declared really makes it easy to use without digging through code. Just looking at the signature, I can tell that this function is both publicly exported and used within the module.

I used to put exports all at the bottom of the file, but I’ve since changed my mind. Having a single place where the export refers to the function means that it will never be inconsistent if the function is renamed internally.

Flat is better than nested

I find this one has to be followed with almost militaristic enforcement. If I have a function with more than 2 nested callbacks inside it, it must be split into smaller functions. Most of the time these are just part of a simple flow control for performing a sequence of actions. If that’s the case, I use caolan’s async node module. It makes it very easy to specify a series of async calls - and you get the added bonus of using function names to specify what’s going on:

async.parallel([

    function loadUser(callback) {
        db.users.get(user.id, callback);
    },

    function getFriends(callback) {
        db.friends.get(user.id, callback);
    }

],  function (err, results) {
    // Do something
});

Best of all, each of the async functions immediately passes its error to the last specified callback. We only have to handle errors in a single place, though if necessary the individual functions can do some error handling on their own first.
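To make that error behavior concrete, here is a tiny hand-rolled stand-in for a parallel helper. It is not the async library’s implementation, only a sketch of the contract described above, and loadUser and getFriends are dummy tasks:

```javascript
// Minimal stand-in for a "parallel" helper (NOT the async module's code).
// Runs every task; the first error goes straight to the final callback.
function parallel(tasks, done) {
  var results = [],
      pending = tasks.length,
      finished = false;

  if (pending === 0) return done(null, results);

  tasks.forEach(function (task, i) {
    task(function (err, result) {
      if (finished) return; // ignore anything after the final callback ran
      if (err) {
        finished = true;
        return done(err);
      }
      results[i] = result; // keep results in task order
      if (--pending === 0) {
        finished = true;
        done(null, results);
      }
    });
  });
}

// Dummy tasks: one succeeds, one fails.
function loadUser(callback) {
  callback(null, { id: 1 });
}

function getFriends(callback) {
  callback(new Error("friends db is down"));
}

var doneCalls = [];
parallel([loadUser, getFriends], function (err, results) {
  doneCalls.push({ err: err, results: results }); // single place to handle errors
});
```

The named task functions stay flat and readable, and the only error handling lives in the final callback.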

View the callbacks in node as encouragement to write more modular code. Don’t fall into a mess of nested logic!

Readability counts

It’s hard to define what exactly makes code easy to read. Naming is possibly the most underappreciated part of it, but layout and structure are important as well.

Naming is hard to do in any language, and takes a lot of practice. In my modules I try to avoid camel case and compound names in favor of namespaced names. To see what I mean, I’ll give a toy example from our database:

exports.getUser = function (id, callback) {}; // redundant name

exports.get = function (id, callback) {}; // clearer name

Our old call to get a user was db.user.getUser(id, function), but we already know we are accessing the user module! We can get rid of the last user bit and instead make a call to db.user.get. This seems like a small change - but it’s one which makes the code feel much easier to read. If possible, add modules and namespaces instead of compound function names.

Other things I like to do:

  • align objects and var declarations
  • early returns to avoid large if…else… blocks
  • combine var declarations for ‘like variables’
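A small sketch of those three habits together. The parseLimit helper and its limits are invented for the example:

```javascript
// Combined, aligned declarations for 'like variables'.
var DEFAULT_LIMIT = 20,
    MAX_LIMIT     = 100;

// Parse a ?limit= query value, using early returns instead of nested if/else.
function parseLimit(query) {
  var raw   = query && query.limit,
      limit = parseInt(raw, 10);

  if (raw === undefined) return DEFAULT_LIMIT;
  if (isNaN(limit))      return DEFAULT_LIMIT;
  if (limit < 1)         return 1;
  if (limit > MAX_LIMIT) return MAX_LIMIT;

  return limit;
}
```

Each special case exits on its own line, so the normal path at the bottom reads without any nesting.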

Take it away

It’s pretty easy to see that node and express are really flexible and can lend themselves to a wide variety of programming styles. My approach tends to be more rooted in the MVC way of thinking, but there’s no reason that express can’t be adapted to many different patterns.

I’d love to hear a bit about your own tips and best practices as well: you can hit me up at @calvinfo, or shoot me an email at calvin at this domain.

Want to see what else I’m working on? Check out segment.io for some analytics-goodness.
