MongoDB connection pooling for Express applications

May 22, 2017


Express is the most popular Node.js web framework and the fourth most depended-upon package on the npm registry. As a result of its popularity, there is an abundance of tutorials and examples for getting started with new Express apps - we too have created a "getting started" MEAN stack tutorial for the Heroku DevCenter. However, "getting started" apps generally don't show you how to handle the more serious parts of real-world systems.

In this post we are going to demonstrate how a production Node.js application might connect to multiple MongoDB databases. Our example will demonstrate how to create a single connection pool to a MongoDB deployment and how to structure an Express app to reuse that pool across multiple modules.

You can find the code for this example here:
https://github.com/mlab/express-mongodb-setup

What is a connection pool and why is it important?

A connection pool is a cache of authenticated database connections maintained by your driver, from which your application can borrow connections when it needs to run database operations. After each operation is complete, the connection is kept alive and returned to the pool. When properly used, connection pools allow you to minimize the frequency of new connections and the number of open connections to your database.
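To make the borrow-and-return cycle concrete, here is a toy pool in plain JavaScript. It is only an illustration of the idea, not how the MongoDB driver implements pooling; the SimplePool class and its acquire/release methods are hypothetical names, and the "connections" are plain objects.

```javascript
// Toy connection pool: illustrative only, not the MongoDB driver's implementation.
class SimplePool {
  constructor(createConnection, maxSize) {
    this.createConnection = createConnection; // factory for new connections
    this.maxSize = maxSize;                   // upper bound on open connections
    this.idle = [];                           // connections waiting to be borrowed
    this.created = 0;                         // how many connections were ever opened
  }

  // Borrow a connection: reuse an idle one if possible, otherwise open a new one.
  acquire() {
    if (this.idle.length > 0) {
      return this.idle.pop();
    }
    if (this.created >= this.maxSize) {
      // A real pool would typically queue the caller instead of throwing.
      throw new Error('Pool exhausted');
    }
    this.created += 1;
    return this.createConnection(this.created);
  }

  // Return a connection to the pool; it stays open for the next caller.
  release(connection) {
    this.idle.push(connection);
  }
}

// Run three operations one after another through a pool of size 5.
const pool = new SimplePool(function (id) { return { id: id }; }, 5);
for (let i = 0; i < 3; i++) {
  const conn = pool.acquire();
  // ... run a database operation with conn ...
  pool.release(conn);
}
console.log(pool.created); // 1: the same connection was reused each time
```

Only one connection is ever opened here because each operation returns its connection before the next one starts; concurrent operations would cause the pool to grow, up to maxSize.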

It's important to minimize the frequency and number of new connections to the database because creating and authenticating connections to the database is expensive - these processes require both CPU time and memory. When we reuse connections by using a connection pool we avoid this resource cost to the server, and also benefit from lower latency since the app doesn't need to wait for an authentication to finish before a query can be sent.

The default pool size in the Node driver is 5. If your app typically needs more than 5 concurrent connections to the database, you can increase the value of the poolSize setting.
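As a rough sketch, the pool size can be passed in the options object given to MongoClient.connect(). The connectWithPoolSize helper below is hypothetical, and it takes the client as a parameter so the snippet runs without a live database. It assumes the 2.x or 3.x driver, where the option is named poolSize; driver 4.x and later name it maxPoolSize and use a different default.

```javascript
// Hypothetical helper. The client is passed in as a parameter so the sketch
// can be exercised without the mongodb package or a live database.
function connectWithPoolSize(mongoClient, uri, poolSize, callback) {
  // Assumption: `poolSize` is the option name in the 2.x and 3.x drivers;
  // driver 4.x and later call it `maxPoolSize`.
  const options = { poolSize: poolSize };
  mongoClient.connect(uri, options, callback);
}

// Usage in a real app (commented out because it needs a reachable database;
// the URI is a placeholder):
// const MongoClient = require('mongodb').MongoClient;
// connectWithPoolSize(MongoClient, 'mongodb://localhost:27017/test', 10,
//   function (err, db) { /* use db */ });
```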

Common mistake in connection management for Express apps

A common mistake developers make when connecting to the database is to call MongoClient.connect() in every route handler to get a database connection.

For example, code like this:

    app.get('/foo', function(req, res) {
      MongoClient.connect('url', function(err, db) { ... })
    })

    app.get('/bar', function(req, res) {
      MongoClient.connect('url', function(err, db) { ... })
    })

will create a new connection pool (up to 5 connections with the default pool size) for every request, including concurrent requests to the same route. Connection counts can quickly skyrocket with this approach.
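The difference can be approximated with a small simulation. This is a sketch under a simplifying assumption: every call to connect opens a full pool of 5 connections and never closes it. fakeConnect is a stand-in that counts connections instead of opening them.

```javascript
const POOL_SIZE = 5; // default pool size mentioned earlier in this post
const REQUESTS = 100;

// Stand-in for MongoClient.connect: counts connections instead of opening them.
let openConnections = 0;
function fakeConnect() {
  openConnections += POOL_SIZE;
  return { find: function () { return []; } };
}

// Anti-pattern: every request creates its own pool.
for (let i = 0; i < REQUESTS; i++) {
  const db = fakeConnect();
  db.find();
}
const perRequestTotal = openConnections;

// Recommended: connect once at startup and reuse the pool in every request.
openConnections = 0;
const sharedDb = fakeConnect();
for (let i = 0; i < REQUESTS; i++) {
  sharedDb.find();
}
const sharedTotal = openConnections;

console.log(perRequestTotal, sharedTotal); // 500 5
```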

Example for connection pooling in an Express app

For our example we'll implement database connection pooling for the recommended Express project structure (as outlined by the Express app generator).

Our example has the following structure:

    |-- README.md
    |-- app.js
    |-- dbs
        |-- index.js
    |-- routes
        |-- index.js

The app.js file is the main file which starts the app. Inside dbs/index.js is our database connection logic, and inside routes/index.js is our request routing logic.

We use the index.js file naming convention because it lets parent modules import the module by its directory name; Node resolves the index.js file implicitly. For example, later you'll see we require the "dbs" module with require('./dbs'), rather than require('./dbs/index.js').

Set up database connections

We'll start by creating and managing database connections in the dbs/index.js file.

    var async = require('async');
    var MongoClient = require('mongodb').MongoClient;

    // Note: A production application should not expose database credentials in plain text.
    // For strategies on handling credentials, visit 12factor: https://12factor.net/config.
    var PROD_URI = "mongodb://<dbuser>:<dbpassword>@<host1>:<port1>,<host2>:<port2>/<dbname>?replicaSet=<replicaSetName>";
    var MKTG_URI = "mongodb://<dbuser>:<dbpassword>@<host1>:<port1>,<host2>:<port2>/<dbname>?replicaSet=<replicaSetName>";

    var databases = {
      production: async.apply(MongoClient.connect, PROD_URI),
      marketing: async.apply(MongoClient.connect, MKTG_URI)
    };

    module.exports = function (cb) {
      async.parallel(databases, cb);
    };

To use the "dbs" module, you might write some code like this:

    var initDatabases = require('./dbs');

    initDatabases(function(err, dbs) {....});

The "initDatabases" function call initializes database connections and makes the connections accessible via the dbs variable. You could also create a module for other application startup tasks such as connecting to third party APIs or loading data or configuration.
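To show the shape of the dbs value, here is a minimal stand-in for what async.parallel does when given an object of tasks: the results come back as an object with the same keys. parallelObject and the fake connect functions are hypothetical and exist only so the sketch runs without the async or mongodb packages.

```javascript
// Minimal stand-in for async.parallel when given an object of tasks.
// Each task is a function that accepts a Node-style callback (err, result).
function parallelObject(tasks, done) {
  const names = Object.keys(tasks);
  const results = {};
  let pending = names.length;
  let finished = false;

  if (pending === 0) {
    return done(null, results);
  }

  names.forEach(function (name) {
    tasks[name](function (err, result) {
      if (finished) { return; }
      if (err) {
        finished = true;
        return done(err); // the first error ends the whole operation
      }
      results[name] = result;
      pending -= 1;
      if (pending === 0) {
        finished = true;
        done(null, results);
      }
    });
  });
}

// Fake connect functions stand in for async.apply(MongoClient.connect, URI).
const fakeDatabases = {
  production: function (cb) { cb(null, { name: 'production-db' }); },
  marketing: function (cb) { cb(null, { name: 'marketing-db' }); }
};

parallelObject(fakeDatabases, function (err, dbs) {
  console.log(Object.keys(dbs)); // [ 'production', 'marketing' ]
});
```

With the real module, dbs.production and dbs.marketing would each hold the database handle produced by the corresponding MongoClient.connect call.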

Note: Following 12-factor methodology, rather than storing connection strings in code you should use config variables.
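One possible way to do this is to read the connection strings from environment variables and fail fast when one is missing. The variable names PROD_MONGODB_URI and MKTG_MONGODB_URI and the getDatabaseUris helper are assumptions for illustration; use whatever names your deployment platform provides.

```javascript
// Hypothetical helper: reads connection strings from config variables.
// The variable names are assumptions, not part of the example repository.
function getDatabaseUris(env) {
  const required = ['PROD_MONGODB_URI', 'MKTG_MONGODB_URI'];
  const missing = required.filter(function (name) { return !env[name]; });
  if (missing.length > 0) {
    throw new Error('Missing config variables: ' + missing.join(', '));
  }
  return {
    production: env.PROD_MONGODB_URI,
    marketing: env.MKTG_MONGODB_URI
  };
}

// In dbs/index.js this would be called as getDatabaseUris(process.env).
// Placeholder values are used here so the sketch runs on its own.
const uris = getDatabaseUris({
  PROD_MONGODB_URI: 'mongodb://localhost:27017/prod',
  MKTG_MONGODB_URI: 'mongodb://localhost:27017/mktg'
});
console.log(Object.keys(uris)); // [ 'production', 'marketing' ]
```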

Reuse database connections in routes (or other app) files

With the logic written for establishing connections at app startup, we now need to structure the routing logic so that its handlers can use those existing connections.

We need a strategy in code to reuse database connections in the "routes" module (routes/index.js file). We can't just require the "dbs" module and call its initializer again in this file, as that would create a second set of connection pools. We also need a reference to the Express app created in app.js, but we can't use require('../app.js') because app.js requires the "routes" module and circular dependencies should be avoided wherever possible.

Instead, our module will expose a function which takes the Express app and databases as parameters. The parent module that imports this file can then be responsible for finding those values and making them accessible to the module. The technical name for this pattern is "Dependency Injection", as a parent module injects the values that another module depends on. Let's take a look at how it works in practice:

    module.exports = function(app, dbs) {

      app.get('/production', function(req, res) {
        dbs.production.collection('test').find({}).toArray(function (err, docs) {
          if (err) {
            console.log(err);
            // Express has no res.error(); send a 500 response instead.
            res.status(500).send(err.message);
          } else {
            res.json(docs);
          }
        });
      });

      app.get('/marketing', function(req, res) {
        dbs.marketing.collection('test').find({}).toArray(function (err, docs) {
          if (err) {
            console.log(err);
            res.status(500).send(err.message);
          } else {
            res.json(docs);
          }
        });
      });

      return app;
    };

The route definitions are established in exactly the same way as other Express apps, but the app and the database references are injected by the parent module when the module is loaded.
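A side benefit of injecting the dependencies is that the routes can be exercised without Express or a running database. The sketch below is modelled on the "routes" module, trimmed to one route; fakeApp and fakeDbs are hypothetical stand-ins that mimic only the methods the handler calls.

```javascript
// Modelled on routes/index.js, trimmed to one route for brevity.
function routes(app, dbs) {
  app.get('/production', function (req, res) {
    dbs.production.collection('test').find({}).toArray(function (err, docs) {
      if (err) {
        res.status(500).json({ error: err.message });
      } else {
        res.json(docs);
      }
    });
  });
  return app;
}

// Fake Express app: records handlers by path instead of starting a server.
const handlers = {};
const fakeApp = {
  get: function (path, handler) { handlers[path] = handler; }
};

// Fake database: mimics only the collection().find().toArray() chain used above.
const fakeDbs = {
  production: {
    collection: function () {
      return {
        find: function () {
          return {
            toArray: function (cb) { cb(null, [{ _id: 1 }]); }
          };
        }
      };
    }
  }
};

routes(fakeApp, fakeDbs);

// Invoke the registered handler with a fake response object.
let sent = null;
handlers['/production']({}, { json: function (body) { sent = body; } });
console.log(sent); // [ { _id: 1 } ]
```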

Tie the "routes" and "dbs" modules together in the main app module

We can understand how the "routes" and "dbs" modules are tied together by looking at our app.js code. Our app code:

  1. Initializes an Express app
  2. Uses the "dbs" module to create connection pools to our databases
  3. Passes both the "app" and "dbs" objects into the "routes" module
  4. Starts the app on port 3000 only after the "routes" module configures the app's route handlers

    var express = require('express');
    var app = express();

    var initializeDatabases = require('./dbs');
    var routes = require('./routes');

    initializeDatabases(function(err, dbs) {
      if (err) {
        console.error('Failed to make all database connections!');
        console.error(err);
        process.exit(1);
      }

      // Initialize the application once database connections are ready.
      routes(app, dbs);

      app.listen(3000, function() {
        console.log('Listening on port 3000');
      });
    });
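The startup ordering can be checked in isolation with fakes. In this sketch the start helper is hypothetical: it follows the same sequence as app.js but receives its dependencies as arguments, and the fakes record the order in which each step runs.

```javascript
// Hypothetical start() helper: same ordering as app.js, with dependencies passed in.
function start(app, initializeDatabases, routes, onFatal) {
  initializeDatabases(function (err, dbs) {
    if (err) {
      return onFatal(err); // app.js logs the error and calls process.exit(1) here
    }
    routes(app, dbs);  // register the route handlers first
    app.listen(3000);  // then start accepting traffic
  });
}

// Fakes that record the order in which each step runs.
const steps = [];
const fakeApp = { listen: function (port) { steps.push('listen:' + port); } };
function fakeInit(cb) { steps.push('connect'); cb(null, {}); }
function fakeRoutes() { steps.push('routes'); }

start(fakeApp, fakeInit, fakeRoutes, function () { steps.push('fatal'); });
console.log(steps); // [ 'connect', 'routes', 'listen:3000' ]
```

If the database initializer reports an error, neither the routes nor the listener are set up, so the app never serves requests without its connections.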

Using this structure, our app will maintain and reuse a single connection pool to each database. This reduces pressure on the database caused by creating and authenticating new connections or by maintaining multiple unused connection pools. As an added performance and latency benefit, reusing existing connection pools allows the application to quickly retrieve database results without having to go through the connection creation and authentication process each time.

If you have questions or thoughts please email our team at support@mlab.com for help!

