Cloud Foundry Blog

Scaling Real-time Apps on Cloud Foundry Using Node.js and RabbitMQ

In the previous blog, Scaling Real-time Apps on Cloud Foundry Using Node.js and Redis, we used Redis as a ‘session store’ and also as a ‘pub-sub’ service for chat messages. But in many enterprise-grade real-time apps, you may want to use RabbitMQ instead of Redis for pub-sub because of the reliability and features that come out of the box with RabbitMQ. This is especially true for financial or banking apps, such as stock quote apps, where it is critical to protect and deliver every message and to do so as quickly as possible.

So, in this blog, we will start from Scaling Real-time Apps on Cloud Foundry Using Node.js and Redis and simply replace Redis with RabbitMQ pubsub.

The app architecture (before):

The app architecture w/ RabbitMQ (after):


Introduction to RabbitMQ

The Node.js community may not be familiar with RabbitMQ, so here is a high-level introduction.

RabbitMQ is a message broker. It simply accepts messages from one or more endpoints (“Producers”) and sends them to one or more endpoints (“Consumers”).

RabbitMQ is more sophisticated and flexible than just that. Depending on the configuration, it can also figure out what needs to be done when a consumer crashes (store and re-deliver the message), when a consumer is slow (queue messages), when there are multiple consumers (distribute the workload), or even when RabbitMQ itself crashes (durability). For more, please see the RabbitMQ tutorials.

RabbitMQ is also very fast & efficient. It implements the Advanced Message Queuing Protocol (“AMQP”), which was built by and for Wall Street firms like J.P. Morgan Chase, Goldman Sachs, etc. for trading stocks and related activities. RabbitMQ is an Erlang (also well-known for concurrency & speed) implementation of that protocol.

For more please go through RabbitMQ’s website.


Fundamental pieces of RabbitMQ

RabbitMQ has 4 pieces.

  1. Producer (“P”) – Sends messages to an exchange along with a “Routing key” indicating how to route the message.
  2. Exchange (“X”) – Receives messages and Routing keys from Producers and figures out what to do with each message.
  3. Queue (“Q”) – A temporary place where messages are stored, based on the Queue’s “Binding key”, until a consumer is ready to receive them. Note: While a Queue physically resides inside RabbitMQ, a consumer (“C”) is typically the one that creates it and binds it to an exchange by providing a “Binding Key”.
  4. Consumer (“C”) – Subscribes to a Queue to receive messages.

Routing Key, Binding Key and types of Exchanges

To allow various work-flows like pub-sub, work queues, topics, RPC etc., RabbitMQ allows us to independently configure the type of the Exchange, Routing Key and Binding Key.

Routing Key:

A string/constraint from the Producer instructing the Exchange how to route the message. A Routing key looks like: “logs”, “errors.logs”, “warnings.logs”, “tweets” etc.

Binding Key:

Another string/constraint, added by a Consumer to the queue it is binding to and listening on. A Binding key looks like: “logs”, “*.logs”, “#.logs” etc.

Note: In RabbitMQ, Binding keys can have “patterns” (but not Routing keys).

Types of Exchange:

Exchanges can be of 4 types:

  1. Direct – Sends messages from producer to consumer if Routing Key and Binding key match exactly.
  2. Fanout – Sends any message from a producer to ALL consumers (i.e., it ignores both the routing key & binding key).
  3. Topic – Sends a message from producer to consumer based on pattern-matching.
  4. Headers – If more complicated routing is required beyond simple Routing key string, you can use headers exchange.

In RabbitMQ, the combination of the Exchange type, Routing Key and Binding Key makes it behave completely differently. For example: A fanout Exchange ignores the Routing Key and Binding Key and sends messages to all queues. A Topic Exchange sends a copy of a message to zero, one or more consumers based on RabbitMQ patterns (#, *).
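To make the matching rules concrete, here is a minimal sketch of how the three key-based exchange types decide whether a queue receives a message. It is a simplified illustration and not RabbitMQ's implementation; topicMatches and routes are hypothetical names. It assumes the usual topic semantics, where * stands for exactly one dot-separated word and # stands for zero or more words. Headers exchanges are left out.

```javascript
// Illustrative only: a simplified matcher, not RabbitMQ's implementation.
// '*' matches exactly one word, '#' matches zero or more words.
function topicMatches(bindingKey, routingKey) {
    var pattern = bindingKey.split('.');
    var words = routingKey.split('.');

    function match(p, w) {
        if (p === pattern.length) {
            // Pattern is used up: it matches only if the words are used up too.
            return w === words.length;
        }
        if (pattern[p] === '#') {
            // '#' may swallow zero words (skip it) or one more word (stay on it).
            return match(p + 1, w) || (w < words.length && match(p, w + 1));
        }
        if (w === words.length) {
            return false;
        }
        return (pattern[p] === '*' || pattern[p] === words[w]) && match(p + 1, w + 1);
    }

    return match(0, 0);
}

// Decide whether a queue bound with bindingKey receives a message sent with routingKey.
function routes(exchangeType, bindingKey, routingKey) {
    if (exchangeType === 'fanout') return true;                      // keys are ignored
    if (exchangeType === 'direct') return bindingKey === routingKey; // exact match
    if (exchangeType === 'topic') return topicMatches(bindingKey, routingKey);
    throw new Error('Unsupported exchange type: ' + exchangeType);
}
```

With these rules, a queue bound with “*.logs” receives “errors.logs” but not “app.errors.logs”, while a queue bound with “#.logs” receives both.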

Going into more details is beyond the scope of this blog, but here is another good blog that goes into more details: AMQP 0-9-1 Model Explained


Using RabbitMQ to do pub-sub in our Node.js chat app.

Now that we know some of the basics of RabbitMQ and its four pieces, let’s see how to actually use it in our Chat app.

Chat App:

Connecting to RabbitMQ and creating an Exchange

For our chat application, we will create a fanout exchange called chatExchange. We will use the node-amqp module to talk to the RabbitMQ service.

//Load the node-amqp module.
var amqp = require('amqp');

//Connect to RabbitMQ and get reference to the connection.
var rabbitConn = amqp.createConnection({});

//Create an exchange with a name 'chatExchange' and of type 'fanout'
var chatExchange;
rabbitConn.on('ready', function () {
    chatExchange = rabbitConn.exchange('chatExchange', {'type': 'fanout'});
});
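With node-amqp's defaults, an empty options object targets a broker on localhost. If you prefer to pass the broker address explicitly, the sketch below shows one way to look it up. It rests on assumptions that you should check against your own environment: that the VCAP_SERVICES environment variable holds JSON keyed by service label (for example rabbitmq-2.4), and that each bound service exposes credentials.url. rabbitUrl is a hypothetical helper name.

```javascript
// Hypothetical helper: find the AMQP URL of the first bound RabbitMQ service.
// Assumes VCAP_SERVICES is JSON keyed by service label, with credentials.url set.
function rabbitUrl(env) {
    var fallback = 'amqp://localhost'; // used when running outside Cloud Foundry
    if (!env.VCAP_SERVICES) {
        return fallback;
    }
    var services = JSON.parse(env.VCAP_SERVICES);
    var labels = Object.keys(services).filter(function (label) {
        return label.indexOf('rabbitmq') === 0;
    });
    if (labels.length === 0 || services[labels[0]].length === 0) {
        return fallback;
    }
    return services[labels[0]][0].credentials.url;
}
```

The result could then be handed to the connection as amqp.createConnection({url: rabbitUrl(process.env)}).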

Creating Producers (So Users can send chat messages)

In our chat app, users are both producers (i.e., they send chat messages to others) and consumers (i.e., they receive messages from others). Let’s focus on users being ‘producers’.

When a user sends a chat message, publish it to chatExchange w/o a Routing Key (Routing Key doesn’t matter because chatExchange is a ‘fanout’).

/**
 * When a user sends a chat message, publish it to chatExchange w/o a Routing Key (Routing Key doesn't matter
 * because chatExchange is a 'fanout').
 *
 * Notice that we are getting user's name from session.
 */
socket.on('chat', function (data) {
    var msg = JSON.parse(data);
    var reply = {action: 'message', user: session.user, msg: msg.msg };
    chatExchange.publish('', reply);
});

Similarly, when a user joins, publish it to chatExchange w/o Routing key.

/**
 * When a user joins, publish it to chatExchange w/o Routing key (Routing doesn't matter
 * because chatExchange is a 'fanout').
 *
 * Note that we are getting user's name from session.
 */
socket.on('join', function () {
    var reply = {action: 'control', user: session.user, msg: ' joined the channel' };
    chatExchange.publish('', reply);
});
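One thing to keep in mind with the 'chat' handler is that JSON.parse throws on malformed input, and the data comes straight from the browser. A minimal sketch of a more defensive version follows. buildChatReply is a hypothetical helper, not part of the original app; it returns the object to publish, or null when the payload should be ignored.

```javascript
// Hypothetical helper: returns the message to publish, or null if the payload is unusable.
function buildChatReply(data, userName) {
    var parsed;
    try {
        parsed = JSON.parse(data);
    } catch (e) {
        return null; // malformed JSON from the client
    }
    if (!parsed || typeof parsed.msg !== 'string') {
        return null; // valid JSON, but not the shape the chat app expects
    }
    return {action: 'message', user: userName, msg: parsed.msg};
}
```

Inside the handler, the reply would only be published when it is not null, for example: var reply = buildChatReply(data, session.user); if (reply) { chatExchange.publish('', reply); }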

Creating Consumers (So Users can receive chat messages)

Creating consumers involves 3 steps:

  1. Create a queue with some options.
  2. Bind queue to exchange using some “Binding Key”
  3. Create a subscriber (usually a callback function) to actually obtain messages sent to the queue.

For our chat app,

  1. Let’s create a queue w/o any name. This forces RabbitMQ to create a new queue with a random name for every socket.io connection. Let’s also set the exclusive flag to ensure only this consumer can access the messages from this queue.
rabbitConn.queue('', {exclusive: true}, function (q) {
 ..
 });
  2. Then bind the queue to chatExchange with an empty ‘Binding key’ and listen to ALL messages.
q.bind('chatExchange', "");
  3. Lastly, create a consumer (via q.subscribe) that waits for messages from RabbitMQ. When a message comes, send it to the browser.
q.subscribe(function (message) {
   //When a message comes, send it back to browser
   socket.emit('chat', JSON.stringify(message));
 });

Putting it all together.

sessionSockets.on('connection', function (err, socket, session) {
    /**
     * When a user sends a chat message, publish it to chatExchange w/o a Routing Key (Routing Key doesn't matter
     * because chatExchange is a 'fanout').
     *
     * Notice that we are getting user's name from session.
     */
    socket.on('chat', function (data) {
        var msg = JSON.parse(data);
        var reply = {action: 'message', user: session.user, msg: msg.msg };
        chatExchange.publish('', reply);
    });

   /**
     * When a user joins, publish it to chatExchange w/o Routing key (Routing doesn't matter
     * because chatExchange is a 'fanout').
     *
     * Note: that we are getting user's name from session.
     */
    socket.on('join', function () {
        var reply = {action: 'control', user: session.user, msg: ' joined the channel' };
        chatExchange.publish('', reply);
    });


   /**
     * Initialize subscriber queue.
     * 1. First create a queue w/o any name. This forces RabbitMQ to create new queue for every socket.io connection w/ a new random queue name.
     * 2. Then bind the queue to chatExchange  w/ "#" or "" 'Binding key' and listen to ALL messages
     * 3. Lastly, create a consumer (via .subscribe) that waits for messages from RabbitMQ. And when
     * a message comes, send it to the browser.
     *
     * Note: we are creating this w/in sessionSockets.on('connection'..) to create NEW queue for every connection
   */
    rabbitConn.queue('', {exclusive: true}, function (q) {
        //Bind to chatExchange w/ "#" or "" binding key to listen to all messages.
        q.bind('chatExchange', "");

        //Subscribe: when a message comes, send it back to the browser
        q.subscribe(function (message) {
            socket.emit('chat', JSON.stringify(message));
        });
    });
 });
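To see why one queue per connection gives every browser its own copy of each message, here is a small in-memory stand-in for the fanout exchange. It is only a sketch for reasoning about the flow: FakeFanoutExchange and bindQueue are hypothetical names and are not part of node-amqp, and no broker is involved.

```javascript
// In-memory stand-in for a fanout exchange; hypothetical, not the node-amqp API.
function FakeFanoutExchange() {
    this.queues = [];
}

// Mirrors rabbitConn.queue('', {exclusive: true}, ...) followed by q.bind(...):
// each connection gets its own queue, represented here by its subscriber callback.
FakeFanoutExchange.prototype.bindQueue = function (subscriber) {
    this.queues.push(subscriber);
};

// A fanout exchange ignores the routing key and copies the message to every queue.
FakeFanoutExchange.prototype.publish = function (routingKey, message) {
    this.queues.forEach(function (subscriber) {
        subscriber(message);
    });
};

// Two "browsers", each with its own queue, collect what they would be sent.
var exchange = new FakeFanoutExchange();
var browserA = [];
var browserB = [];
exchange.bindQueue(function (message) { browserA.push(JSON.stringify(message)); });
exchange.bindQueue(function (message) { browserB.push(JSON.stringify(message)); });

// One publish reaches both browsers.
exchange.publish('', {action: 'message', user: 'alice', msg: 'hello'});
```

Because the real exchange lives inside RabbitMQ rather than inside one Node.js process, the same fan-out also reaches sockets connected to the other app instances.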

Running / Testing it on Cloud Foundry

  • Clone this app to rabbitpubsub folder
  • cd rabbitpubsub
  • npm install, then follow the instructions below to push the app to Cloud Foundry
[~/success/git/rabbitpubsub]
> vmc push rabbitpubsub
Instances> 4       <----- Run 4 instances of the server

1: node
2: other
Framework> node

1: node
2: node06
3: node08
4: other
Runtime> 3  <---- Choose Node.js 0.8v

1: 64M
2: 128M
3: 256M
4: 512M
Memory Limit> 64M

Creating rabbitpubsub... OK

1: rabbitpubsub.cloudfoundry.com
2: none
URL> rabbitpubsub.cloudfoundry.com  <--- URL of the app (choose something unique)

Updating rabbitpubsub... OK

Create services for application?> y

1: blob 0.51
2: mongodb 2.0
3: mysql 5.1
4: postgresql 9.0
5: rabbitmq 2.4
6: redis 2.6
7: redis 2.4
8: redis 2.2
What kind?> 5 <----- Select & Add RabbitMQ 2.4v service (for pub-sub)

Name?> rabbit-e1223 <-- This is just a random name for RabbitMQ service

Creating service rabbit-e1223... OK
Binding rabbit-e1223 to rabbitpubsub... OK

Create another service?> y

1: blob 0.51
2: mongodb 2.0
3: mysql 5.1
4: postgresql 9.0
5: rabbitmq 2.4
6: redis 2.6
7: redis 2.4
8: redis 2.2
What kind?> 6 <----- Select & Add Redis 2.6v service (for session store)

Name?> redis-e9771 <-- This is just a random name for Redis service

Creating service redis-e9771... OK
Binding redis-e9771 to rabbitpubsub... OK

Bind other services to application?> n

Save configuration?> n

Uploading rabbitpubsub... OK
Starting rabbitpubsub... OK
Checking rabbitpubsub... OK

  • Once the server is up, open up multiple browsers and go to <servername>.cloudfoundry.com
  • Start chatting.

Tests

Test 1

  • While chatting, refresh the browser.
  • You should automatically be logged in.

Test 2

  • Open up JS debugger (On Chrome, do cmd + alt +j )
  • Restart the server by doing vmc restart <appname>
  • Once the server restarts, Socket.io should automatically reconnect
  • You should be able to chat after the reconnection.

That’s it for now. Hopefully this blog helps you get started with RabbitMQ. Look out for more Node.js and RabbitMQ related blogs. The content of this blog has also been covered in a video. Feel free to get in touch with us with questions on the material.


General Notes

  • Get the code right away – Github location: https://github.com/rajaraodv/rabbitpubsub.
  • Deploy right away – if you don’t already have a Cloud Foundry account, sign up for it here.
  • Check out Cloud Foundry getting started here and install the vmc Ruby command line tool to push apps.
  • To install the latest alpha or beta vmc tool run: sudo gem install vmc --pre.

Build a Real Time Activity Stream on Cloud Foundry with Node.js, Redis and MongoDB 2.0 – Part III

In Part II of this series, we covered the architecture needed for persisting the Activity Streams to MongoDB and fanning it out in real-time to all the clients using Redis PubSub.

Since then, some exciting new Node.js features for Cloud Foundry were launched. In addition, the MongoDB version on Cloud Foundry has been upgraded to 2.0.

In this blog post we will cover how to:

  • Use Mongoose-Auth to store basic user information, including information from Facebook, Twitter, and Github, and how we made this module with native dependencies work on Cloud Foundry
  • Use Mongo GridFS and ImageMagick to store user uploaded photos and profile pictures
  • Perform powerful stream filtering, thanks to new capabilities exposed in MongoDB 2.0
  • Update the UX of the app to become a real-time stream client using Bootstrap, Backbone.js and Jade.

Offering SSO and Persisting Users

The requirement for this boilerplate Activity Streams App was to allow users to log in with Facebook, Twitter, or Github, and to persist this user data in the database.

I am sure many of you are familiar with how to store user information in a database, perhaps even in MongoDB. What some of you may not have tried is storing third-party user information, such as the data we obtain when users log in with Facebook, Twitter or Github. In a relational database, you would probably store this information across multiple records in multiple tables (e.g., users, services, accounts, or auth). However, with MongoDB you can use embedded documents and store all this information in a single document, thus reducing the complexity of the operation. I found that this was very easy to do using @bnoguchi’s mongoose-auth, which decorates the Mongoose User Schema with the third-party services fields.

My previous version of the app was using another popular module from Brian called everyauth which handled SSO very well, but it did not persist the user info. It was fairly straightforward to upgrade from everyauth to mongoose-auth.

First, I updated a helper module I created called activity-streams-mongoose to offer a User Schema and made it possible to extend all schemas. Then I loaded mongoose-auth and decorated that schema with the specifics needed. You can see the exact code changes in this diff. The key part in the upgrade was to normalize the user data as it was saved. I did this by leveraging the pre-save callback MongooseJS offers.

Here is a snippet of the code:

var streamLib = require('activity-streams-mongoose')({
  mongoUrl: app.siteConf.mongoUrl,
  redis: app.siteConf.redisOptions,
  defaultActor: defaultAvatar
});

var authentication = new require('./authentication.js')(streamLib, app.siteConf);

// Moved normalization to only be done on pre save
streamLib.types.UserSchema.pre('save', function (next) {
  var user = this;
  var svcUrl = null;
  if (user.fb && user.fb.id) {
    user.displayName = "FB: " + user.fb.name.full;
    asmsDB.ActivityObject.findOne().where('url', facebookHash.url).exec(function(err, doc){
      if (err) throw err;
      user.author = doc._id;
      // Need to fetch the users image...
      https.get({
        'host': 'graph.facebook.com'
        , 'path': '/me/picture?access_token='+ user.fb.accessToken
      }, function(response) {
        user.image = {url: response.headers.location};
        next();
      }).on('error', function(e) {
        next();
      });
    })
  } else  {
    if (user.github && user.github.id) {
      user.displayName = "GitHub: " + user.github.name;
      var avatar = 'http://1.gravatar.com/avatar/'+ user.github.gravatarId + '?s=48'
      user.image = {url: avatar};
      svcUrl = githubHash.url;
    } else if (user.twit && user.twit.id) {
      user.displayName = "Twitter: " + user.twit.name;
      user.image = {url: user.twit.profileImageUrl};
      svcUrl = twitterHash.url;
    }

    if(!user.actor) {
      asmsDB.ActivityObject.findOne().where('url', svcUrl).exec(function(err, doc){
        user.author = doc;
        next();
      });
    } else {
      next();
    }
  }
  });

var asmsDB = new streamLib.DB(streamLib.db, streamLib.types);
streamLib.asmsDB = asmsDB;
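The pre-save hook mixes two concerns: deciding what the display name and image should be, and performing asynchronous lookups. As a sketch of how the first concern could be pulled into a pure function that is easy to unit test, consider the following. describeUser is a hypothetical helper and is not part of activity-streams-mongoose; the field names are the ones used in the snippet above. The Facebook image is left as null because the hook fetches it over HTTPS.

```javascript
// Hypothetical helper: the synchronous part of the normalization,
// with no database or HTTP calls.
function describeUser(user) {
    if (user.fb && user.fb.id) {
        // The Facebook picture needs a network request, so it is not resolved here.
        return {displayName: 'FB: ' + user.fb.name.full, image: null};
    }
    if (user.github && user.github.id) {
        return {
            displayName: 'GitHub: ' + user.github.name,
            image: {url: 'http://1.gravatar.com/avatar/' + user.github.gravatarId + '?s=48'}
        };
    }
    if (user.twit && user.twit.id) {
        return {
            displayName: 'Twitter: ' + user.twit.name,
            image: {url: user.twit.profileImageUrl}
        };
    }
    return {displayName: null, image: null};
}
```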

MongoDB 2.0 and Cloud Foundry

For those of you not familiar with MongoDB 2.0, one neat feature is that it supports Multi-location Documents. I was also able to add a location property to the core Activity Object in the activity-streams-mongoose module which can be used with the Activities, Activity Objects and User collections to allow performing Geo Queries.

var LocationHash = {
  displayName: {type: String},
  position: {
    latitude: Number,
    longitude: Number
  }
};

var ActivityObjectHash = {
  id: {type: String},
  image: {type: MediaLinkHash, default: null},
  icon: {type: MediaLinkHash, default: null},
  displayName: {type: String},
  summary: {type: String},
  content: {type: String},
  url: {type:String},
  published: {type: Date, default: null},
  objectType: {type: String},
  updated: {type: Date, default: null},
  location: LocationHash,
  fullImage : {type: MediaLinkHash, default: null},
  thumbnail : {type: MediaLinkHash, default: null},
  author : {type: ObjectId, ref: "activityObject"},
  attachments : [{type: ObjectId, ref: 'activityObject'}],
  upstreamDuplicates : [{type: String, default: null}],
  downstreamDuplicates : [{type: String, default: null}]
};

Another interesting geo feature in MongoDB 2.0 is polygonal search. This means that you can search whether a given object is in a specified area by providing the area points. For example, this can be helpful when you want to see if certain objects, like houses, are in a certain zip code.
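To illustrate what a polygonal search answers, here is a small ray-casting sketch that checks whether a point lies inside an area described by its corner points. This only demonstrates the concept on a flat plane in plain JavaScript. MongoDB performs its own check server-side, and pointInPolygon is a hypothetical helper, not a MongoDB or Mongoose API.

```javascript
// Ray-casting test; polygon is an array of [x, y] pairs (e.g. [longitude, latitude]).
function pointInPolygon(point, polygon) {
    var x = point[0], y = point[1];
    var inside = false;
    // Walk every edge (j -> i) and count how many cross a ray going right from the point.
    for (var i = 0, j = polygon.length - 1; i < polygon.length; j = i++) {
        var xi = polygon[i][0], yi = polygon[i][1];
        var xj = polygon[j][0], yj = polygon[j][1];
        var crosses = (yi > y) !== (yj > y) &&
            x < (xj - xi) * (y - yi) / (yj - yi) + xi;
        if (crosses) {
            inside = !inside; // an odd number of crossings means the point is inside
        }
    }
    return inside;
}
```

For a square area with corners [0, 0], [10, 0], [10, 10] and [0, 10], a house at [5, 5] is inside and one at [15, 5] is not.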

Working with Node.js Modules with Native Dependencies

Mongoose-Auth requires the bcrypt module, which has a native dependency (it gets compiled locally when you run a Node Package Manager (npm) install, and the binary is placed in the node_modules directory). If you are working from a Mac or Windows and deploying to the cloud, you can run into issues by including your node_modules folder. Luckily, there is now support in Cloud Foundry for excluding the node_modules folder and having Cloud Foundry fetch and build the npm modules server-side.

$ npm shrinkwrap
$ echo '{"ignoreNodeModules": true}' > cloudfoundry.json
$ vmc update

For more info you can read this blog post from Maria and/or watch this helpful video from Raja.

My recommendation is that when you start a new Node.js app, you add the cloudfoundry.json file with ignoreNodeModules set to true so that all the native dependencies are built directly on Cloud Foundry. Also, don’t forget to run npm shrinkwrap if you change package.json.

User Uploaded Photos with ImageMagick and Mongo GridFS

Photos are among the most engaging objects to show in a web app like this Activity Stream boilerplate app. Apps like Instagram and Pinterest have taken photo sharing to a whole new level and have completely redesigned the UX of photo feeds. We wanted to help developers build Activity Stream apps with rich photo sharing, and thus needed a library to help us manipulate images and a place to store all these images. Since we were already using MongoDB, I decided to leverage Mongo GridFS to store the images.

I had previously worked on storing photos in GridFS, but from Ruby. It was a little more challenging to find the right tools in Node.js. I found a lot of npm modules which seemed to handle it for me, but they were either unfinished or loaded several additional components which were incompatible. I really wanted to keep it simple, so I ended up following the documentation for the official Node.js MongoDB driver and creating a few routes to handle creating and viewing photos.

Here is how I ingested the photos into Mongo GridFS:

var im = require('imagemagick');
var Guid = require('guid');
var siteConf = require('./lib/getConfig');
var lib = new require('./lib/asms-client.js')(app, cf).streamLib;

function ingestPhoto(req, res, next){
  if (req.files.image) {
    im.identify(req.files.image.path, function(err, features){
      if (features && features.width) {
        var guid = Guid.create();
        // Concatenating name to guid guarantees that we always have
        // unique file names
        var fileId = guid + '/' + req.files.image.name;
        // The GridStore class is the equivalent of a File class but has the
        // added benefit of allowing you to store metadata
        var gs = lib.GridStore(lib.realMongoDB, fileId, "w", {
          content_type : req.files.image.type,
          // metadata is optional
          metadata : {
            author: req.session.user._id,
            public : false,
            filename: req.files.image.name,
            path: req.files.image.path,
            width: features.width,
            height: features.height,
            format: features.format,
            size_kb: req.files.image.size / 1024 | 0
          }
        });
         // This command copies the file from the file system(temp dir)
         // to GridFS.
         // GridFS supports any file size by breaking it into chunks
         // behind the scenes
        gs.writeFile(req.files.image.path, function(err, doc){
          if (err) {
            next(err);
          } else {
            if (! req.photosUploaded) {
              req.photosUploaded = {};
            }
            // I have another express route to serve the photos by fileId
            var url = siteConf.uri + "/photos/" + fileId;
            // Add the results of the upload to the chain
            req.photosUploaded['original'] = {url : url, metadata: gs.metadata};
            req.nextSizeIndex = 0;
            next();
          }
        });
      } else {
        // Pass the failure to Express instead of throwing inside an async callback
        next(err || new Error("Cannot get width for photo"));
      }
    });
  } else {
    next(new Error("Could not find the file"));
  }
};

The above snippet shows how I used ImageMagick to get the photo dimensions. ImageMagick is an amazing open source software suite for manipulating images and there is an easy to use ImageMagick node module to expose its functionality.

An open source project backed by years of continual development, ImageMagick supports about 100 image formats and can perform impressive operations such as creating images from scratch; changing colors; stretching, rotating, and overlaying images; and overlaying text on images.

ImageMagick.org

And this snippet shows how to produce a new image of smaller size:

var im = require('imagemagick');
var Guid = require('guid');

function reducePhoto(req, res, next){
    var photoIngested = req.photosUploaded['original'];
    if (photoIngested) {
        var sizeName = sizes[req.nextSizeIndex].name;
        var destPath = photoIngested.metadata.path + '-' + sizeName ;
        var nameParts = photoIngested.metadata.filename.split('.');
        var newName = nameParts[0] + '-' + sizeName + '.' + nameParts[1];
        var width = sizes[req.nextSizeIndex].width;

        im.resize({
          srcPath: photoIngested.metadata.path,
          dstPath: destPath,
          width:   width
        }, function(err, stdout, stderr){
          if (err) {
              next(err);
          } else {
            console.log("The photo was resized to " + width + "px wide");
            var guid = Guid.create();
            var fileId = guid + '/' + newName;
            var ratio = photoIngested.metadata.width / width;
            var height = photoIngested.metadata.height / ratio;
            var gs = asmsClient.streamLib.GridStore(asmsClient.streamLib.realMongoDB, fileId, "w", {
                  content_type : req.files.image.type,
                  metadata : {
                      author: req.session.user._id,
                      public : false,
                      filename: newName,
                      width: width,
                      height: height,
                      path: destPath
                  }
              });
              gs.writeFile(destPath, function(err, doc){
                  if (err) {
                    next(err);
                  } else {
                      var url = siteConf.uri + "/photos/" + fileId;
                      req.photosUploaded[sizeName] = {url : url, metadata: gs.metadata};
                      req.nextSizeIndex = req.nextSizeIndex + 1;
                      next();
                  }
              });
          }
        });
    }
};
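Two small details in the resize step are worth a closer look: splitting the file name on the first dot misbehaves for names such as holiday.beach.jpg or names without an extension, and the computed height can end up fractional. A minimal sketch of helpers that handle both follows. resizedName and scaledHeight are hypothetical names and are not part of the original app.

```javascript
var path = require('path');

// Hypothetical helper: 'holiday.beach.jpg' + 'small' -> 'holiday.beach-small.jpg'.
// Only the last extension is treated as the extension; names without one are kept as is.
function resizedName(filename, sizeName) {
    var ext = path.extname(filename);
    var base = path.basename(filename, ext);
    return base + '-' + sizeName + ext;
}

// Hypothetical helper: keep the aspect ratio and round to whole pixels.
function scaledHeight(originalWidth, originalHeight, newWidth) {
    return Math.round(originalHeight * newWidth / originalWidth);
}
```

In reducePhoto, these could stand in for the nameParts handling and the ratio arithmetic.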

The only gotcha on Cloud Foundry was that it did not set the TMP environment variable, which the formidable module uses to locate the temp directory where files are first uploaded. Once I set it using env-add, the problem was solved.

bash-3.2$ vmc files asms app/tmp

   36272c476f10ecbf0e3a99481a8d365b         50.6K

All apps on Cloud Foundry have permission to write files to their own directory or a subdirectory. Set the environment variable TMP to a subdirectory if you are working with express/formidable so that it can handle form uploads.

More Powerful Queries with MongoDB

Part of the beauty of the ActivityStrea.ms format is that it provides a lot of metadata about each activity which can then be searched, aggregated and pivoted to draw interesting conclusions about certain topics and trends.

Examples of these fields are: Hashtags or Topics, Verb, Object Types, Actor Types and Location.

The first step to allowing users to analyze the stream data is providing them with a map of their universe. This means allowing them to see all the possible values for each field. For example, if we are talking about hashtags, then we would show our users that the population has used ten hashtags so far. We would then reveal the distribution in usage across everyone. Then we would provide our users with the ability to drill in by segmenting via actor type or location, for example. This could yield interesting results showing where certain topics are most popular.

var getDistinct = function (req, res, next, term, init){
  var key = 'used.' + term;
  req[key] = init ? init : [];
  var query = {streams: req.session.desiredStream};
  asmsDB.Activity.distinct(term, query, function(err, docs) {
    if (!err && docs) {
      _.each(docs, function(result){
        req[key].push(result);
      });
      next();
    } else {
      next(new Error('Failed to fetch distinct ' + term));
    }
  });
}

//..

function getDistinctVerbs(req, res, next){
  getDistinct(req, res, next, 'verb');
};

function getDistinctActors(req, res, next){
  getDistinct(req, res, next, 'actor');
};

function getDistinctObjects(req, res, next){
  getDistinct(req, res, next, 'object', ['none']);
};

function getDistinctObjectTypes(req, res, next){
  getDistinct(req, res, next, 'object.object.type', ['none']);
};

function getDistinctActorObjectTypes(req, res, next){
  getDistinct(req, res, next, 'actor.object.type', ['none']);
};

//...

app.get('/streams/:streamName', loadUser, getDistinctStreams, getDistinctVerbs, getDistinctObjects, getDistinctActors,
  getDistinctObjectTypes, getDistinctActorObjectTypes, getDistinctVerbs, getMetaData, function(req, res) {

    asmsClient.asmsDB.Activity.getStream(req.params.streamName, 20, function (err, docs) {
    var activities = [];
    if (!err && docs) {
      activities = docs;
    }
    req.streams[req.params.streamName].items = activities;
    var data = {
      currentUser: req.user,
      streams : req.streams,
      desiredStream : req.session.desiredStream,
      actorTypes: req.actorTypes,
      objectTypes : req.objectTypes,
      verbs: req.verbs,
      usedVerbs: req['used.verb'],
      usedObjects: req['used.object'],
      usedObjectTypes: req['used.object.type'],
      usedActorObjectTypes: req['used.actor.object.type'],
      usedActors: req['used.actor']
    };
    if (req.is('json')) {
      res.json(data);

    } else {
       res.render('index', data);
    }
  });

});
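The distinct queries above return the set of values in use, but not how often each one appears. As a sketch of the distribution step, here is an in-memory counterpart that counts occurrences per value. countByField is a hypothetical helper working on plain objects. In a real app you would more likely ask the database to do the counting. The 'none' label for missing values follows the convention used in the snippet above.

```javascript
// Illustrative in-memory counterpart; field is a dotted path like 'object.objectType'.
function countByField(activities, field) {
    var counts = {};
    activities.forEach(function (activity) {
        // Walk the dotted path, stopping early if an intermediate value is missing.
        var value = field.split('.').reduce(function (obj, key) {
            return obj == null ? undefined : obj[key];
        }, activity);
        var label = value === undefined || value === null ? 'none' : String(value);
        if (Object.prototype.hasOwnProperty.call(counts, label)) {
            counts[label] += 1;
        } else {
            counts[label] = 1;
        }
    });
    return counts;
}
```

For example, counting by 'verb' over three posts and one share yields an object with post set to 3 and share set to 1.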

A Robust Activity Stream UX

The initial node-express-boilerplate app had some basic jQuery used to show plain text messages and users’ photos. In the new app, we have much richer messages and the ability to post and filter them. For this reason, we decided to use some of the great client-side open source tools available today.

After some consideration, we ended up using these three tools:

  1. Backbone.js: A lightweight client-side MVC framework
  2. Bootstrap: A set of CSS, HTML and Javascript components which help developers produce great looking apps, without needing to start from scratch
  3. Jade: A templating language with the help of ClientJade

Templating, Markup and CSS

If you are a web developer, you probably know that there are many choices in tools to render HTML dynamically. A good number of web developers prefer to use templating engines to render HTML because these help you produce more readable code. Using declarative programming, you can interpolate variables and directives in the HTML. The most popular templating engine for Node.js is Embedded JavaScript (EJS), which resembles ERB in Ruby. This is what the node-express-boilerplate project included. When I started working with Node.js, I found many more choices that were not present in the Ruby world such as: Mustache, Handlebars, Dust and Jade. In fact, LinkedIn wrote an excellent blog post discussing the many alternative choices for templating engines.

I ended up selecting Jade because I already liked HAML, which is very similar to Jade in its terseness. Both templating languages use indentation to understand the hierarchy of elements. Jade is even terser than HAML because it removes the need to put % in front of the HTML tags.

Another cool thing about Jade is that it already had support for server-side and client-side rendering via ClientJade. Here is how I broke out the views allowing easy extension of the object types.

I then compiled the Jade views into js for faster client-side rendering with ClientJade.

clientjade views/*.* > public/js/templates.js

Once this was done, I simply included templates.js in the list of files to be minified and used it like this from Backbone:

var ActivityCreateView = Backbone.View.extend({
    el: '#new_activity',
    initialize: function(){
        _.bindAll(this, 'newAct', 'render', 'changeType', 'includeLocation', 'sendMessage');

        this.trimForServer = App.helper.trimForServer;

        var streamName = this.$el.find('#streamName').val();
        var verb = this.trimForServer(this.$el.find('#verb-show'));
        var objectType = this.trimForServer(this.$el.find('#object-show'));

        this.newAct(streamName, verb, objectType);
        this.render();
    },
    events: {
        "click .type-select" : "changeType",
        "click #includeLocation" : "includeLocation",
        "click #send-message" : "sendMessage"
    },
    newAct : function(streamName, verb, objectType) {
        this.streamName = streamName;
        this.model = new Activity({
            object: {content: '', objectType: objectType, title: '', url: ''},
            verb: verb,
            streams: [streamName]
        });
    },
    render: function(){
      var actData = this.model.toJSON();
      this.$el.find("#specific-activity-input").html(jade.templates[actData.object.objectType]());

      return this; // for chainable calls, like .render().el
    },
    changeType : function(event) {
        console.log(event);
        var itemName = $(event.target).data("type-show");
        if (itemName) {
            $("#" + itemName)[0].innerHTML = event.target.text + "  ";
            var val = this.trimForServer(event.target.text);
            if (itemName == "verb-show") {
                this.model.set('verb', val);
            } else {
                var obj = this.model.get('object');
                obj.objectType = val;
                this.model.set('object', obj);
            }
        }
        this.render();
    },
    includeLocation : function(event) {
        if (navigator.geolocation) {
            navigator.geolocation.getCurrentPosition(App.helper.getLocation);
        } else {
            alert("Geo Location is not supported on your device");
        }
    },
    sendMessage : function() {
        console.log("In send message");

        var obj = this.model.get('object');
        obj.content = $("#msg").val();
        obj.url = $('#url').val();
        obj.title = $('#title').val();
        obj.objectType = this.trimForServer($('#object-show'));
        this.model.set('object', obj);

        var streamName = $('#streamName').val();
        this.model.set('streams', [streamName]);

        var verb = this.trimForServer($('#verb-show'));
        this.model.set('verb', verb);

        if (this.model.isValid()) {
            if (this.model.save()) {
                this.newAct(streamName, verb, obj.objectType);
                this.render();
            }
        }

    }

});

The original node-express-boilerplate app was using 960gs and jQuery. However, this activity streams boilerplate app is a bit more complex, so I switched to Twitter’s Bootstrap as the first step. This gave me a nice way to build the nav bar, hero unit, modals, drop-downs and so on. It was also easy enough to go from one grid system to another. For the moment the app uses the default grid system, but it can easily be made to use Bootstrap’s Fluid Layout and Responsive Design enhancements.

Manipulating Data on the Client with Backbone.js

Instead of having a large number of individual jQuery handlers on HTML elements, Backbone.js helps you break your UX apart into components called Views, which are closer to Controllers in server-side MVC frameworks. These Backbone views can take Backbone models and templates and render them, as well as listen for events on the elements that make up the view. In the snippet above, a Backbone view works with a Backbone model for an Activity.

Backbone Models are fairly simple classes that you create by listing the model's properties and validation rules. Here is the code for the Activity Backbone Model, which is also used to create the ActivityStreamView:

var Activity = Backbone.Model.extend({
    url : "/activities",
    // From activity-streams-mongoose/lib/activityMongoose.js
    defaults: {
        verb: 'post',
        object: null, //ActivityObject
        actor: null, //ActivityObject
        url: '',
        title: '',
        content: '',
        icon: null, // MediaLinkHash
        target: null, //ActivityObject
        published: Date.now,
        updated: Date.now,
        inReplyTo: null, //Activity
        provider: null, //ActivityObject
        generator: null, //ActivityObject
        streams: ['firehose'],
        likes: {},
        likes_count: 0,
        comments: [],
        comments_count: 0,
        userFriendlyDate: 'No idea when'
    },
    validate: function(attrs) {
        if (!attrs.object) {
            return "Object is missing";
        }
        if (!attrs.object.title) {
            return "Title is missing";
        }
    }
});
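To see what these validation rules accept and reject, here is a minimal sketch that copies the validate logic into a plain function so it can be run without Backbone. The name validateActivity and the sample attributes are made up for illustration. Note that because the defaults set object to null, a freshly created Activity fails validation until an object with a title is set.

```javascript
// Standalone copy of the validate rules above (validateActivity is a
// hypothetical name). It returns an error string when the attributes are
// invalid and undefined when they are valid.
function validateActivity(attrs) {
  if (!attrs.object) {
    return "Object is missing";
  }
  if (!attrs.object.title) {
    return "Title is missing";
  }
}

// Sample attributes, invented for this example.
var missingObject = validateActivity({ verb: 'post', object: null });
var missingTitle = validateActivity({ verb: 'post', object: { title: '' } });
var valid = validateActivity({ verb: 'post', object: { title: 'SCALA 2012 Recap' } });

console.log(missingObject); // Object is missing
console.log(missingTitle);  // Title is missing
console.log(valid);         // undefined
```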

You can then instantiate a model by passing it a plain JavaScript object. In the example below, I took the output from socket.io when a message comes in, converted it to an object and added it to the Backbone collection associated with the Stream View:

var streamView = new ActivityStreamView();
App.socketIoClient.on('message', function(json) {
    var doc = JSON.parse(json);
    if (doc) {
        streamView.collection.add(new Activity(doc));
    }
});
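One thing to keep in mind with a handler like this is that JSON.parse throws on malformed input. The following is a small sketch of a defensive parser that could sit in front of the collection; parseActivity is a hypothetical helper, not part of the app or of socket.io.

```javascript
// Hypothetical helper: parse an incoming payload without throwing.
// Returns the parsed object, or null if the payload is malformed JSON
// or is not a plain (non-array, non-null) object.
function parseActivity(json) {
  try {
    var doc = JSON.parse(json);
    if (doc !== null && typeof doc === 'object' && !Array.isArray(doc)) {
      return doc;
    }
    return null;
  } catch (e) {
    return null; // malformed JSON
  }
}

var good = parseActivity('{"verb":"post","object":{"title":"Hello"}}');
var bad = parseActivity('{not json');

console.log(good.verb); // post
console.log(bad);       // null
```

With a helper like this, the message handler could add the activity only when the result is not null.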

Working with Backbone.js was very fun but it definitely takes some time to convert all your logic to using Backbone Views and Models. In this example, I only used Backbone.js for a subset of the app. Having Jade as my templating language and Node.js allowed me to share code between server and client. If you have a trivial application, you may not need to use Backbone.js and may be able to keep it simple with express and server side templates. In the case of this Activity Streams application, which syndicates in real time and offers the ability to react to any new item, it made sense to use Backbone.js. It also allowed me to provide the hooks for the next iteration of this app. After all, this is a boilerplate app.

Remember that it is very easy to push updates to your application on Cloud Foundry as you make progress:

bash-3.2$ vmc update

Updating application 'asms'...
Uploading Application:
  Checking for available resources: OK
  Processing resources: OK
  Packing application: OK
  Uploading (380K): OK   
Push Status: OK
Stopping Application 'asms': OK
Staging Application 'asms': OK                                                  
Starting Application 'asms': OK

Conclusion

As a contributor to the ActivityStrea.ms specification, I find it necessary (and fun) to get my hands dirty building the apps which use open standards to see where there are limitations and what technologies can make it easier. Working with MongoDB proved to be the right choice, giving me the ability to do complex queries, aggregation and full modeling of my objects which are needed for quickly painting the stream. I am really happy that MongoDB 2.0 is now running on Cloud Foundry because a lot of the Object Document Mappers like Mongoid in Ruby only support 2.x. This is a very exciting time to be a developer, as things are moving very fast and there is ample opportunity to make a difference via open source.

Here is the final architecture of the app at a very high-level:

Activity Streams Boilerplate App Architecture

To try this application, you can visit it here. To get the full code, you can clone or fork it here.

–Monica Wilkinson


New Runtime Module for Node.js Applications

In the previous blog post, Cloud Foundry Now Supports Auto-Reconfiguration for Node.js Applications, we saw that Node.js applications deployed to CloudFoundry.com can be automatically reconfigured to connect to Cloud Foundry services. However, there may be situations where you want to opt out of that feature to have finer control over service connections or to overcome its limitations. In those cases, your application has to parse JSON-formatted environment variables itself to do the same job. While this is not overly complex, given that JSON is trivial to parse with JavaScript, you still need to understand the environment variable names and their payload schema. The new cf-runtime module introduced in this post simplifies this by providing a way to obtain application information and service connection objects. This module moves Cloud Foundry’s Node.js support forward to match the support for Java and Ruby applications.
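To give a feel for the manual parsing that cf-runtime replaces, here is a rough sketch. It assumes the services are exposed in a VCAP_SERVICES environment variable whose JSON payload maps a versioned label (such as redis-2.2) to a list of service entries with a credentials object; the exact schema, the helper name findServiceCredentials and the sample values are assumptions for illustration.

```javascript
// Sketch: return the credentials of the first bound service whose label
// starts with the given type. The VCAP_SERVICES shape is an assumption.
function findServiceCredentials(env, type) {
  if (!env.VCAP_SERVICES) {
    return null; // not running in the cloud, or no services bound
  }
  var services = JSON.parse(env.VCAP_SERVICES);
  var labels = Object.keys(services);
  for (var i = 0; i < labels.length; i++) {
    if (labels[i].indexOf(type) === 0 && services[labels[i]].length > 0) {
      return services[labels[i]][0].credentials;
    }
  }
  return null;
}

// Made-up sample payload, standing in for process.env.
var sampleEnv = {
  VCAP_SERVICES: JSON.stringify({
    'redis-2.2': [{
      name: 'redis-service-name',
      label: 'redis-2.2',
      credentials: { host: '10.0.0.5', port: 5001, password: 'secret' }
    }]
  })
};

var creds = findServiceCredentials(sampleEnv, 'redis');
console.log(creds.host + ':' + creds.port); // 10.0.0.5:5001
```

In a real application you would pass process.env instead of sampleEnv. Every application would have to carry code like this, which is the duplication the module is meant to remove.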

Installation

Cf-runtime is available in the npm registry and can be easily installed with the Node Package Manager (npm). Run the following command in the base directory of your Node.js application.

npm install cf-runtime

Usage

This node module provides access to two types of objects. The first is a preconfigured object named CloudApp that contains application information. This includes the application’s host and port configured by Cloud Foundry, list of services bound to the application and their properties.

Additionally, each service that is bound to the application can be accessed via a <ServiceName>Client object (e.g. RedisClient, MysqlClient). This object provides a convenient way to obtain the corresponding service connection with a single function call. You can create a service connection either by the name used to create the service instance or by a general service name (e.g. redis or mongo) if only one service of that type is bound to your application. This function may also accept additional parameters depending on the node module it uses (see the Service Clients section below).

var cf = require('cf-runtime')
var app = cf.CloudApp

// Check if application is running in Cloud Foundry

app.runningInCloud

// Get application properties

app.host
app.port

// Get the list of application service names

app.serviceNames

// Obtain connection properties for single service of type Redis

app.serviceProps.redis

// Obtain connection properties for service named 'redis-service-name'

app.serviceProps['redis-service-name']

// Obtain the list of service names of specific type

app.serviceNamesOfType.redis

// Check if service of the given type is available

cf.RedisClient !== undefined

// Connect to a single service of type Redis

var redisClient = cf.RedisClient.create()

// Connect to redis service named 'redis-service-name'

var redisClient = cf.RedisClient.createFromSvc('redis-service-name')

Service Properties

All services have the following common properties:

  • name: specific name of the service
  • label: name of service type, for example “redis”, “mysql”
  • version: software version of the service type
  • host
  • port
  • username
  • password
  • url: service connection url

Additionally, PostgreSQL, MySQL and Redis include this service property:

  • database: the name of the database that is provided by the service

RabbitMQ provides access to this additional property:

  • vhost: the name of the virtual host

MongoDB provides access to this additional property:

  • db: the database name

Service Clients

The following list shows the node module, return value, available functions and parameters for each service type:

MongoDB
  • Node module: mongodb
  • Returns: null
  • Functions:
      cf.MongoClient.create([options], callback)
      cf.MongoClient.createFromSvc(name, [options], callback)
  • Parameters:
      name: the name of a service bound to the app
      options: optional {object} non-connection related options
      callback: {function} connection callback

MySQL
  • Node module: mysql
  • Returns: Mysql client instance
  • Functions:
      cf.MysqlClient.create([options])
      cf.MysqlClient.createFromSvc(name, [options])
  • Parameters:
      name: the name of a service bound to the app
      options: optional {object} non-connection related options

PostgreSQL
  • Node module: pg
  • Returns: {boolean}
  • Functions:
      cf.PGClient.create(callback)
      cf.PGClient.createFromSvc(name, callback)
  • Parameters:
      name: the name of a service bound to the app
      callback: {function} connection callback

RabbitMQ
  • Node module: amqp
  • Returns: AMQP client instance
  • Functions:
      cf.AMQPClient.create([implOptions])
      cf.AMQPClient.createFromSvc(name, [implOptions])
  • Parameters:
      name: the name of a service bound to the app
      implOptions: optional {object} non-connection related implementation options

Redis
  • Node module: redis
  • Returns: Redis client instance
  • Functions:
      cf.RedisClient.create([options])
      cf.RedisClient.createFromSvc(name, [options])
  • Parameters:
      name: the name of a service bound to the app
      options: optional {object} non-connection related options

Summary

The main purpose of cf-runtime is to make your Node.js applications understand their cloud better, retrieve the environment properties, find the available services, and connect to them easily. If you are writing Node.js applications, cf-runtime just made it easier to deploy your applications to Cloud Foundry.

- Maria Shaldibina
The Cloud Foundry Team

Don’t have a Cloud Foundry account yet?  Sign up for free today


Cloud Foundry Now Supports Auto-Reconfiguration for Node.js Applications

Cloud Foundry has long supported auto-reconfiguration for Spring and Ruby applications. Now we are pleased to add auto-reconfiguration support for Node.js applications as well. Deploying Node.js applications to Cloud Foundry previously required parsing environment variables and rewriting server and service connection calls to use Cloud Foundry specific parameters. This approach was not intuitive to developers who were just starting to use Cloud Foundry to deploy their applications. They had to consult the documentation and figure out what port and host to connect to. Moreover, if an application used services, developers had to configure it to use the proper service connection parameters.

Auto-Reconfiguration in Action

Let’s look at a basic Node.js application. We are going to take some sample code from the official Node.js website homepage and save it to a file called app.js:

var http = require('http');
http.createServer(function (req, res) {
 res.writeHead(200, {'Content-Type': 'text/plain'});
 res.end('Hello World\n');
}).listen(1337, '127.0.0.1');
console.log('Server running at http://127.0.0.1:1337/');

As we can see, this code sets up your server to listen on your local port 1337. What if we now push this application to CloudFoundry.com ‘as-is’?

$ vmc push example-app
Would you like to deploy from the current directory? [Yn]:
Detected a Node.js Application, is this correct? [Yn]:
Application Deployed URL [example-app.cloudfoundry.com]:
Memory reservation (128M, 256M, 512M, 1G, 2G) [64M]:
How many instances? [1]:
Bind existing services to 'example-app'? [yN]:
Create services to bind to 'example-app'? [yN]:
Would you like to save this configuration? [yN]:
Creating Application: OK
Uploading Application:
 Checking for available resources: OK
 Packing application: OK
 Uploading (0K): OK   
 Push Status: OK
Staging Application 'example-app': OK                                           
Starting Application 'example-app': OK                                          

$ curl example-app.cloudfoundry.com
Hello World

We can see that the application is up and running. But how is this possible if we didn’t configure it to listen on a Cloud Foundry application-specific port? This is when auto-reconfiguration comes into play. It automatically detects and modifies server and service connection parameters, so that the application can run and connect to Cloud Foundry services without manually specifying configuration values. As a result, an application that is developed and tested locally can work seamlessly on CloudFoundry.com without any code changes.
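For comparison, here is roughly what the same sample would need without auto-reconfiguration. This is only a sketch: it assumes the platform passes the listen address through the VCAP_APP_PORT and VCAP_APP_HOST environment variables, and the helper names listenAddress and startServer are made up for this example.

```javascript
var http = require('http');

// Pick the listen address from the environment, falling back to the
// values hard-coded in the original sample. The variable names
// VCAP_APP_PORT and VCAP_APP_HOST are assumptions.
function listenAddress(env) {
  return {
    port: Number(env.VCAP_APP_PORT) || 1337,
    host: env.VCAP_APP_HOST || '127.0.0.1'
  };
}

// Hypothetical wrapper: start the Hello World server on the resolved address.
function startServer(env) {
  var address = listenAddress(env);
  var server = http.createServer(function (req, res) {
    res.writeHead(200, {'Content-Type': 'text/plain'});
    res.end('Hello World\n');
  });
  server.listen(address.port, address.host);
  return server;
}

console.log(listenAddress({}));
console.log(listenAddress({ VCAP_APP_PORT: '8080', VCAP_APP_HOST: '0.0.0.0' }));
```

Calling startServer(process.env) would start the server. Auto-reconfiguration removes the need for this kind of environment handling in application code.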

This was only a basic example of auto-reconfiguration in action. Let’s take a look at a more complex application that needs a database service to run. We are going to create our application using the content management system Calipso. It is based on the Express framework and uses the MongoDB database.

First, we pull the Calipso source from GitHub and install its dependencies. As Calipso depends on a native module, bcrypt, we should use Cloud Foundry’s npm support feature that recently became available. Following that blog post on npm support, we create npm-shrinkwrap.json and set “ignoreNodeModules” in cloudfoundry.json.

That’s it! Our application is ready to be deployed to CloudFoundry.com. As we deploy the application, we will be creating and binding a MongoDB service to the application.

$ vmc push calipso-app --runtime=node06
Would you like to deploy from the current directory? [Yn]:
Detected a Node.js Application, is this correct? [Yn]:
Application Deployed URL [calipso-app.cloudfoundry.com]:
Memory reservation (128M, 256M, 512M, 1G, 2G) [64M]: 128M
How many instances? [1]:
Bind existing services to 'calipso-app'? [yN]:
Create services to bind to 'calipso-app'? [yN]: y
1: mongodb
2: mysql
3: postgresql
4: rabbitmq
5: redis
6: vblob
What kind of service?: 1
Specify the name of the service [mongodb-c88a9]:
Create another? [yN]:
Would you like to save this configuration? [yN]:
Creating Application: OK
Creating Service [mongodb-c88a9]: OK
Binding Service [mongodb-c88a9]: OK
Uploading Application:
 Checking for available resources: OK
 Processing resources: OK
 Packing application: OK
 Uploading (95K): OK   
Push Status: OK
Staging Application 'calipso-app': OK                                           
Starting Application 'calipso-app': OK

As you can see from the output, the application was deployed successfully. If we go to its homepage we can see a welcome message from Calipso where we confirm that we are “awesome”!


Now we can follow the installation wizard steps. Thanks to auto-reconfiguration, we can use any value, including the default, for the database setup.

After the database is set up, we are ready to create a new article on our blog.

And we can see that the connection to the data service is functioning, as the new article is published to our blog:

To recap, we downloaded the application source, set up its dependencies, and deployed it to CloudFoundry.com using default connection parameters. The result is a working application. Let’s look now at the technical details on how this was accomplished.

Under the Hood

When your application is staged during the deployment process, Cloud Foundry makes two modifications:

  • Add a cf-autoconfig node module to the application
  • Preload the cf-autoconfig module while bootstrapping your application

The cf-autoconfig module uses the Node.js caching mechanism for module loading. Once a module is loaded, it is cached, and requiring the same module elsewhere returns the cached code. The cf-autoconfig module searches for popular modules that Node.js applications use for connecting to services. It loads them before the application code and redefines the functions that connect to a service. Each modification replaces the original connection parameters (host, port, credentials, etc.) with the equivalent parameters of a matching Cloud Foundry service bound to the application. With this arrangement in place, when application code subsequently loads the same module, attempts to connect to a service will yield a connection to the appropriate Cloud Foundry service. As an example, let’s see how it redefines the connect function of the MongoDB node module:

      if ("connect" in moduleData) {
        var oldConnect = moduleData.connect;
        var oldConnectProto = moduleData.connect.prototype;

        moduleData.connect = function () {
          var args = Array.prototype.slice.call(arguments);
          args[0] = props.url;
          return oldConnect.apply(this, args);
        };
        moduleData.connect.prototype = oldConnectProto;
      }

Other functions are redefined the same way. Take a look at the cf-autoconfig module’s source code on GitHub, and feel free to provide feedback or even a pull request.
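The effect of this wrapping can be tried in isolation. The sketch below applies the same pattern to a stand-in object instead of a real driver; fakeDriver, patchConnect and the URLs are invented for illustration and are not part of cf-autoconfig. In the real module the patched object is the driver's cached exports, which is why application code that requires the driver later sees the patched function.

```javascript
// Stand-in for a driver module's exports. A real driver would open a
// network connection; this one just reports the URL it was given.
var fakeDriver = {
  connect: function (url, callback) {
    callback(null, 'connected to ' + url);
  }
};

// Hypothetical connection properties of the bound service.
var props = { url: 'mongodb://user:pass@10.0.0.7:25001/db' };

// Wrap connect so that the first argument is always the bound service's URL.
function patchConnect(moduleData, props) {
  var oldConnect = moduleData.connect;
  moduleData.connect = function () {
    var args = Array.prototype.slice.call(arguments);
    args[0] = props.url;
    return oldConnect.apply(this, args);
  };
}

patchConnect(fakeDriver, props);

// "Application code" still passes its local development URL,
// but the wrapper swaps it for the bound service's URL.
var result;
fakeDriver.connect('mongodb://localhost:27017/dev', function (err, message) {
  result = message;
});

console.log(result); // connected to mongodb://user:pass@10.0.0.7:25001/db
```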

Supported Modules

The following is the list of supported modules:

According to search.npmjs.org, most Node.js applications and other modules are dependent on the modules listed above. By providing support for these popular modules, any other modules that use them to form the database connection layer will inherit the benefit of auto-reconfiguration.

Limitations

Auto-reconfiguration of services works only under the following conditions:

  • You are only using one service of a given type. For example, only one mysql or one redis service.
  • You are using a service node module from the list of supported modules above, or one that is based on a supported node module for service connections.
  • Your application does not use cf-runtime or cf-autoconfig node modules directly.
  • Your application is a typical Node.js application. (For a complex application you may want to consider opting-out of auto-reconfiguration and using the cf-runtime node module instead, which will be described in the next blog post in this series.)

Opting Out of Auto-Reconfiguration

Auto-reconfiguration can be turned off by providing a cloudfoundry.json file in the application base folder with the option “cfAutoconfig” set to false.

{ "cfAutoconfig" : false }

In addition, as mentioned above, auto-reconfiguration will not work if the application is using the cf-runtime node module.

Summary

Using auto-reconfiguration is a great way to quickly start deploying Node.js applications to Cloud Foundry. As your application grows and demands more precise control over its services, you may need to consider using the cf-runtime node module to get easy access to application properties and services. In the next blog post we are going to show you how to use the cf-runtime node module to simplify connections to Cloud Foundry services.

- Maria Shaldibina

The Cloud Foundry Team

Don’t have a Cloud Foundry account yet?  Sign up for free today


Building a Real-Time Activity Stream on Cloud Foundry with Node.js, Redis and MongoDB–Part II

In Part I of this series, we showed how to start from the node-express-boilerplate app for real-time messaging and integration with third parties and move towards building an Activity Streams Application via Cloud Foundry. The previous app only sent simple text messages between client and server, but an Activity Streams Application processes, aggregates and renders multiple types of activities. For this new version of the application, my requirements were to show the following:

  • User interactions with the app running on CloudFoundry.com
  • Custom activities created by users on the app’s landing page
  • User activities from GitHub such as creating repositories or posting commits

For this reason, I decided to use Activity Strea.ms which is a generic JSON format to describe social activity around the web. The main components of an activity as defined by the activitystrea.ms spec are: Actor, Verb and Object with an optional Target. An actor performs an action of type verb on an object in a target. For example:

John posted SCALA 2012 Recap on his blog John OSS
10 minutes ago

{
  "published": "2012-05-10T15:04:55Z",
  "actor": {
    "url": "http://example.org/john",
    "objectType" : "person",
    "id": "tag:example.org,2011:john",
    "image": {
      "url": "http://example.org/john/image",
      "width": 250,
      "height": 250
    },
    "displayName": "John"
  },
  "verb": "post", 
  "object" : {
    "objectType" : "blog-entry",
    "displayName" : "SCALA 2012 Recap",
    "url": "http://example.org/blog/2011/02/entry", 
    "id": "tag:example.org,2011:abc123/xyz"
  },
  "target" : {
    "url": "http://example.org/blog/", 
    "objectType": "blog",
    "id": "tag:example.org,2011:abc123", 
    "displayName": "John OSS"
  }
}

This generic vocabulary not only helps us transmit activities across apps, but it can also help us model our data store in a flexible fashion and help us with rendering the final HTML or textual output for the client.
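As a small illustration of the rendering side, the sketch below turns an activity into a sentence like the one shown above the JSON. The describeActivity function and the past-tense table are hypothetical and cover only a few verbs; they are not part of the spec or of the app.

```javascript
// Hypothetical verb-to-past-tense table; verbs not listed are used as-is.
var PAST_TENSE = { post: 'posted', share: 'shared', like: 'liked' };

// Build "<actor> <verb> <object> [on <target>]" from an activity.
function describeActivity(activity) {
  var verb = PAST_TENSE[activity.verb] || activity.verb;
  var text = activity.actor.displayName + ' ' + verb + ' ' +
             activity.object.displayName;
  if (activity.target) {
    text += ' on ' + activity.target.displayName;
  }
  return text;
}

var sentence = describeActivity({
  actor: { displayName: 'John' },
  verb: 'post',
  object: { displayName: 'SCALA 2012 Recap' },
  target: { displayName: 'John OSS' }
});

console.log(sentence); // John posted SCALA 2012 Recap on John OSS
```

Because every activity has the same actor, verb and object shape, one function like this can render activities from any source.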

I will talk in more detail about how this format helped me with rendering the UX in the next post, but let’s first start discussing the design for the backend and how we arrived at a data flow which includes MongoDB and Redis PubSub:

Modified Architecture

While the initial architecture worked well for a small scale, a real world app must be able to scale to meet the demands.

Persisting the Data

One of the key decisions I had to make was how to store these activities. From my previous experience on a team building a large scale Activity Streams application, I know that you have to optimize for views as the streams are typically placed in the most visited parts of websites, like home pages.

For activity streams, it is best to store all of the information needed to render an activity so that it can be fetched with a single simple query. Otherwise, given the variety of the activities, you will be joining many tables and will not be able to scale, especially if you are aggregating a variety of actions. Imagine, for example, wanting to render a stream with activities about open source contributions and bugs. In a relational database system you would have tables for:

  • users
  • projects
  • commits
  • bugs

And then you could query this data using an object-relational mapper such as mapper:

User.hasMany("bugs", Bug, "creatorId");
Bug.belongsTo("user", User, "creatorId");

Project.hasMany("bugs", Bug, "projectId");
Bug.belongsTo("project", Project, "projectId");

User.hasMany("commits", Commit, "creatorId");
Commit.belongsTo("user", User, "creatorId");

Project.hasMany("commits", Commit, "projectId");
Commit.belongsTo("project", Project, "projectId");

Bug
  .select('createdAt', 'creatorId', 'name', 'url', 'description')
  .page(0, 10)
  .order('createdAt DESC')
  .all(function(err, bugs) {
   Commit
    .select('createdAt', 'creatorId', 'name', 'url', 'description')
    .page(0, 10)
    .order('createdAt DESC')
    .all(function(err, commits) {
      // coalesce the two result sets into the 10 most recent items
      var ordered = coalesce(bugs, commits, 10);
      var uniqueUsers = findUniqueUsers(ordered);
      var uniqueProjects = findUniqueProjects(ordered);
      User
        .select('id', 'name', 'url', 'avatar_url')
        .where({ 'id.in': uniqueUsers })
        .all(function(err, users) {
        Project
          .select('id', 'name', 'url', 'avatar_url')
          .where({ 'id.in': uniqueProjects })
          .all(function(err, projects) {
            // finally, now correlate all the data back together
            var activities = [];
            //...
    
        });
      });
    });
  });

As you can see, a classic RDBMS design with very normalized data requires multiple lookups and roundtrips to the server or joins. Even if we had an activities table we would have to do a separate lookup for bugs, commits, users and projects.

With a document database, instead of having multiple lookups you have one or two lookups at most, particularly if you are querying by a field which is indexed. Therefore, a document store is a better fit for my use case than a relational database.

The node-express-boilerplate app did not include a persistence layer. I decided to use MongoDB to store each activity as a document because of the flexibility in schema. I knew I was going to be working with third party data, and I wanted to iterate quickly on the external data we incorporated. The above JSON activity can be stored in its entirety as an activities document and extended. You may notice that this is a very denormalized mechanism for storing data and could cause issues if we needed to update the objects. Luckily, since activities are actions in the past this is not as big of an issue.

Assuming we have an activities collection where the activity document has nested actors and objects you can write code like:

// https://github.com/ciberch/activity-streams-mongoose

Activity.find().sort('published', 'descending').limit(10).run(
    function (err, docs) {
        var activities = [];
        if (!err && docs) {
            activities = docs;
            res.render('index', {activities: activities});
        }
    });

One of the greatest aids in this project was MongooseJS, which is an Object Document Mapper for Node.js. Mongoose exposes wrapper functions to use MongoDB with async callbacks and easily model schema as well as validators. With Mongoose I was able to define the schema in a few lines of code.

Scaling the real-time syndication

One of the issues with the boilerplate code is that socket.io cannot syndicate messages to other recipients that are connected to a different web server since it stores all the messages in memory. The most logical thing to do was to put in place a proper queueing system that all web servers could connect to. Redis PubSub was my first choice as it is extremely easy to use. As soon as I successfully saved an activity to MongoDB, I streamed it into the proper channel for all subscribers to receive. This was extremely easy to use since we are working with JSON everywhere:


var _ = require("underscore");
var redis = require("redis");
var publisher = redis.createClient(options.redis.port, options.redis.host);
if (options.redis.pass) {
  publisher.auth(options.redis.pass);
}

function publish(streamName, activity) {
  // Tag the activity with the stream before saving, so the stored
  // document and the published message agree.
  if (!_.isArray(activity.streams)) {
    activity.streams = [];
  }
  if (!_.include(activity.streams, streamName)) {
    activity.streams.push(streamName);
  }
  activity.save(function(err) {
    if (!err && streamName && publisher) {
      // Send to Redis PubSub
      publisher.publish(streamName, JSON.stringify(activity));
    }
  });
}

This methodology is particularly useful when you have predefined aggregation methods, such as tags or streams.
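The fan-out by stream name can be pictured with a purely in-process stand-in. The sketch below uses Node's EventEmitter in place of Redis, so it only illustrates the channel semantics (every subscriber of a stream receives the JSON message, others do not), not the cross-process delivery that Redis provides. All names in it are made up for this example.

```javascript
var EventEmitter = require('events').EventEmitter;

// In-process stand-in for the Redis channel layer, for illustration only.
var bus = new EventEmitter();

// Publish an activity as JSON on the channel named after the stream.
function publish(streamName, activity) {
  bus.emit(streamName, JSON.stringify(activity));
}

// Each simulated web server subscribes to one stream and records what it
// would forward to its own connected clients.
function createSubscriber(name, streamName) {
  var subscriber = { name: name, delivered: [] };
  bus.on(streamName, function (json) {
    subscriber.delivered.push(JSON.parse(json));
  });
  return subscriber;
}

var web1 = createSubscriber('web-1', 'firehose');
var web2 = createSubscriber('web-2', 'firehose');
var web3 = createSubscriber('web-3', 'cloudfoundry');

publish('firehose', { verb: 'post', object: { title: 'Hello' } });

console.log(web1.delivered.length, web2.delivered.length, web3.delivered.length); // 1 1 0
```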

Packaging as a Module

One of the great things about the Node.js community is that it’s very easy to contribute to the Open Source Community thanks to NPM and its Registry. I could not find any lightweight activity stream libraries, so I went ahead and submitted the persistence logic as a new module: activity-streams-mongoose.

Once you have a proper package.json, you can run this command to publish it:

npm publish

Once you have the module published, you can follow the steps outlined in this pull request, https://github.com/ciberch/node-express-boilerplate/pull/1/, to get your app upgraded to persist activities. You can easily run this app on CloudFoundry.com by creating and binding Redis and MongoDB instances as you deploy your application. Furthermore, scaling the app can be done simply with the ‘vmc instances’ command.

Conclusion

It is important to take time and select the proper database type for the application you are building. While RDBMS systems are the most popular, they are not always the best for the job. In this scenario, using a document store, namely MongoDB, helped us increase scalability and write simpler code.

Another step in taking your app to the cloud is making it stateless so that if instances are added or deleted, users don’t lose their sessions or messages. For this app, using Redis PubSub helped us solve the challenge of communicating across app instances. Finally, contributing to open source initiatives can not only save you time, but can also get more eyeballs on your code and help you be thorough in your testing. In this first module, I used nodeunit and was able to catch bugs during tests and from user reports. In the next blog post, I will do a final walk through of the app with a deep dive into client-side components.

Monica Wilkinson, Cloud Foundry Team

Sign up for Cloud Foundry today to build an app in Node.js with MongoDB and Redis


Building a Real Time Activity Stream on Cloud Foundry with Node.js, Redis and MongoDB – Part I

Cloud Foundry provides developers with several choices of frameworks, application infrastructure services and deployment clouds. This approach to providing a platform as a service enables the fast creation of useful cloud applications that take advantage of polyglot programming and multiple data services. As an example of how this enables us to be more productive, I was able to use Node.js, Redis, and MongoDB to create an Activity Streams application in a short time, despite the fact that I mainly develop Web Applications in Ruby. Based on the developer interest after a demo of this application at NodeSummit 2012 and MongoSF 2012, I was inspired to write a three-part blog series that fully details the creation, deployment, and scaling of this application on CloudFoundry.com.

The application is based on an interesting project I came across recently called “node-express-boilerplate” developed by @mape which is a full-featured but simple boilerplate Node.js app. Given my previous experience working on Social Networking software and Open Web Standards, I decided to morph this app into an Activity Streams sample application.

Initial Boilerplate Architecture without MongoDB or Redis PubSub

“Node-express-boilerplate” is a starter application written in Node.js which showcases how users can log into a website using Facebook, Twitter or GitHub, display basic profile info from those sites, and have real-time communication between server and clients. In this first blog post, we are going to review the components of the boilerplate application so that developers can learn how to use them for other applications and deploy them locally as well as on Cloud Foundry.

A tour of the Boilerplate Components

Before we move to the setup, here are some code highlights explaining the various parts of the application.

Express MVC Framework

Working with the Express framework is very easy to do because its creator, @tjholowaychuk, has made it simple yet flexible to use while providing good documentation and screencasts.

You can use any web templating engine. For my project, I switched from using Embedded Javascript (EJS) to jade which is even terser than HTML Abstraction Markup Language (Haml).

// Settings
app.configure(function() {
	app.set('view engine', 'jade');
	app.set('views', __dirname+'/views');
});

Route Middleware

This middleware allows us to chain functions to pre-process the request. In this example I am showing how to invoke loadUser to read the current user, getDistinctStreams to get all the different topics and getMetaData to get the list of verbs and object types.

app.get('/streams/:streamName', loadUser, getDistinctStreams, getMetaData, function(req, res) {

    asmsDB.getActivityStream(req.params.streamName, 20, function (err, docs) {
        var activities = [];
        if (!err && docs) {
            activities = docs;
        }
        req.streams[req.params.streamName].items = activities;
        res.render('index', {
            currentUser: req.user,
            providerFavicon: req.providerFavicon,
            streams : req.streams,
            desiredStream : req.session.desiredStream,
            objectTypes : req.objectTypes,
            verbs: req.verbs
        });
    });

});
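
To make the chaining concrete, here is a minimal sketch of how a chain of route middleware can run. It is a simplified stand-in written for illustration, not Express's actual implementation: runChain is a hypothetical helper, and the bodies of loadUser and getMetaData are stubs (the real ones read from the session and the database).

```javascript
// Simplified stand-in for route middleware chaining (illustration only).
// Each middleware receives (req, res, next) and calls next() to hand over control.
function runChain(middlewares, req, res, done) {
  var index = 0;
  function next(err) {
    if (err) { return done(err); }       // stop the chain on the first error
    var fn = middlewares[index++];
    if (!fn) { return done(null); }      // every middleware has run
    fn(req, res, next);
  }
  next();
}

// Hypothetical stubs standing in for the real pre-processing functions.
function loadUser(req, res, next) {
  req.user = { name: 'guest' };
  next();
}

function getMetaData(req, res, next) {
  req.verbs = ['post', 'share'];
  req.objectTypes = ['note', 'photo'];
  next();
}

var req = {};
runChain([loadUser, getMetaData], req, {}, function (err) {
  if (err) { throw err; }
  // By the time the final handler runs, req carries everything it needs.
  console.log(req.user.name, req.verbs.length, req.objectTypes.length);
});
```

Because each function only decorates req and then calls next(), the same pre-processing steps can be reused across routes in any combination.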

Session Management

In the boilerplate application, sessions are stored in Redis, allowing us to scale to more than one instance. Not only does this application properly manage user sessions, it also supports single sign-on with GitHub, Facebook and Twitter via the everyauth module.

Packaging Assets

In real world applications today, users expect immediate feedback, and this cannot be done if your client is executing multiple HTTP requests to render the app. This boilerplate sample uses @mape’s module connect-assetmanager, which minifies and bundles CSS and JavaScript assets so the responses are very fast. This module is very flexible and allows pre and post manipulation of assets.

Here is an example which packages the JS and the CSS files:

var assetManager = require('connect-assetmanager');
var assetHandler = require('connect-assetmanager-handlers');
...
var assetsSettings = {
	'js': {
		'route': /\/static\/js\/[a-z0-9]+\/.*\.js/
		, 'path': './public/js/'
		, 'dataType': 'javascript'
		, 'files': [
			'http://code.jquery.com/jquery-latest.js'
			, siteConf.uri+'/socket.io/socket.io.js' // special case since the socket.io module serves its own js
			, 'jquery.client.js'
		]
		, 'debug': true
		, 'postManipulate': {
			'^': [
				assetHandler.uglifyJsOptimize
				, function insertSocketIoPort(file, path, index, isLast, callback) {
					callback(file.replace(/.#socketIoPort#./, siteConf.port));
				}
			]
		}
	}
	, 'css': {
		'route': /\/static\/css\/[a-z0-9]+\/.*\.css/
		, 'path': './public/css/'
		, 'dataType': 'css'
		, 'files': [
			'reset.css'
			, 'client.css'
		]
		, 'debug': true
		, 'postManipulate': {
			'^': [
				assetHandler.fixVendorPrefixes
				, assetHandler.fixGradients
				, assetHandler.replaceImageRefToBase64(__dirname+'/public')
				, assetHandler.yuiCssOptimize
			]
		}
	}
};

var assetsMiddleware = assetManager(assetsSettings);

Client-Server Real Time Communication

Socket.io is used to send messages back and forth between the user-agent and the server. This saves us from having to do entire page refreshes to show new content.

In the example below you can see how we subscribe to different events and change the page content accordingly.

var socketIoClient = io.connect(null, {
	'port': '#socketIoPort#'
	, 'rememberTransport': true
	, 'transports': ['xhr-polling']
});

// Mark the page as online once the socket connects
socketIoClient.on('connect', function () {
	$$('#connected').addClass('on').find('strong').text('Online');
});

var image = $.trim($('#image').val());
var service = $.trim($('#service').val());

var $ul = $('#main_stream');

// Render each incoming activity at the top of the stream
socketIoClient.on('message', function(json) {
	var doc = JSON.parse(json);
	if (doc) {
		var $li = $(jade.templates["activity"]({activities: [doc]}));
		$ul.prepend($li);
	}
	// Keep only the 20 most recent activities on the page
	if ($ul.children().length > 20) {
		$ul.children().last().remove();
	}
});
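
The message handler above keeps the stream short by dropping the oldest entry once more than 20 are shown. As a small sketch of that idea outside the DOM, the capping step can be written as a pure function; prependCapped is a hypothetical name used only for this illustration.

```javascript
// Prepend an item and keep at most `max` entries, dropping the oldest from the end.
function prependCapped(items, item, max) {
  return [item].concat(items).slice(0, max);
}

// Simulate 25 incoming activities with a cap of 20.
var stream = [];
for (var i = 1; i <= 25; i++) {
  stream = prependCapped(stream, 'activity-' + i, 20);
}

// The newest activity is first and the five oldest have been dropped.
console.log(stream.length, stream[0], stream[stream.length - 1]);
```

Keeping the rule in one small function makes it easy to reuse the same limit on the server if the stream is ever rendered there.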

Here are the steps we performed to run this application as-is on Cloud Foundry

Setup

  • Get a CloudFoundry.com account if you don’t have one yet.
  • Install vmc command line tool to deploy to Cloud Foundry.
  • Get Node.js running locally on your machine.
    • Install version 0.6.8 or later. NPM is the package manager which will be included.
  • Get Redis running locally.
  • Create Apps on Facebook, Twitter and GitHub for prod and local environments and note the keys.

Configure

Clone @mape’s repo (or fork it and clone your fork):

$ git clone git://github.com/mape/node-express-boilerplate.git
$ cd node-express-boilerplate

Edit package.json to include the cloudfoundry module:

{
  "name" : "node-express-boilerplate",
  "description" : "A boilerplate used to quickly get projects going.",
  "version" : "0.0.2",
  "author" : "Mathias Pettersson ",
  "engines" : ["node"],
  "repository" : { "type":"git", "url":"http://github.com/mape/node-express-boilerplate" },
  "dependencies" : {
    "cloudfoundry": ">=0.1.0",
    "connect" : ">=1.6.0",
    "connect-assetmanager" : ">=0.0.21",
    "connect-assetmanager-handlers" : ">=0.0.17",
    "ejs" : ">=0.4.3",
    "express" : ">=2.4.3",
    "socket.io" : ">=0.7.8",
    "connect-redis" : ">=1.0.7",
    "connect-notifo" : ">=0.0.1",
    "airbrake" : ">=0.2.0",
    "everyauth" : ">=0.2.18"
  }
}

Install the dependencies locally.

$ npm install

Copy siteConfig.sample.js to siteConfig.js

$ cp siteConfig.sample.js siteConfig.js

Edit siteConfig.js to use environment variables and the cloudfoundry module. Changes are detailed here.

Update server.js to use the new siteConfig.js settings as seen here.

Set environment variables for all services. Example in bash:

$ export twitter_consumer_key=2SXwj3HcMHsdsdsL4uuUBdjShw
$ export twitter_consumer_secret=UFamzEOAEhLUwewewDwwEoCI72hN0fl8
$ export facebook_app_id=5925695687264066
$ export facebook_app_secret=cce6f5edefa89f4686e5e036e3ea
$ export airbrake_api_key=63340934f6b376a001eacfc660d06205
$ export github_client_id='92df9d93813ab234e1'
$ export github_client_secret='fa64d10d3a02eee08d00cda3c2965caea2a4ce22'
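
The edited siteConfig.js is linked rather than shown here, so the following is only a rough sketch of the idea, not the actual file. It assumes the environment variable names exported above; buildSiteConf, the shape of the returned object, the use of VCAP_APP_PORT for the port, and the fallback port 3000 are assumptions made for illustration.

```javascript
// Hypothetical helper: build the site configuration from environment variables
// instead of hard-coding keys in the source tree.
function buildSiteConf(env) {
  return {
    // Assumption: the platform supplies the port in VCAP_APP_PORT; 3000 is a local fallback.
    port: parseInt(env.VCAP_APP_PORT || '3000', 10),
    airbrakeApiKey: env.airbrake_api_key,
    external: {
      twitter: {
        consumerKey: env.twitter_consumer_key,
        consumerSecret: env.twitter_consumer_secret
      },
      facebook: {
        appId: env.facebook_app_id,
        appSecret: env.facebook_app_secret
      },
      github: {
        appId: env.github_client_id,
        appSecret: env.github_client_secret
      }
    }
  };
}

// List the variables that are still unset, so a missing key fails early.
function missingKeys(env, names) {
  return names.filter(function (name) { return !env[name]; });
}

var siteConf = buildSiteConf(process.env);
console.log('Listening port would be ' + siteConf.port);
```

With this approach the same code runs locally and in the cloud; only the environment differs.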

Run locally

$ node server.js

Run on CloudFoundry.com

Install vmc if you have not already done so

$ sudo gem install vmc --pre

Specify you want Redis “redis-asms” bound to your app when you deploy:

$ vmc push --runtime=node06 --nostart

  Would you like to deploy from the current directory? [Yn]:
  Application Name: node-express-start
  Detected a Node.js Application, is this correct? [Yn]:
  Application Deployed URL [node-express-start.cloudfoundry.com]:
  Memory reservation (128M, 256M, 512M, 1G, 2G) [64M]: 128M
  How many instances? [1]:
  Bind existing services to 'node-express-start'? [yN]:
  Create services to bind to 'node-express-start'? [yN]: Y
    1: mongodb
    2: mysql
    3: postgresql
    4: rabbitmq
    5: redis
  What kind of service?: 5
  Specify the name of the service [redis-9bea7]: redis-asms
  Create another? [yN]: N
  Would you like to save this configuration? [yN]: Y
  Manifest written to manifest.yml.
  Creating Application: OK
  Creating Service [redis-asms]: OK
  Binding Service [redis-asms]: OK
  Uploading Application:
  Checking for available resources: OK
  Processing resources: OK
  Packing application: OK
  Uploading (305K): OK
Push Status: OK

Note that here I responded Yes to saving the configuration which will create a file called manifest.yml. A manifest.yml helps you quickly push and update apps. You can read more about it here.

Now you can run these commands to add the keys from the services.

$ export APP_NAME=your_app_name
$ vmc env-add $APP_NAME airbrake_api_key=your_key
$ vmc env-add $APP_NAME github_client_id=github_id
$ vmc env-add $APP_NAME github_client_secret=github_secret
$ vmc env-add $APP_NAME facebook_app_id=fb_id
$ vmc env-add $APP_NAME facebook_app_secret=fb_secret
$ vmc env-add $APP_NAME NODE_ENV=production
$ vmc env-add $APP_NAME twitter_consumer_key=twitter_key
$ vmc env-add $APP_NAME twitter_consumer_secret=twitter_secret

To finish, run:

$ vmc start

And that’s it! You are up and running with the boilerplate app as seen here. Please note that the app may not work to spec on IE, but works on Firefox, Safari and Chrome.

Observations so far

node-express-boilerplate is a great starting point on which to build an activity stream engine given that it addresses:

  • Robust real-time messaging between browser and server. Socket.io adapts to the protocols supported by the server and client
  • Performance via Asset Bundling and Minification
  • Provides SSO Support to major Social Networks
  • Handles scalable session management via Redis
  • Built on a great MVC framework

Also, it was good to see that @mape had abstracted the infrastructure details via the creation of a siteConfig.js. We enhanced this even further by using environment variables.

As you saw in the walkthrough, all this was possible thanks to the open source community and a robust platform as a service like CloudFoundry, which had everything I needed. I was able to use @igo’s “cloudfoundry” module to assist in parsing environment details in my app and thus made siteConfig.js even more straightforward.

In the next blog post, I will cover how to start the modification of this app into an Activity Stream engine. The tutorial will include how to create a Node.js module like activity-streams-mongoose and how to manage the persistence of the Activity Streams data on MongoDB as well as real time syndication across multiple app instances with Redis PubSub.

Sign up for Cloud Foundry today, and start building your own cool app!

-Monica Wilkinson, Cloud Foundry Team


Running Resque Workers on Cloud Foundry

We introduced Cloud Foundry’s new “standalone” applications feature in the first post in this four-part series. In this second installment, we will look at the most common use of a standalone application–the worker process. Workers can be used for all kinds of asynchronous background jobs, such as updating search indexes, emailing all users with a password reset approaching, performing a database backup to persistent storage, or uploading new customer data from external storage. In this post, we will walk through an example of deploying workers to Cloud Foundry using Resque.

Resque Workers on Cloud Foundry

Let’s start by cloning the Resque Demo Example.

mycomp:dev$ git clone git://github.com/defunkt/resque.git
mycomp:dev$ cd resque/examples/demo

Let’s add a Gemfile to the example to ensure that Cloud Foundry can find all required gems.

source "http://rubygems.org"
gem 'sinatra'
gem 'resque'
gem 'rake'
gem 'json'

We’ll run “bundle install” and “bundle package” to package the gems in vendor/cache, and we’re ready to deploy.  The resque server is a Rack app, so we’ll deploy it to Cloud Foundry as such.

mycomp:demo$ vmc push resque-server
Would you like to deploy from the current directory? [Yn]:
Detected a Rack Application, is this correct? [Yn]:
Application Deployed URL [resque-server.cloudfoundry.com]: 
Memory reservation (128M, 256M, 512M, 1G, 2G) [128M]:
How many instances? [1]:
Create services to bind to 'resque-server'? [yN]: y
1: mongodb
2: mysql
3: postgresql
4: rabbitmq
5: redis
What kind of service?: 5
Specify the name of the service [redis-2a462]: redis-work-queue
Create another? [yN]:
Would you like to save this configuration? [yN]: y
Manifest written to manifest.yml.
Creating Application: OK
Creating Service [redis-work-queue]: OK
Binding Service [redis-work-queue]: OK
Uploading Application:
Checking for available resources: OK
Processing resources: OK
Packing application: OK
Uploading (21K): OK
Push Status: OK
Staging Application 'resque-server': OK
Starting Application 'resque-server': OK

Let’s have a look at the resque-server app and add some jobs to the queue.

Now that we have some jobs, it’s time to deploy some workers! First, we need to rename the generated manifest.yml for the Rack app, so it won’t automatically be used in the push. We can use it again later by doing a “vmc push --manifest server-manifest.yml”. Now, let’s push the app again as a standalone worker app.

mycomp:demo$ mv manifest.yml server-manifest.yml
mycomp:demo$ vmc push resque-worker
Would you like to deploy from the current directory? [Yn]:
Detected a Rack Application, is this correct? [Yn]: n
1: Rails
2: Spring
3: Grails
4: Lift
5: JavaWeb
6: Standalone
7: Sinatra
8: Node
9: Rack
Select Application Type: 6
Selected Standalone Application
1: java
2: node
3: node06
4: ruby18
5: ruby19
Select Runtime [ruby18]:
Selected ruby18
Start Command: bundle exec rake VERBOSE=true QUEUE=default resque:work
Application Deployed URL [None]:
Memory reservation (128M, 256M, 512M, 1G, 2G) [128M]:
How many instances? [1]:
Bind existing services to 'resque-worker'? [yN]: y
1: redis-work-queue
Which one?: 1
Bind another? [yN]:
Create services to bind to 'resque-worker'? [yN]:
Would you like to save this configuration? [yN]: y
Manifest written to manifest.yml.
Creating Application: OK
Binding Service [redis-work-queue]: OK
Uploading Application:
Checking for available resources: OK
Processing resources: OK
Packing application: OK
Uploading (0K): OK
Push Status: OK
Staging Application 'resque-worker': OK
Starting Application 'resque-worker': OK

So we’ve pushed resque-worker as a standalone app with a Ruby runtime. We gave the command “bundle exec rake VERBOSE=true QUEUE=default resque:work” to start the worker. It is recommended to use bundle exec to ensure that all required gems are available. Since resque-worker does not have a web front-end, we selected “None” for URL.

Lastly, we bound the app to the same Redis service used by resque-server. If you’ve perused the resque demo example, you may have noticed that it is set up to connect to a local Redis service. However, we didn’t change the code before we pushed it. How will the app connect to the provisioned Redis service? Since we used the Ruby runtime provided by Cloud Foundry, the app will benefit from the new Ruby auto-reconfiguration support. Cloud Foundry will automatically replace the local Redis connection with a connection to the Redis service we bound to the application!

Let’s check the logs and see if the worker completed that job:

mycomp:demo$ vmc logs resque-worker
====> /logs/stdout.log <====

Loading Redis auto-reconfiguration.
*** Starting worker ubuntu:10245:default
Auto-reconfiguring Redis.
*** got: (Job{default} | Demo::Job | [{}])
Processed a job!
*** done: (Job{default} | Demo::Job | [{}])
*** got: (Job{default} | Demo::Job | [{}])
Processed a job!
*** done: (Job{default} | Demo::Job | [{}])

And we can verify that the worker has registered through the web interface.

We can even scale the workers up.

mycomp:demo$ vmc instances resque-worker +2
Scaling Application instances up to 3: OK

And now the web interface shows three workers.


And there you have it! We can now deploy Resque workers as standalone apps on Cloud Foundry. Clone the Cloud Foundry resque-sample and try it out for yourself!

Conclusion

Cloud Foundry now provides improved Resque support through standalone applications, as well as support for other Ruby worker libraries or apps. If you can package all the bits and provide a start command, you can run it on Cloud Foundry! In the next installment in this series, we will explore another example of workers in action using Spring integration. Stay tuned!

- Jennifer Hickey
The Cloud Foundry Team

Don’t have a Cloud Foundry account yet?  Sign up for free today


Cloud Foundry Open PaaS Deep Dive

by Ezra Zygmuntowicz (aka @ezmobius)

You are probably wondering how Cloud Foundry actually works. Hopefully these details will clear things up for you about how Cloud Foundry the OSS project works, why it works, and how you can use it. Cloud Foundry is on github here: https://github.com/cloudfoundry/vcap. The VCAP repo is the meaty part, or what we call the “kernel” of Cloud Foundry, as it is the distributed system that contains all the functionality of the PaaS. We have released a VCAP setup script that will help you get an Ubuntu 10.04 system running an instance of Cloud Foundry, including all the components of VCAP as well as a few services (mysql, redis, mongodb), so you can play along at home.

We want to build a community around Cloud Foundry, as that is what matters most for now as far as the open source project is concerned. We imagine a whole ecosystem of “userland” tools that people can create and plug into our Cloud Foundry kernel to add value and customize for any particular situation. This project is so large in scope that we had to cut a release and get community involvement at some point, and we feel that the kernel is in great shape for everyone to dig in and start helping us shape the future of the “linux kernel for the cloud” ;)

So how do you approach building a PaaS (Platform as a Service) that is capable of running multiple languages under a common deployment and scalability idiom, that is also capable of running on any cloud or any hardware that you can run Ubuntu on? VCAP was architected by my true rock star coworker Derek Collison (this guy is the man, for realz!). The design is very nice and adheres to a main core concept of “the closer to the center of the system the dumber the code should be”. Distributed systems are fundamentally hard problems. So each component that cooperates to form the entire system should be as simple as it possibly can be and still do its job properly.

VCAP is broken down into 5 main components plus a message bus: the Cloud Controller, the Health Manager, the Routers, the DEAs (Droplet Execution Agents) and a set of Services. Let’s take a closer look at each component in turn and see how they fit together to form a full platform for the cloud. The message bus is NATS, a very simple pub/sub messaging system written by Derek Collison (this dude knows messaging, trust me ;) of TIBCO fame. NATS is the system that all the internal components communicate on. While NATS is great for this sort of system communication, you would never use NATS for application level messaging. VMware’s own RabbitMQ is awesome for app level messaging and we plan to make that available to Cloud Foundry users in the near future.

It should be stated here that every component in this system is horizontally scalable and self healing, meaning you can add as many copies of each component as needed in order to support the load of your cloud, and in any order. Since everything is decoupled, it doesn’t even really matter where each component lives, things could be spread across the world for all it cares. I think this is pretty cool ;)

Cloud Controller

The Cloud Controller is the main ‘brain’ of the system. This is an Async Rails3 app that uses ruby 1.9.2 and fibers plus eventmachine to be fully async, pretty cutting edge stuff. You can find the Cloud Controller here: https://github.com/cloudfoundry/vcap/tree/master/cloud_controller . This component exposes the main REST interface that the CLI tool “vmc” talks to as well as the STS plugin for eclipse. If you were so inclined you could write your own client that talks to the REST endpoints exposed by the Cloud Controller to talk to VCAP in whatever way you like. But this should not be necessary as the “vmc” CLI has been written with scriptability in mind. It will return proper exit codes as well as JSON if you so desire so you can fully script it with bash or Ruby, Python, etc.

We made a decision not to tie VCAP to git. Even though we love git, we need to support any source code control system, yet we want the simplicity of a git push style deployment, hence the vmc push. We also want differential deploys: we want to push only diffs when you update your app, not your entire app tree every time you deploy. Feeling light and fast is important to us. Our goal is to rival local development.

So we designed a system where we can get the best of both worlds. You as a user can use any source control system you want. When you do a vmc push or vmc update, we examine your app’s directory tree and send a “skeleton fingerprint” to the cloud controller. This is basically a fingerprint of each file in your app’s tree and the shape of your directory tree. The cloud controller keeps every object it ever sees in a shared pool, accessible via its fingerprint plus its size. Then it returns to the client a manifest of what files it already has and what files it needs your client to send to the cloud in order to have all of your app. It is a sort of ‘dedupe’ for app code as well as framework and rubygem code and other dependency code. Then your client only sends the objects that the cloud requires in order to create a full “Droplet” of your application (a droplet is a tarball of all your app code plus its dependencies, all wrapped up with a start and stop button).
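
As a rough sketch of this dedupe exchange (not the actual vmc or Cloud Controller code), the snippet below fingerprints each file and asks which entries the shared pool is missing. The use of SHA-1 as the fingerprint, the function names, and the in-memory Set standing in for the pool are all assumptions made for illustration.

```javascript
var crypto = require('crypto');

// Fingerprint file contents: a digest plus the size in bytes.
// Assumption: SHA-1 is used here purely as an example hash.
function fingerprint(contents) {
  var buf = Buffer.from(contents);
  return {
    sha1: crypto.createHash('sha1').update(buf).digest('hex'),
    size: buf.length
  };
}

// Client side: describe every file in the app tree by path, fingerprint and size.
function buildManifest(files) {
  return Object.keys(files).map(function (path) {
    var fp = fingerprint(files[path]);
    return { path: path, sha1: fp.sha1, size: fp.size };
  });
}

// Server side: report which entries are not yet in the shared pool.
function missingEntries(manifest, pool) {
  return manifest.filter(function (entry) {
    return !pool.has(entry.sha1 + ':' + entry.size);
  });
}

// Hypothetical app tree; the pool has already seen app.js from an earlier push.
var files = { 'app.js': 'console.log(1)', 'lib/util.js': 'module.exports = {}' };
var known = fingerprint(files['app.js']);
var pool = new Set([known.sha1 + ':' + known.size]);

var manifest = buildManifest(files);
var missing = missingEntries(manifest, pool);
console.log(missing.map(function (e) { return e.path; })); // only these need uploading
```

Because the pool is keyed by content rather than by path or app, an unchanged framework file uploaded by one app never has to be sent again by another.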

Once the Cloud Controller has all the ‘bits’ it needs to fully assemble your app, it pushes the app into the “staging pipeline”. Staging is what we call the process that assembles your app into a droplet by getting all the full objects that comprise your application plus all of its dependencies, rewrites its config files in order to point to the proper services that you have bound to your application, and then creates a tarball with some wrapper scripts called start and stop.

DEA

The Droplet Execution Agent can be found here: https://github.com/cloudfoundry/vcap/tree/master/dea . This is an agent that is run on each node in the grid that actually runs the applications. So in any particular cloud build of Cloud Foundry, you will have more DEA nodes than any other type of node in a typical setup. Each DEA can be configured to advertise a different capacity and different runtime set, so you do not need all your DEA nodes to be the same size or be able to run the same applications. So continuing on from our Cloud Controller story, the CC has asked for help running a droplet by broadcasting on the bus that it has a droplet that needs to be run. This droplet has some metadata attached to it, like what runtime it needs as well as how much RAM it needs. Runtimes are the base component needed to run your app, like ruby-1.8.7 or ruby-1.9.2, or java6, or node. When a DEA gets one of these messages he checks to make sure he can fulfill the request, and if he can he responds back to the Cloud Controller that he is willing to help.
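
The check a DEA makes before volunteering can be sketched as follows. This is an illustration of the idea only: the object shapes, field names and the canRun and volunteers helpers are hypothetical, not taken from the DEA source.

```javascript
// A DEA can help if it advertises the requested runtime and has enough free memory.
function canRun(dea, droplet) {
  return dea.runtimes.indexOf(droplet.runtime) !== -1 &&
         dea.freeMemoryMb >= droplet.memoryMb;
}

// Every DEA that hears the broadcast decides for itself whether to respond.
function volunteers(deas, droplet) {
  return deas
    .filter(function (dea) { return canRun(dea, droplet); })
    .map(function (dea) { return dea.id; });
}

// Hypothetical grid with differently sized nodes and different runtime sets.
var deas = [
  { id: 'dea-1', runtimes: ['ruby-1.8.7', 'ruby-1.9.2'], freeMemoryMb: 512 },
  { id: 'dea-2', runtimes: ['node', 'java6'], freeMemoryMb: 64 },
  { id: 'dea-3', runtimes: ['node'], freeMemoryMb: 1024 }
];

var droplet = { runtime: 'node', memoryMb: 128 };
console.log(volunteers(deas, droplet));
```

Here dea-1 lacks the runtime and dea-2 lacks the memory, so only dea-3 would answer the Cloud Controller.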

The DEA does not necessarily care what language an app is written in. All it sees are droplets. A droplet is a simple wrapper around an application that takes one input, the port number to serve HTTP requests on. And it also has 2 buttons, start and stop. So the DEA treats droplets as black boxes: when it receives a new droplet to run, all it does is tell it what port to bind to and run the start script. A droplet again is just a tarball of your application, wrapped up in a start/stop script and with all your config files like database.yml rewritten in order to bind to the proper database. Now we can’t rewrite configs for every type of service, so for some services like Redis or Mongodb you will need to grab your configuration info from the environment variable ENV['VCAP_SERVICES'].

In fact there is a bunch of juicy info in the ENV of your application container. You can see it by creating a directory on your laptop and making a file in it called envvars.rb like this:

$ mkdir env && cd env
$ cat <<EOF > envvars.rb
require 'sinatra'
require 'pp'
get '/' do
  "#{ENV.pretty_inspect}"
end
EOF
$ vmc push …

That will make a simple app that will show you what is available in your ENVIRONMENT so that you can see what to use to configure your application. If you visit this new app you will see something like this: ENV output.
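
For services whose config cannot be rewritten automatically, the app has to read VCAP_SERVICES itself. Here is a minimal Node.js sketch of that lookup. The JSON layout used in the sample (service labels mapping to arrays of instances, each with a name and a credentials object) and the label 'redis-2.2' are assumptions for illustration, and the credential values are invented; check the output of the app above for the real shape in your environment.

```javascript
// Find the credentials of a bound service by its name in a VCAP_SERVICES JSON string.
function findCredentials(vcapServicesJson, serviceName) {
  var services = JSON.parse(vcapServicesJson || '{}');
  var labels = Object.keys(services);
  for (var i = 0; i < labels.length; i++) {
    var instances = services[labels[i]];
    for (var j = 0; j < instances.length; j++) {
      if (instances[j].name === serviceName) {
        return instances[j].credentials;
      }
    }
  }
  return null; // no service with that name is bound
}

// Sample payload with invented values; in a deployed app you would pass
// process.env.VCAP_SERVICES instead.
var sample = JSON.stringify({
  'redis-2.2': [
    {
      name: 'redis-asms',
      credentials: { hostname: '10.0.0.5', port: 5001, password: 'secret' }
    }
  ]
});

var creds = findCredentials(sample, 'redis-asms');
console.log(creds.hostname + ':' + creds.port);
```

Looking the service up by name rather than by position keeps the code working when more than one service is bound to the app.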

So the DEA’s job is almost done. Once it tells the droplet what port to listen on for HTTP requests and runs its start script, and the app properly binds the correct port, it will broadcast on the bus the location of the new application so the Routers can know about it. If the app did not start successfully it will return log messages to the vmc client that tried to push this app, telling the user why their app didn’t start (hopefully). This leads us right into what the Router has to do in the system, so we will hand it over to the Router (applause).

Router

The Router is another eventmachine daemon that does what you would think it does: it routes requests. In a larger production setup there is a pool of Routers load balanced behind Nginx (or some other load balancer or http cache/balancer). These routers listen on the bus for notifications from the DEAs about new apps coming online, apps going offline, etc. When they get a realtime update they will update their in-memory routing table that they consult in order to properly route requests. So a request coming into the system goes through Nginx, or some other HTTP termination endpoint, which then load balances across a pool of identical Routers. One of the routers will pick up the phone to answer the request. It will start inspecting the headers of the request just enough to find the Host: header so it can pick out the name of the application this request is headed for. It will then do a basic hash lookup in the routing table to find a list of potential backends that represent this particular application. These look like: {'foo.cloudfoundry.com' => ['10.10.42.1:5897', '10.10.42.3:61378', ...]}

So once it has looked up the list of currently running instances of the app represented by ‘foo.cloudfoundry.com’ it will pick a random backend to send the request to. So the router chooses one backend instance and forwards the request to that app instance. Responses are also inspected at the header level for injection if need be for functionality such as sticky sessions.
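
A minimal sketch of that lookup, written in JavaScript for illustration rather than taken from the Router source, could look like this. The pickBackend name, the injectable random function and the handling of a port in the Host header are assumptions.

```javascript
// In-memory routing table: host name -> list of running instances.
var routes = {
  'foo.cloudfoundry.com': ['10.10.42.1:5897', '10.10.42.3:61378']
};

// Pick a random backend for the app named in the Host header.
// `random` is injectable so the choice can be made deterministic in tests.
function pickBackend(table, hostHeader, random) {
  var host = String(hostHeader || '').split(':')[0].toLowerCase(); // drop any port
  if (!Object.prototype.hasOwnProperty.call(table, host)) {
    return null; // no app registered under this host
  }
  var backends = table[host];
  if (backends.length === 0) {
    return null; // app is known but has no running instances
  }
  var rnd = (random || Math.random)();
  return backends[Math.floor(rnd * backends.length)];
}

console.log(pickBackend(routes, 'foo.cloudfoundry.com'));
```

Since the table is a plain hash keyed by host, the per-request cost stays constant no matter how many apps are registered.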

The Router can retry another backend if the one it chose fails, and there are many ways to customize this behavior if you have your own instance of Cloud Foundry set up somewhere. Routers are fairly straightforward in what they do and how they do it. They are eventmachine based and run on ruby-1.9.2, so they are fast and can handle quite a bit of traffic per instance, but like every other component in the system, they are horizontally scalable and you can add more as needed in order to scale up bigger and bigger. The system is architected in such a way that this can even be done on a running system.

Health Manager

The Health Manager is a standalone daemon that has a copy of the same models the Cloud Controller has and can currently see into the same database as the Cloud Controller. This daemon has an interval where it wakes up and scans the database of the Cloud Controller to see what the state of the world “should be”, then it actually goes out into the world and inspects the real state to make sure it matches. If there are things that do not match, then it will initiate messages back to the Cloud Controller to correct this. This is how we handle the loss of an application or even a DEA node per se. If an application goes down, the Health Manager will notice and will quickly remedy the situation by signaling the Cloud Controller to start a new instance. If a DEA node completely fails, the app instances running there will be redistributed back out across the grid of remaining DEA nodes.
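
The comparison of “should be” against the real state can be sketched as a small reconcile step. This is an illustration only: the reconcile name and the data shapes (app name mapped to instance count) are hypothetical, and the sketch covers just the case described here, starting instances that have gone missing.

```javascript
// desired: { appName: instancesThatShouldRun }
// actual:  { appName: instancesActuallyRunning }
// Returns the corrections the Cloud Controller would be asked to make.
function reconcile(desired, actual) {
  var actions = [];
  Object.keys(desired).forEach(function (app) {
    var running = actual[app] || 0;      // an app missing from `actual` has none running
    var shortfall = desired[app] - running;
    if (shortfall > 0) {
      actions.push({ app: app, action: 'start', count: shortfall });
    }
  });
  return actions;
}

// Hypothetical snapshot: one instance of "web" died and "worker" lost its DEA node.
var desired = { web: 3, worker: 2, admin: 1 };
var actual = { web: 2, admin: 1 };

console.log(reconcile(desired, actual));
```

Running this on every wake-up interval makes the correction idempotent: once the missing instances are back, the next scan produces no actions.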

Services

These are the services your application can choose to use and bind to in order to get data, messaging, caching and other services. This is currently where redis and mysql run, and it will eventually become a huge ecosystem of services offered by VMware and anyone else who wants to offer their service into our cloud. One of the cool things I will highlight is that you can share service instances between your different apps deployed onto a VCAP system. Meaning you could have a core java Spring app with a bunch of satellite sinatra apps communicating via redis or a RabbitMQ message bus.

Ok my fingers are tired and my shoulder hurts so I am going to call this first post done. I plan on blogging a lot more often as well as trying to help organize the community around Cloud Foundry the open source project. I hope you are as excited as I am by this project. It is basically like “rack for frameworks, clouds and services” rather than just ruby frameworks: pluggable across the three different axes, well tested and well coded. This thing is very cool and I am very proud just to be a member of the team working on this thing. This has been a huge team effort to get out the door and we hope it will become a huge community effort to keep driving it forward to truly make it “The Linux Kernel of Cloud Operating Systems”. Will you please join the community with me and help build this thing out to meet its true potential?
