Cloud Foundry Blog

Cloud Foundry Open PaaS Deep Dive

by Ezra Zygmuntowicz (aka @ezmobius)

You are probably wondering how Cloud Foundry actually works. Hopefully these details will clear things up about how Cloud Foundry the OSS project works, why it works, and how you can use it. Cloud Foundry is on GitHub here: https://github.com/cloudfoundry/vcap. The VCAP repo is the meaty part, or what we call the “kernel” of Cloud Foundry, as it is the distributed system that contains all the functionality of the PaaS. We have released a VCAP setup script that will help you get an Ubuntu 10.04 system running an instance of Cloud Foundry, including all the components of VCAP as well as a few services (mysql, redis, mongodb), so you can play along at home.

We want to build a community around Cloud Foundry, as that is what matters most for now as far as the open source project is concerned. We imagine a whole ecosystem of “userland” tools that people can create and plug into our Cloud Foundry kernel to add value and customize for any particular situation. This project is so large in scope that we had to cut a release and get community involvement at some point, and we feel that the kernel is in great shape for everyone to dig in and start helping us shape the future of the “linux kernel for the cloud” ;)

So how do you approach building a PaaS (Platform as a Service) that is capable of running multiple languages under a common deployment and scalability idiom, and that is also capable of running on any cloud or any hardware that you can run Ubuntu on? VCAP was architected by my true rock star coworker Derek Collison (this guy is the man, for realz!). The design is very nice and adheres to one core concept: “the closer to the center of the system the dumber the code should be”. Distributed systems are fundamentally hard problems, so each component that cooperates to form the entire system should be as simple as it possibly can be and still do its job properly.

VCAP is broken down into 5 main components plus a message bus: the Cloud Controller, the Health Manager, the Routers, the DEAs (Droplet Execution Agents) and a set of Services. Let’s take a closer look at each component in turn and see how they fit together to form a full platform for the cloud. The message bus is NATS, a very simple pub/sub messaging system written by Derek Collison of TIBCO fame (this dude knows messaging, trust me ;). NATS is the system that all the internal components communicate on. While NATS is great for this sort of system communication, you would never use NATS for application level messaging. VMware’s own RabbitMQ is awesome for app level messaging and we plan to make that available to Cloud Foundry users in the near future.
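To make the pub/sub idea concrete, here is a toy in-process bus in Ruby. It is only a sketch of the publish/subscribe pattern, not the NATS API; the class name ToyBus and the subject name 'app.started' are made up for illustration.

```ruby
# A toy in-process message bus illustrating publish/subscribe.
# This is NOT the NATS API; class and subject names are hypothetical.
class ToyBus
  def initialize
    @subscribers = Hash.new { |hash, subject| hash[subject] = [] }
  end

  # Register a block to be called for every message on a subject.
  def subscribe(subject, &handler)
    @subscribers[subject] << handler
  end

  # Deliver a message to every current subscriber of the subject.
  # Returns the number of handlers that received it.
  def publish(subject, message)
    handlers = @subscribers.fetch(subject, [])
    handlers.each { |handler| handler.call(message) }
    handlers.size
  end
end

bus = ToyBus.new
seen = []
bus.subscribe('app.started') { |msg| seen << "router saw #{msg}" }
bus.subscribe('app.started') { |msg| seen << "health manager saw #{msg}" }
bus.publish('app.started', 'foo.cloudfoundry.com')
```

The publisher never names its receivers, which is what lets components be added or removed without the others knowing.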

It should be stated here that every component in this system is horizontally scalable and self healing, meaning you can add as many copies of each component as needed, in any order, to support the load of your cloud. Since everything is decoupled, it doesn’t even really matter where each component lives; things could be spread across the world for all the system cares. I think this is pretty cool ;)

Cloud Controller

The Cloud Controller is the main ‘brain’ of the system. This is an async Rails3 app that uses ruby 1.9.2 and fibers plus eventmachine to be fully async, pretty cutting edge stuff. You can find the Cloud Controller here: https://github.com/cloudfoundry/vcap/tree/master/cloud_controller . This component exposes the main REST interface that both the CLI tool “vmc” and the STS plugin for Eclipse talk to. If you were so inclined, you could write your own client that talks to the REST endpoints exposed by the Cloud Controller to drive VCAP in whatever way you like. But this should not be necessary, as the “vmc” CLI has been written with scriptability in mind. It will return proper exit codes, as well as JSON if you so desire, so you can fully script it with bash, Ruby, Python, etc.
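As a rough sketch of what scripting against exit codes can look like from Ruby, the helper below runs any command and reports whether it exited cleanly. The helper name run_cli is made up, nothing in it is specific to vmc, and the usage in the comments assumes vmc is installed and on your PATH.

```ruby
require 'open3'

# Run an external command, returning its combined stdout/stderr and
# whether it exited with status 0. Hypothetical helper, not part of vmc.
def run_cli(*argv)
  output, status = Open3.capture2e(*argv)
  [output, status.success?]
end

# Hypothetical usage, assuming vmc is installed and on the PATH:
#   output, ok = run_cli('vmc', 'target', 'api.cloudfoundry.com')
#   abort(output) unless ok
```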

We made a decision not to tie VCAP to git, even though we love git. We need to support any source code control system, yet we want the simplicity of a git push style deployment, hence vmc push. We also want differential deploys, meaning that we push only diffs when you update your app rather than your entire app tree every time you deploy. Feeling light and fast is important to us. Our goal is to rival local development.

So we designed a system where we can get the best of both worlds. You as a user can use any source control system you want. When you do a vmc push or vmc update, we examine your app’s directory tree and send a “skeleton fingerprint” to the Cloud Controller. This is basically a fingerprint of each file in your app’s tree plus the shape of your directory tree. The Cloud Controller keeps every object it ever sees in a shared pool, accessible via the object’s fingerprint plus its size. It then returns to the client a manifest of which files it already has and which files it needs your client to send in order to have all of your app. It is a sort of ‘dedupe’ for app code as well as framework, rubygem and other dependency code. Your client then sends only the objects that the cloud requires in order to create a full “Droplet” of your application (a droplet is a tarball of all your app code plus its dependencies, wrapped up with a start and stop button).
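Here is a minimal sketch of the dedupe idea from the Cloud Controller's side. It assumes SHA-1 as the fingerprint and keys the pool on the fingerprint plus size; the ResourcePool class and its method names are hypothetical, not the actual VCAP code.

```ruby
require 'digest/sha1'
require 'set'

# Hypothetical shared pool: remembers every object it has seen,
# keyed by [SHA-1 fingerprint, size in bytes].
class ResourcePool
  def initialize
    @known = Set.new
  end

  # Record an object's content as seen.
  def add(content)
    @known << [Digest::SHA1.hexdigest(content), content.bytesize]
  end

  # Split the client's fingerprints into [already have, still need].
  def match(fingerprints)
    fingerprints.partition { |fp| @known.include?([fp[:sha1], fp[:size]]) }
  end
end

# Build the fingerprint a client would send for one file.
def fingerprint(path, content)
  { path: path, sha1: Digest::SHA1.hexdigest(content), size: content.bytesize }
end

pool = ResourcePool.new
pool.add("require 'sinatra'\n") # content seen earlier, e.g. in another app

app = [
  fingerprint('lib/shared.rb', "require 'sinatra'\n"),
  fingerprint('app.rb', "get('/') { 'hello' }\n")
]
have, need = pool.match(app)
```

Only the entries in need would have to travel over the wire; everything in have is already in the pool.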

Once the Cloud Controller has all the ‘bits’ it needs to fully assemble your app, it pushes the app into the “staging pipeline”. Staging is what we call the process that assembles your app into a droplet: it gets all the full objects that comprise your application plus all of its dependencies, rewrites the app’s config files to point to the proper services that you have bound to your application, and then creates a tarball with some wrapper scripts called start and stop.

DEA

The Droplet Execution Agent can be found here: https://github.com/cloudfoundry/vcap/tree/master/dea . This is an agent that runs on each node in the grid that actually runs the applications, so in a typical Cloud Foundry setup you will have more DEA nodes than any other type of node. Each DEA can be configured to advertise a different capacity and a different runtime set, so you do not need all your DEA nodes to be the same size or be able to run the same applications. Continuing on from our Cloud Controller story: the CC asks for help running a droplet by broadcasting on the bus that it has a droplet that needs to be run. This droplet has some metadata attached to it, like what runtime it needs and how much RAM it needs. Runtimes are the base component needed to run your app, like ruby-1.8.7, ruby-1.9.2, java6 or node. When a DEA gets one of these messages, it checks to make sure it can fulfill the request, and if it can, it responds back to the Cloud Controller that it is willing to help.
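The check a DEA makes before volunteering can be sketched like this. The Dea struct, its field names and the numbers are all invented for illustration; the real DEA's bookkeeping may well differ.

```ruby
# Hypothetical DEA that advertises runtimes and free memory (in MB).
Dea = Struct.new(:name, :runtimes, :free_mb) do
  # A DEA volunteers only if it supports the runtime and has enough RAM.
  def can_run?(request)
    runtimes.include?(request[:runtime]) && free_mb >= request[:mem_mb]
  end
end

deas = [
  Dea.new('dea-1', %w[ruby-1.8.7 ruby-1.9.2], 256),
  Dea.new('dea-2', %w[java6 node], 2048),
  Dea.new('dea-3', %w[ruby-1.9.2 node], 64)
]

# The metadata attached to the droplet: runtime plus RAM requirement.
request = { runtime: 'ruby-1.9.2', mem_mb: 128 }
volunteers = deas.select { |dea| dea.can_run?(request) }.map(&:name)
```

Here dea-2 lacks the runtime and dea-3 lacks the memory, so only dea-1 would answer.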

The DEA does not necessarily care what language an app is written in. All it sees are droplets. A droplet is a simple wrapper around an application that takes one input, the port number to serve HTTP requests on, and has 2 buttons, start and stop. So the DEA treats droplets as black boxes: when it receives a new droplet to run, all it does is tell it what port to bind to and run the start script. A droplet, again, is just a tarball of your application, wrapped up in a start/stop script, with all your config files like database.yml rewritten in order to bind to the proper database. Now we can’t rewrite configs for every type of service, so for some services like Redis or MongoDB you will need to grab your configuration info from the environment variable ENV['VCAP_SERVICES'].
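Reading that variable from Ruby could look roughly like the sketch below. It assumes VCAP_SERVICES holds JSON shaped as a hash of service labels to arrays of instances, each with a credentials hash; that layout, the sample values and the helper name service_credentials are assumptions, so dump your own ENV to confirm before relying on it.

```ruby
require 'json'

# Find the credentials for the first bound service whose label starts
# with the given prefix. Assumed layout (not confirmed by this post):
#   { "<label>": [ { "name": ..., "credentials": { ... } } ] }
def service_credentials(vcap_json, label_prefix)
  services = JSON.parse(vcap_json || '{}')
  _label, instances = services.find { |label, _| label.start_with?(label_prefix) }
  instances && instances.first && instances.first['credentials']
end

# In a deployed app you would pass ENV['VCAP_SERVICES'] instead of this
# made-up sample.
sample = '{"redis-2.2":[{"name":"my-redis","credentials":' \
         '{"hostname":"10.0.0.5","port":5011,"password":"secret"}}]}'
redis = service_credentials(sample, 'redis')
```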

In fact there is a bunch of juicy info in the ENV of your application container. You can see it by creating a directory on your laptop and making a file in it called envvars.rb, like this:

$ mkdir env && cd env
$ cat <<'EOF' > envvars.rb
# Minimal Sinatra app that dumps its environment variables
require 'sinatra'
require 'pp'
get '/' do
  "#{ENV.pretty_inspect}"
end
EOF
$ vmc push …

That will make a simple app that will show you what is available in your ENVIRONMENT so that you can see what to use to configure your application. If you visit this new app you will see something like this: ENV output.

So the DEA’s job is almost done. Once it tells the droplet what port to listen on for HTTP requests and runs its start script, and the app properly binds the correct port, the DEA will broadcast on the bus the location of the new application so the Routers can know about it. If the app did not start successfully, the DEA will return log messages to the vmc client that tried to push this app, telling the user why their app didn’t start (hopefully). This leads us right into what the Router has to do in the system, so we will hand it over to the Router (applause).

Router

The Router is another eventmachine daemon that does what you would think it does: it routes requests. In a larger production setup there is a pool of Routers load balanced behind Nginx (or some other load balancer or HTTP cache/balancer). These Routers listen on the bus for notifications from the DEAs about new apps coming online, apps going offline, etc. When they get a realtime update, they update the in-memory routing table that they consult in order to properly route requests.

So a request coming into the system goes through Nginx, or some other HTTP termination endpoint, which then load balances across a pool of identical Routers. One of the Routers will pick up the phone to answer the request. It inspects the headers of the request just enough to find the Host: header, so it can pick out the name of the application this request is headed for. It then does a basic hash lookup in the routing table to find a list of potential backends that represent this particular application. These look like: {'foo.cloudfoundry.com' => ['10.10.42.1:5897', '10.10.42.3:61378', ...]}

Once it has looked up the list of currently running instances of the app represented by ‘foo.cloudfoundry.com’, the Router picks a random backend and forwards the request to that app instance. Responses are also inspected at the header level so that headers can be injected if need be, for functionality such as sticky sessions.
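The lookup-and-pick step can be sketched in a few lines of Ruby. The RoutingTable class and its method names are invented here, and the real Router does considerably more (header parsing, sticky sessions, retries), so treat this as an illustration of the data structure only.

```ruby
# Hypothetical in-memory routing table: host name => list of "ip:port" backends.
class RoutingTable
  def initialize
    @routes = Hash.new { |hash, host| hash[host] = [] }
  end

  # Called when a DEA announces that an instance came online.
  def register(host, backend)
    @routes[host] << backend unless @routes[host].include?(backend)
  end

  # Called when an instance goes offline.
  def unregister(host, backend)
    @routes[host].delete(backend)
  end

  # Pick a random backend for the Host: header value, or nil if unknown.
  def pick(host_header)
    host = host_header.to_s.downcase.sub(/:\d+\z/, '') # drop any :port suffix
    @routes.fetch(host, []).sample
  end
end

table = RoutingTable.new
table.register('foo.cloudfoundry.com', '10.10.42.1:5897')
table.register('foo.cloudfoundry.com', '10.10.42.3:61378')
backend = table.pick('foo.cloudfoundry.com')
```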

The Router can retry another backend if the one it chose fails, and there are many ways to customize this behavior if you have your own instance of Cloud Foundry set up somewhere. Routers are fairly straightforward in what they do and how they do it. They are eventmachine based and run on ruby-1.9.2, so they are fast and can handle quite a bit of traffic per instance. But like every other component in the system, they are horizontally scalable and you can add more as needed in order to scale up bigger and bigger. The system is architected in such a way that this can even be done on a running system.

Health Manager

The Health Manager is a standalone daemon that has a copy of the same models the Cloud Controller has and can currently see into the same database as the Cloud Controller. This daemon wakes up on an interval and scans the database of the Cloud Controller to see what the state of the world “should be”, then it actually goes out into the world and inspects the real state to make sure it matches. If there are things that do not match, it will initiate messages back to the Cloud Controller to correct this. This is how we handle the loss of an application or even a DEA node per se. If an application goes down, the Health Manager will notice and will quickly remedy the situation by signaling the Cloud Controller to start a new instance. If a DEA node completely fails, the app instances running there will be redistributed back out across the grid of remaining DEA nodes.
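The “should be” versus “actually is” comparison boils down to a diff. The sketch below is a simplification with a made-up reconcile function and made-up app names; it also reports surplus instances as negative numbers, which goes beyond what this post describes.

```ruby
# desired: app name => number of instances the database says should run.
# actual:  app name => number of instances observed running.
# Returns app name => how many instances to start (positive) or stop (negative).
def reconcile(desired, actual)
  corrections = {}
  (desired.keys | actual.keys).each do |app|
    delta = desired.fetch(app, 0) - actual.fetch(app, 0)
    corrections[app] = delta unless delta.zero?
  end
  corrections
end

desired = { 'foo' => 3, 'bar' => 1 }
actual  = { 'foo' => 2, 'bar' => 1, 'old' => 1 }
corrections = reconcile(desired, actual)
```

In this example foo is one instance short, so the correction for it is a request to start one more.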

Services

These are the services your application can choose to use and bind to in order to get data, messaging, caching and other services. This is currently where redis and mysql run, and it will eventually become a huge ecosystem of services offered by VMware and anyone else who wants to offer their service into our cloud. One of the cool things I will highlight is that you can share service instances between your different apps deployed onto a VCAP system, meaning you could have a core java Spring app with a bunch of satellite sinatra apps communicating via redis or a RabbitMQ message bus.

Ok, my fingers are tired and my shoulder hurts, so I am going to call this first post done. I plan on blogging a lot more often as well as trying to help organize the community around Cloud Foundry the open source project. I hope you are as excited as I am by this project. It is basically like “rack for frameworks, clouds and services” rather than just ruby frameworks: pluggable across the 3 different axes, and well tested and well coded. This thing is very cool and I am very proud just to be a member of the team working on it. This has been a huge team effort to get out the door and we hope it will become a huge community effort to keep driving it forward to truly make it “The Linux Kernel of Cloud Operating Systems”. Will you please join the community with me and help build this thing out to meet its true potential?


What Happens When You vmc push an Application to Cloud Foundry

This post covers the Cloud Foundry vmc CLI and how it interacts with Cloud Foundry. Another post, coming soon, will cover what Cloud Foundry does on the back-end when clients (such as vmc or STS) connect to it.

Targeting Cloud Foundry


Step 1 : vmc target api.cloudfoundry.com

When you first install vmc and are ready to start controlling Cloud Foundry, you will need to first select it as a target.  Why do you need to select api.cloudfoundry.com ? Because vmc is capable of connecting to any Cloud Foundry instance whether it be at cloudfoundry.com or elsewhere.  Selecting a target also allows you to use the same CLI to interface with multiple Cloud Foundry Clouds in the same way.

Step 2 : vmc returns “Successfully targeted to [http://api.cloudfoundry.com]”

This indicates that vmc was able to get a valid response from api.cloudfoundry.com, meaning that it is a valid Cloud Foundry instance.

NOTE: The diagram numbers repeat each time a new command is issued through vmc

Logging into Cloud Foundry

Step 1 : vmc login

After typing the command, you will be prompted to enter your email address and password. Your credentials are then passed via vmc to Cloud Foundry, where they are validated. Assuming that your credentials are valid, Cloud Foundry then issues a Security Token back to the vmc client.

Step 2 : Successfully logged into [http://api.cloudfoundry.com]

This is displayed once the vmc client has received the Security Token.  You are now able to issue commands to control your instances on Cloud Foundry through vmc.

Using vmc push to Deploy an Application

Step 1 : vmc push

The follow-on steps go into the details of the dialog displayed above, going through what each step is and what it does.

Step 2 : This allows you to select an alternative location for your application. When using vmc push you can add --path /some/location, which lets you stay in a different directory than the application that you are deploying.

Step 3 : Naming your application gives you the ability to use a simpler friendly name when referencing it inside of vmc.  This name can (but doesn’t have to) be disconnected from the Deployed URL where users will access your application.

Step 4 : The Application Deployed URL is where users will access your application from the internet or from outside of your Cloud Foundry instance.

Step 5 : Detecting or Selecting Application Types

vmc attempts to detect the framework that is in use, which allows it to decide which run-time environment is needed. If you want your code treated a specific way, or if vmc cannot detect the framework being used, you can manually select the framework. Your code will then be treated as code for both this framework and the associated run-time of that framework.

Step 6 : The Memory Reservation lets you specify how much memory your application will need. The Memory Reservation is part of the manifest, which will be described later, and it is also used to ensure that you are staying within the quota limits of the system.

Step 7 : vmc has Created the Application LOCALLY. This means that all of the application metadata has been created and saved.  Application Metadata makes up the manifest and includes :

  • Application Name
  • URL
  • Framework
  • Run-time
  • Instances Needed
  • Memory Reservation

It is important to realize that no bits have been sent to Cloud Foundry yet.
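As a sketch, the six metadata fields listed above could be held in a simple Ruby struct. The AppManifest name, the field names, the example values and the quota check (instances times memory against a hypothetical quota) are illustrative assumptions, not vmc's actual internal format.

```ruby
# Hypothetical container for the six metadata fields listed above.
AppManifest = Struct.new(:name, :url, :framework, :runtime, :instances, :memory_mb) do
  # True when the reservation for all instances fits within a memory quota.
  # How the real system computes quota usage is an assumption here.
  def within_quota?(quota_mb)
    instances * memory_mb <= quota_mb
  end
end

# Example values are made up, apart from the URL used later in this post.
manifest = AppManifest.new('hello', 'hello.cloudfoundry.com', 'sinatra', 'ruby-1.9.2', 1, 128)
```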

Step 8 : Binding Services to your Application offers the ability to add capabilities such as Redis (a Key/Value Store), MySQL (a Relational Database), along with other services such as messaging.

Step 9 : Because your application is being run remotely (not on your local OS) it has to be preprocessed before it is sent to Cloud Foundry.  The next few steps will walk through the sophisticated process by which vmc and Cloud Foundry work together to find the most efficient way to upload your code.

Step 10 : Checking for available resources means examining the metadata and manifest to decide what needs to be sent. vmc follows a multi-stage process to decide what it should send to Cloud Foundry, which is described in the next few diagrams.

Step 10a : The Metadata is examined (what resources the application needs, what runtime and framework it is supposed to use, etc.)

Step 10b : Look at Components (Decide what gems, libraries, npms, etc. are used, in preparation for the next step).

Step 10c : File based Fingerprinting (This is where a Manifest of files is built, along with a SHA-1 hash for each file).

Step 10d : The manifest is then sent to Cloud Foundry.
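A client-side sketch of Step 10c: walk the application directory and record a SHA-1 hash and size per file. The function name build_fingerprints and the shape of each entry are assumptions for illustration; vmc's real manifest format is not shown in this post.

```ruby
require 'digest/sha1'
require 'find'

# Walk an application directory and build a list of
# { path, sha1, size } entries, one per regular file.
def build_fingerprints(root)
  entries = []
  Find.find(root) do |path|
    next unless File.file?(path)

    entries << {
      path: path.sub(%r{\A#{Regexp.escape(root)}/?}, ''), # path relative to root
      sha1: Digest::SHA1.file(path).hexdigest,
      size: File.size(path)
    }
  end
  entries.sort_by { |entry| entry[:path] }
end
```

Calling build_fingerprints('.') from inside an app directory would give the list that gets sent in Step 10d.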

Step 11 : The Application has been successfully packaged. What is packaging? See the next diagram for details (and for what happened with the manifest sent in Step 10).

Step 11a : When packaging begins, a manifest is sent from Cloud Foundry to vmc.  This manifest lists only the files that Cloud Foundry needs, not all of the files that make up the application.  This method of only requesting and sending what is needed makes uploading far more efficient than if all code/application files were sent with each push.

Step 11b : vmc copies the necessary files (only those that Cloud Foundry needs) and compresses them

Step 11c : The vmc manifest is prepared along with the compressed file containing all of the code needed by Cloud Foundry.

Step 11d : The file is ready to be uploaded to Cloud Foundry.
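Steps 11b through 11d could be sketched as below: copy only the requested files into one compressed archive. The post does not say which archive format vmc uses, so the gzip-compressed tar here is an assumption, as is the function name package_needed. It relies on the tar writer that ships with RubyGems.

```ruby
require 'rubygems/package'
require 'stringio'
require 'zlib'

# Build an in-memory .tar.gz containing only the requested relative paths.
# The archive format is an assumption; the post only says "compresses them".
def package_needed(root, needed_paths)
  buffer = StringIO.new(String.new) # binary buffer for the archive bytes
  Zlib::GzipWriter.wrap(buffer) do |gzip|
    Gem::Package::TarWriter.new(gzip) do |tar|
      needed_paths.each do |relative|
        data = File.binread(File.join(root, relative))
        tar.add_file_simple(relative, 0o644, data.bytesize) { |io| io.write(data) }
      end
    end
  end
  buffer.string
end
```

Files that Cloud Foundry already has never enter the archive, which is why an upload can be tiny.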

Step 12 : vmc Uploads the application (just the manifest and compressed file) to Cloud Foundry.  In the above case you could see 0K uploaded if you are deploying an application that someone else has deployed previously.  This is because Cloud Foundry doesn’t just look at files that you have sent it, but all files that it has seen previously.  By looking at all previous files, the upload efficiency can be increased because there are more likely to be files that Cloud Foundry has seen before.

Step 13 : When Push Status returns OK, the application has been received by Cloud Foundry and can now be used for Staging.

Step 14 : Staging Application does not return okay until Cloud Foundry has successfully allocated the resources and environment for your application to run.

Step 15 : Starting the Application is not required, but it will happen automatically unless you add --no-start to the command line when doing a vmc push.  Starting Application will not return OK until vmc receives information from Cloud Foundry that all Application instances are healthy/running.

Once your application has started successfully, you can visit the Deployed URL http://hello.cloudfoundry.com in this case to use the application that was deployed.

The Cloud Foundry Team has made many efforts to make resource usage and efficiency top priorities in the code and the system’s design.

More posts to follow on how Cloud Foundry works.


Explaining The Magic Triangle

 


Understanding what Cloud Foundry is all about : CHOICE

Being an Open Platform as a Service is about having the ability to make the choices that best fit you as a developer:

Choice of Developer Frameworks (The Top of the Triangle)

Today (In the initial release) Cloud Foundry Supports Spring for Java, Rails and Sinatra for Ruby, and Node.js. There is also support for Grails on Groovy and other JVM-based frameworks baked into Cloud Foundry. It is important to realize that this is only the beginning; there will be support for other frameworks (and languages) as Cloud Foundry matures.

Choice of Application Services (The Left Side of the Triangle)

Application Services allow Developers to take advantage of data, messaging, and web services as building blocks for their applications. Cloud Foundry currently offers support for MySQL, MongoDB and Redis with other service integrations underway. Examples of additional service integrations will include VMware’s vFabric application services.

Choice of Clouds (The Right Side of the Triangle)

Public, Private, VMware based and non-VMware based: it is up to the developer and organization as to where they want to run Cloud Foundry. Cloud Foundry can be run on Public and Private clouds because it can run on top of vSphere and vCloud Infrastructure. Cloud Foundry also runs on other platforms, as RightScale demonstrated at the launch when they deployed Cloud Foundry on top of Amazon Web Services.

Choice of Usage (It’s Open Source)

Cloud Foundry’s code is open sourced at CloudFoundry.org under the Apache 2 License, making it easy for anyone to adopt and use the technology in virtually any way they want. This is one of the best ways to avoid the risk of lock-in and foster additional innovation.

 


Hello World

Welcome to the Cloud Foundry blog.  You can read more about the scope of Cloud Foundry, the industry’s first open platform as a service here and here.

Simply put, our goal is to remove the obstacles developers face in building, deploying, running and scaling applications.  And do it in an open way so there is no lock-in to frameworks, application services or clouds.

Please sign up for Cloud Foundry here.  By signing up, you will get an invitation to use the service (it is first come, first served as we scale the service) and notification when you can download your own Micro Cloud that lets you run Cloud Foundry on your own desktop.

Please follow this blog’s feed to receive updates on Cloud Foundry going forward.  And you can follow us on Twitter @CloudFoundry

Happy clouding,

The Cloud Foundry Team
