Cloud Foundry Blog

Using the New Scripted JavaScript Editor for Node.js Development

This week VMware released the Scripted code editor on GitHub: https://github.com/scripted-editor/scripted.

Scripted, a JavaScript editor from VMware, is a general purpose code editor intended to be very lightweight, with an initial focus on giving a great JavaScript editing experience, particularly around content assist and awareness of module systems. It is a browser-based editor that runs locally on a developer’s machine, with a Node.js instance serving the editor code and performing the editor operations. The only prerequisite for running Scripted is a recent version of Node.js. Scripted is implemented in 100% JavaScript, HTML and CSS. If you are interested in more background on Scripted, you can read more about it on the SpringSource.org blog.

Features

  • Fast startup, lightweight
  • Syntax highlighting for JavaScript, HTML and CSS
  • Errors and warnings:
    • JSLint is integrated to provide error/warning markers on JavaScript code
    • AMD and CommonJS module resolution: There is basic resolution where unresolved references will be marked as errors
  • Content assist:
    • Basic content assist for HTML, CSS
    • For JavaScript, content assist is driven by a type inferencing engine which is aware of AMD/CommonJS module dependencies and also uses JSDoc comments to help it understand the code
  • Hovers: Hovering over a JavaScript identifier will bring up the inferred type signature
  • Navigation: Press F8 on an identifier (that the inferencer has recognized) and the editor will navigate to the declaration. This also works on module identifiers (e.g., in define() clauses)
  • Formatting: JSbeautify is integrated
  • Sidepanel: Alongside the main editor, a sidepanel can be opened. Currently this can be used to host a second editor
  • Key binding to external command: Key bindings in the editor can invoke external commands (less, mvn, etc.)

There is much more detail on these features in the wiki documentation.

For Node.js Development

As listed in the features above, Scripted understands the CommonJS module system, as employed by Node.js apps. Understanding modules means two key things:

  • References to non-existent modules can be reported at editing time.
  • Knowing the module, we can look inside and from the contents offer appropriate content assist where the module is being used.

The following screenshot shows Scripted being used on a Node.js module and in this case an invalid module reference is being reported:

In that same piece of code, here we can see that because we recognized the module, content assist is correctly proposing the two methods from the users module, called getUser(id) and getUserCount():

The keen-eyed amongst you will notice that in the content assist proposals the return value of getUser(id) was shown as a Person. This was inferred from the JSDoc attached to the definition of getUser(id) in the users module:

Knowing the return value then enables smart content assist at the location where the return value of the function is referenced.

For an integrated experience with Cloud Foundry, you can use the key binding configurations for Scripted to cause a vmc operation to execute from the editor. Configuration of Scripted is done through a .scripted file at the root of the project, a bit like a .vimrc file for Vim. This file is a JSON document, and the supported configuration options are covered in the documentation (see the section on configuration here). Using the exec-keys config option, it is possible to connect a key binding to the invocation of a command. Here is some vmc related configuration:

{
    "exec": {
        "onKeys": {
            "ctrl+shift+alt+p": {
                "name": "vmc push scriptedapp",
                "cmd": "vmc push -n scriptedapp --runtime node08 --path .",
                "timeout":60000
            },
            "ctrl+shift+alt+u": {
                "name": "vmc update scriptedapp",
                "cmd": "vmc update -n scriptedapp",
                "timeout":60000
            }
        }
    }
}

In this next screenshot you can see:

  • On the right the help panel is open and the new key bindings are listed for our vmc operations.
  • The command output currently goes to the JS console (this will be improved!). The console here is showing the command has been invoked and the push was successful.

Want To Try It Out?

If you wish to try it out for yourself, arm yourself with a copy of Node.js then jump onto the Scripted GitHub page for instructions on how to get started.

Future Plans

Over the next few months we are going to focus on a few things:

  • Even smarter content assist and improved navigation options.
  • More side panel contents. This is something I haven’t focused on in this article but it is shown in the video available on the project homepage. The side panel is intended to host information relevant to the task you are trying to achieve in the main editor. This might be code, documentation, search results or a preview. Watch the video to see the side panel in action.
  • A plugin system for extending Scripted. Plugins, like Scripted itself, will be 100% JavaScript, HTML and CSS.
  • Debugging integration. Exploring integration with tools like Chrome Dev Tools and Node.js inspector.

We open-sourced Scripted to accelerate adoption and collect feedback. If you want to help us shape the editor, please join in the discussion. There is a scripted-dev Google Group for discussing it and a JIRA issue tracker for logging bugs, making enhancement requests and voting on existing issues to ensure they are prioritized appropriately. If you want to start hacking on the codebase yourself, we are definitely open to submissions; see the GitHub page for more information. The codebase is very new at the moment, so there isn’t really a steep learning curve.

Please try it out! https://github.com/scripted-editor/scripted.

Andy Clement

Andy Clement is a staff engineer in the SpringSource division of VMware, based in the languages and tools lab in Vancouver. He has more than ten years of experience in Enterprise Application Development and now spends his time building tools for languages like AspectJ, Groovy and JavaScript and frameworks like Grails. He currently oversees the Groovy Grails Tool Suite deliverable, a variant of the Spring Tool Suite with a focus on Groovy and Grails.

 

Securing RESTful Web Services with OAuth2

As an active committer on Spring Security OAuth and the Cloud Foundry UAA, one of the questions I get asked the most is: “When and why would I use OAuth2?” The answer, as often with such questions, is “it depends.” However, I must admit, there are some features of OAuth2 that make it compelling in a wide variety of situations, especially in systems composed of many lightweight web services. This article guides you through updating a system to be secured with OAuth2 and the decision points for choosing to build such a system.

There is a strong trend at the moment towards distributed systems with lightweight architectures based on plain text web services (usually JSON). In this article we concentrate on these services and the systems they are part of, and look at some options for their basic security needs. An example of such a system is the open platform as a service, Cloud Foundry, in which the UAA acts as an OAuth2 provider. Using Cloud Foundry as an example also indicates that the trend in lightweight services is driven by a related trend towards cloud-based platforms for application deployment, both in the Internet at large and in the enterprise.

I recently published a blog post that provides an overview of the UAA: Introducing the UAA and Security for Cloud Foundry. The remainder of this article is about securing REST services with OAuth2 in general, not specifically with the Cloud Foundry UAA, but I will use the UAA as an example to illustrate how it works.

What is a Lightweight Service?

The kind of service from which the new architectures are being built typically has some or all of these features:

  • HTTP transport. Cheap and easy to deploy, and supported by all cloud platforms.
  • Text-based message content, usually JSON. The emphasis is on readability and sometimes providing easy support to front-end browser and mobile developers.
  • Small, compact messages, and quick responses. Although there is a strong parallel trend of services with streaming responses for “push” data, those are generally not implemented the same way, and HTTP is not the best transport in those cases.
  • REST-ful, or at least inspired by the REST architectural style.
  • Some degree of statelessness, either ‘share-nothing,’ or very careful use of server-side state, to enable horizontal scalability.
  • Interoperability. The consumers of the service might be a dynamic population, completely unknown or unknowable at design time, using a variety of platforms and languages.

No two people or organizations will agree on what to call these services. So, although we need a working definition, it isn’t important what the precise details are. Roy Fielding’s academic work on REST is very influential, but religious wars are fought (and never won) on what it should mean in practice. To avoid being drawn into that, I will be deliberately inclusive and just refer to the services as lightweight services over HTTP, or just “lightweight services.”

What Are the Security Requirements for a Lightweight Service?

It is possible, but unlikely, that a service might be so lightweight that it doesn’t require any security, for example if it is purely informational. If security is required, it will be for the usual reasons, i.e., identity and permission; knowing who is asking for a resource, and calculating if they are permitted to use it. Identity and permission are often known as “authentication” and “authorization” (especially in Spring Security), but using those terms would cause confusion later because “authorization” is part of the OAuth2 domain language and means something different there. “Authorization” is also the name of a standard HTTP header, which is also a bit different as it is more about transporting of data than calculating permission.

So the basic requirements are identity and permissions, and the choices all come down to these nuts and bolts questions:

  • how are identity and permission information conveyed to a service?
  • how is this information decoded and interpreted?
  • what data is needed to make the access decision (user accounts, roles, ACLs, etc.)?
  • how is the data managed, who is responsible for storing and retrieving it?

The rest of this article deals with those questions using a couple of example approaches, including OAuth2, but does not attempt to be a complete description of available features and implementations even in that case.

HTTP Basic Authentication

For a simple system based on shared secrets (e.g., username and password for user accounts), HTTP Basic is something of a lowest common denominator. It is supported natively on practically all servers, and there is ubiquitous support on the client side in all languages. Most clients, if not all, even interpret the user information in the authority section of a URL as credentials for Basic authentication.

As a client, the only thing you need to do for Basic authentication is to include an Authorization header in an HTTP request, composed of the username and password, separated by a colon and then Base64 encoded. E.g., in Ruby (1.9) using RestClient:

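require 'restclient'
require 'base64'

username = "alice"    # placeholder credentials for illustration
password = "secret"
# Basic credentials are "username:password", Base64 encoded
auth = "Basic " + Base64.strict_encode64("#{username}:#{password}")
response = RestClient.get("https://myhost/resource", :authorization => auth)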

On a command line with curl you could do this:

$ curl "https://$username:$password@myhost/resource"

This looks, and is, extremely simple to implement. The only difficulty is in where you get the credentials from (the username and password). If your client is a web application, which is very common for these lightweight services, you might collect the credentials from a user in a simple HTML form.

As long as you ensure that all requests are protected by a secure socket layer, Basic authentication is fine for systems where all the participants can easily share secrets securely. If there is only one resource server (a lightweight service, or a user facing application) this isn’t difficult because the data only needs to be accessed by one component and can even be managed in config files if they are static. For example, the UAA uses Basic authentication for endpoints that are intended to be accessed only by other components in the platform kernel (i.e., machine clients). Some of those endpoints (e.g. /varz) have a password stored in a config file, and some (e.g. /oauth/token) use a database backend to store credentials. For explanations of what those endpoints do in the UAA see the introductory blog or the UAA docs.
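
On the service side, the same header is simply decoded in reverse. Here is a minimal sketch, assuming a static credential table of the kind that might be loaded from a config file. The method names and the sample account are hypothetical and are not taken from the UAA:

```ruby
# Split a "Basic <base64>" header value into [username, password].
# Returns nil if the header is missing or malformed.
def decode_basic(header)
  scheme, encoded = header.to_s.split(" ", 2)
  return nil unless scheme && scheme.casecmp("Basic").zero? && encoded
  decoded = encoded.unpack1("m")              # Base64 decode
  username, password = decoded.split(":", 2)  # the password may contain colons
  return nil if username.nil? || password.nil?
  [username, password]
end

# Check the decoded credentials against a table of username => password.
def authenticated?(header, credentials)
  username, password = decode_basic(header)
  !username.nil? && credentials[username] == password
end

# Hypothetical static credentials, standing in for values from a config file.
credentials = { "varz" => "s3cret" }
header = "Basic " + ["varz:s3cret"].pack("m0")  # what a client would send
puts authenticated?(header, credentials)
```

A production service would more likely store password hashes and compare them in constant time, but the shape of the decision is the same.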

User or Client Roles and Permissions

For very simple use cases, authentication is the only form of permission needed. If you can authenticate you can access the resource, otherwise not. But often it is not enough and the service has to be able to make a decision based on finer-grained information about the authenticated party. A simple example of such a decision is role-based access, which is very common, and is also supported out of the box by many server platforms. For instance, the service has a resource that it wants to be available only to users in the ADMIN role. When an authenticated request arrives, the service has to map the authentication to some account data and extract the role assignments. It then checks if the ADMIN role is present and if so grants access, otherwise not.
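
As a rough illustration of that role check, here is a minimal sketch. The account table, its layout and the method name are assumptions made up for this example:

```ruby
# Look up the account for an authenticated username and check for a role.
def permitted?(accounts, username, required_role)
  account = accounts[username]
  return false if account.nil?  # unknown user: no access
  account.fetch(:roles, []).include?(required_role)
end

# Hypothetical account data; in a simple system this could live in a config file.
accounts = {
  "alice" => { :roles => ["USER", "ADMIN"] },
  "bob"   => { :roles => ["USER"] }
}

puts permitted?(accounts, "alice", "ADMIN")
puts permitted?(accounts, "bob", "ADMIN")
```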

For role-based access to make sense you need to be able to categorize user accounts. In other words, there have to be at least two classes of users (e.g., USER and ADMIN, which implies at least two users), and someone has to manage the assignments of accounts to roles. This is a good deal more complicated than simple shared password data, but as long as the account data is static it can still be managed in a config file, or in a database otherwise. Crucially, this is still easy to do, as long as the system stays simple and can manage its own account data, or get access to an external system where the data is managed for it.

More complicated access decisions involving different and more detailed fine-grained business data are also possible (even common), but those use cases are beyond the scope of this article. It’s time to start looking at other things and get down to the OAuth2 discussion.

OAuth2 and Centralized Identity Management

As a system grows and the number of components that need access to authentication services and to user account data increases, or the user account data is changing frequently or growing rapidly, the simple approaches to security will start to become a maintenance problem. What is needed is a centralized approach to identity management, and a service for the whole system that can provide and manage that data in a secure way. This is where standards like OAuth2 come into play. They introduce a new layer in the architecture, and some extra complexity that can be hard to get to grips with at first, but the benefits will be worth it in many cases.

Quick Introduction to OAuth2

OAuth2 is a protocol enabling a Client application, often a web application, to act on behalf of a User, but with the User’s permission. The actions a Client is allowed to perform are carried out by a Resource Server (another web application or web service), and the User approves the actions by telling an Authorization Server that he trusts the Client to do what it is asking. Clients can also act as themselves (not on behalf of a User) if they are permitted to do so by the Authorization Server.

The most common way for a Client to present itself to a Resource Server is using a bearer token, as covered in the core OAuth2 specification. The token is obtained from the Authorization Server, with the User’s approval if necessary, and stored by the Client. Then when it needs to access a Resource Server the Client sends a special HTTP header in the form:

Authorization: Bearer <TOKEN_VALUE>

The token value is opaque to a client, but can be decoded by a Resource Server so it can check that the Client and User have permission to access the requested resource.

Common examples of Authorization Servers on the Internet are Facebook, Google, and Cloud Foundry all of which also provide Resource Servers (the Graph API in the case of Facebook, the Google APIs in the case of Google, and the Cloud Controller in the case of Cloud Foundry).

OAuth2 and the Lightweight Service

OAuth2 is, at its heart, an authentication protocol for lightweight services, which are Resource Servers in the domain language of the specification. A Client application that wants to access a protected resource sends an authorization header, a bit like in the Basic authentication case. E.g., in Ruby:

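require 'restclient'
# token is an access token previously obtained from the Authorization Server
auth = "Bearer #{token}"
response = RestClient.get("https://myhost/resource", :authorization => auth)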

or on the command line:

$ curl -H "Authorization: Bearer $TOKEN" https://myhost/resource

As with the Basic authentication, the mechanics are extremely simple, and that is one thing that makes OAuth2 bearer tokens attractive for clients of lightweight services.

The token is opaque to the Client, but the Resource Server can decode it into some finer grained information about the Client and the level of access that the token represents. It then uses this information to make an access decision. This is not the only feature of OAuth2, but on its own makes OAuth2 a lot more powerful than simple Basic authentication: The token itself carries more information than a Basic header.

A typical minimal set of information in a token would be the Client ID, a target Resource ID and a set of approved scopes (more on those later). Normally the Resource Server has an ID, and it should check that it matches the one in the token in order to prevent token misuse. Beyond that, it is entirely up to the Resource Server what to do with the decoded token data. If the Resource Server is able to coordinate with the Authorization Server, it can arrange for specific fine-grained data to be encoded in the token, relating to the access decision it wants to make. For example the Authorization Server might include role assignments that are recognized by the Resource Server. Tokens could, and do, also contain other information, like a unique identifier for the current user (if there is one).

The UAA stores group assignments in its native user data, and those show up in the access tokens as scopes. The scope values themselves in the UAA are period-separated, but it is up to the Resource Server to interpret them any way it needs. For example, a Resource Server calling itself ‘dashboard’ might change its access decision based on whether it finds a scope ‘dashboard.user’ or ‘dashboard.admin’, and those are the names of groups in the user accounts. These semantics are not part of any standard specification, but OAuth2 leaves them up to implementations to interpret according to their need, so the UAA takes advantage of that.
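
To make the ‘dashboard’ example concrete, here is a minimal sketch of the decision a Resource Server might make once it has decoded a token. It assumes the decoded data is a Hash; the key names `:resource_ids` and `:scope` and the method name are assumptions for illustration, not the UAA's actual token format:

```ruby
# Decide what a decoded token allows at a Resource Server calling itself 'dashboard'.
def dashboard_access(token, resource_id = "dashboard")
  audience = Array(token[:resource_ids])
  scopes   = Array(token[:scope])
  return :denied unless audience.include?(resource_id)  # issued for another service
  return :admin if scopes.include?("#{resource_id}.admin")
  return :user  if scopes.include?("#{resource_id}.user")
  :denied
end

# Hypothetical decoded token data.
token = {
  :client_id    => "myclient",
  :resource_ids => ["dashboard"],
  :scope        => ["dashboard.user"]
}
puts dashboard_access(token)
```

Checking the resource ID first means that a token issued for a different service is rejected even if its scope names happen to look familiar.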

The Role of the Client Application

For a Client, obtaining a token makes things slightly complicated, and in this case it is usually not enough just to know a username and password. Also, one of the key reasons for OAuth2 to exist is so that Client applications do not need to collect user credentials (as they did with Basic authentication). Here is where the learning curve for OAuth2 gets steeper.

The Client has to obtain an access token, and to do that it has to be pre-registered with the Authorization Server, and it has to authenticate itself at the token endpoint. The UAA uses Basic authentication for this endpoint, as suggested by the OAuth2 specification. If a Client is not acting on behalf of a User, the Authorization Server might allow it to obtain tokens in its own right, directly from the token endpoint. If a client with id “myclient” has been registered with the relevant authorized grant type you could obtain a token like this:

$ curl "https://myclient:mysecret@uaa.cloudfoundry.com/oauth/token" -d grant_type=client_credentials -d client_id=myclient

The result would be JSON containing an access token and some meta data, for example:

{
  "access_token": "FUYGKRWFG.jhdfgair7fylzshjg.o98q47tgh.fljgh",
  "expires_in": 43200,
  "client_id": "myclient",
  "scope": "uaa.admin"
}

(This is not a valid token, nor is it a valid client id for the UAA on cloudfoundry.com, just dummy data for illustration purposes.)
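
A Client typically parses this response, remembers when the token expires, and then builds the Bearer header from it. The following is a minimal sketch using the same dummy data; the method name and the shape of the returned Hash are assumptions for illustration:

```ruby
require 'json'

# Parse the JSON body returned by the token endpoint and work out when the
# token expires. issued_at is a parameter so the calculation is repeatable.
def parse_token_response(body, issued_at = Time.now)
  data = JSON.parse(body)
  {
    :access_token => data.fetch("access_token"),
    :scopes       => data.fetch("scope", "").split(" "),  # scopes are space-separated
    :expires_at   => issued_at + data.fetch("expires_in").to_i
  }
end

# Dummy response body, matching the example above.
body = '{"access_token":"FUYGKRWFG.jhdfgair7fylzshjg.o98q47tgh.fljgh",' \
       '"expires_in":43200,"client_id":"myclient","scope":"uaa.admin"}'

token = parse_token_response(body)
auth = "Bearer #{token[:access_token]}"
puts auth
puts token[:expires_at]
```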

The example so far has been for a Client authenticating and obtaining an access token in its own right, which the specification calls the Client Credentials Grant Type. Another grant type is Authorization Code. This is the most common and one that allows the Client to delegate the authentication of Users. Application providers then need to educate Users that they should never enter their credentials in any other application than the Authorization Server (or its trusted delegates), so that a malicious Client application has less chance of tricking a User into revealing his credentials. This use case is one of the main reasons to use OAuth2 and a big driver of the OAuth2 specification itself.

The Authorization Code token grant proceeds in four steps:

[Figure: sequence diagram of the authorization code flow with an already authenticated User (steps 2-4)]

  1. Authorization Server authenticates the User any way necessary. This step is required but is specific to the Authorization Server implementation and not part of the OAuth2 specification (e.g., it is different for Google and Cloud Foundry).
  2. Client starts the authorization flow and obtains approval from the Authorization Server to act on the User’s behalf. The approval is required, but the details are not specified in the OAuth2 specification.
  3. At this point, if successful, the Authorization Server issues an authorization code (opaque one-time token). This step, and the next one, are described in detail in the OAuth2 specification.
  4. Client exchanges the authorization code for an access token.

Because there are multiple steps here and normally the first two would require a short conversation with the User, the Authorization Server has to direct the flow through the use of HTTP redirects (status code 302 and a Location header). The Client specifically asks for a redirect to itself in order to obtain the authorization code at step 3. One of the security threats in OAuth2 is a malicious Client stealing tokens by asking for an arbitrary redirect, so Authorization Servers protect against this by requiring Clients to register one or more redirect URIs.
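
From the Client's point of view, the flow comes down to one browser redirect and one form post. Here is a minimal sketch of both, using the parameter names defined by the OAuth2 specification. The host names, the /oauth/authorize path and the method names are assumptions for illustration:

```ruby
require 'uri'
require 'securerandom'

# Build the URL the Client redirects the User's browser to (start of step 2).
def authorization_url(authorize_endpoint, client_id, redirect_uri, scopes, state)
  params = {
    "response_type" => "code",
    "client_id"     => client_id,
    "redirect_uri"  => redirect_uri,  # must be one of the registered URIs
    "scope"         => scopes.join(" "),
    "state"         => state          # echoed back so the Client can match the callback
  }
  "#{authorize_endpoint}?#{URI.encode_www_form(params)}"
end

# Form parameters the Client posts to the token endpoint in step 4.
def code_exchange_params(code, redirect_uri)
  {
    "grant_type"   => "authorization_code",
    "code"         => code,
    "redirect_uri" => redirect_uri
  }
end

state = SecureRandom.hex(16)  # random value tying the callback to this request
puts authorization_url("https://uaa.example.com/oauth/authorize", "myclient",
                       "https://myapp.example.com/callback", ["dashboard.user"], state)
```

The Client would also authenticate itself when posting the step 4 parameters, for example with Basic authentication as described earlier.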

The Authorization Server is so called because it provides an interface for users to confirm that they authorize the Client to act on their behalf (step 2 above). The specification wisely leaves the details entirely up to the implementation, so how the authorization (a.k.a. approval) is obtained is unspecified. The UAA, and the Spring Security OAuth2 project that it builds on, provide a simple form-based interface in the general case, but also allow auto-approval of certain clients (e.g., if they are deemed by the Authorization Server owners to be part of the platform).

Client Registration and Scopes

The OAuth2 specification mentions Client registration explicitly and it is an important part of the protection against various security threats. It is also the first point of contact for most application developers with OAuth2, e.g., it’s a pre-condition if you want to look at Google calendar data, or manage the applications deployed on Cloud Foundry. There are some core elements of a Client registration required by the specification (an identifier and a secret if the Client is trusted), and several that are recommended (legal scope values, and registered redirect URIs). Authorization Servers often require additional information, describing the application and the owner of the registration.
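
To show how a registration is used, here is a minimal sketch of the check an Authorization Server might run on an incoming authorization request. The registration record, its field names and the method name are hypothetical; real servers store more than this:

```ruby
# Check an authorization request against a Client registration.
# Returns a list of problems; an empty list means the request is acceptable.
def registration_problems(registration, redirect_uri, requested_scopes)
  problems = []
  unless registration[:redirect_uris].include?(redirect_uri)  # exact match only
    problems << "redirect_uri is not registered"
  end
  unknown = requested_scopes - registration[:scopes]
  problems << "unregistered scopes: #{unknown.join(' ')}" unless unknown.empty?
  problems
end

# A hypothetical registration record.
registration = {
  :client_id     => "myclient",
  :redirect_uris => ["https://myapp.example.com/callback"],
  :scopes        => ["dashboard.user", "dashboard.admin"]
}

puts registration_problems(registration, "https://evil.example.com/steal", ["dashboard.user"])
```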

The scope values are the hardest to understand for a newbie because they are arbitrary strings and don’t necessarily mean anything to the Client at first. Fortunately for the User, the Authorization Server can usually find a default set of scopes from somewhere, and usually knows how to render the values in a way that a User will understand (e.g., from Facebook, “do you allow this application (<URL>) to access your personal data, including email address and photos”). But for a hapless Client owner it can be quite hard to discover the valid scope values and get them registered. In the end, the Client might have to send a speculative request and look at the response from the Resource Server: The specification recommends that valid scope values are included in the HTTP 403 response as part of the WWW-Authenticate header.
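
If a Resource Server follows that recommendation, a Client can read the scopes out of the challenge header. Here is a minimal sketch, assuming the header uses the quoted attribute form (realm, error, scope) from the bearer token specification. The method names and the sample header are made up for illustration:

```ruby
# Extract the quoted attributes from a Bearer challenge into a Hash.
def parse_bearer_challenge(header)
  scheme, params = header.to_s.split(" ", 2)
  return {} unless scheme && scheme.casecmp("Bearer").zero? && params
  params.scan(/(\w+)="([^"]*)"/).to_h
end

# The scope attribute is a space-separated list of scope values.
def advertised_scopes(header)
  parse_bearer_challenge(header).fetch("scope", "").split(" ")
end

# Hypothetical value of a WWW-Authenticate header on a 403 response.
header = 'Bearer realm="dashboard", error="insufficient_scope", scope="dashboard.admin"'
puts advertised_scopes(header).inspect
```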

You can find out (some of) the valid Google API scopes in the OAuth2 playground; they are mostly URLs starting with “https://www.googleapis.com” (but some are not, maybe the older ones). The API docs themselves (i.e., the Resource Servers) are very thin on scope documentation.

In the case of the UAA there is a UAA Security Model document that lists the known scope values as used by the UAA itself and the Resource Servers that it knows about in the Cloud Foundry platform (e.g., the Cloud Controller). In theory, though, and this is quite a powerful feature of OAuth2, there is nothing to stop a Client registering any value as a scope, and a Resource Server that is completely unknown to the Authorization Server can accept that value. To protect against malicious clients, the UAA extracts a resource ID dynamically from a scope value (scope up to the last period is a resource ID) and includes it in the access token as the “aud” (audience) field. We hope that eventually this will provide an opportunity for the platform to be used as a general purpose OAuth2 provider for all applications, not just those that belong to or consume the platform.
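
The rule for deriving a resource ID from a scope is easy to sketch. The method names here are hypothetical, and how the UAA treats a scope with no period is not covered in this article, so returning nil in that case is an assumption:

```ruby
# Derive a resource ID from a scope value: everything up to the last period.
def resource_id_for(scope)
  index = scope.rindex(".")
  index ? scope[0...index] : nil  # nil for scopes without a period (assumption)
end

# Collect the distinct resource IDs for a set of scopes, as an audience list.
def audience_for(scopes)
  scopes.map { |scope| resource_id_for(scope) }.compact.uniq
end

puts resource_id_for("dashboard.admin")
puts audience_for(["dashboard.user", "dashboard.admin", "cloud_controller.read"]).inspect
```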

The UAA on cloudfoundry.com at present is only accepting Client registration by a manual process through the Product Management group in Cloud Foundry, but if you would like to deploy a Client using, or a Resource Server accepting, cloudfoundry.com bearer tokens, feel free to contact us.

Authentication and the Authorization Server

OAuth2 on its own does not provide an authentication protocol for the Users of the Authorization Server. It strongly suggests that Client applications should use Basic authentication for accessing the token endpoint, but it says nothing about the authentication of Users when their approval is needed for a token grant (only that they must be authenticated). This is a good thing because it makes authentication completely orthogonal to the approval process, and Authorization Servers are free to implement the authentication any way they choose.

Out of the box, the UAA supports form-based authentication backed by a database of user accounts. But because authentication is not part of the OAuth2 specification, it is easy to modify to support other authentication mechanisms or data sources. For example, with a few lines of configuration you could switch from form-based authentication to OpenID authentication with Google or Yahoo, or to an enterprise directory service (e.g., Active Directory or OpenLDAP) for the data store.

The authentication logic in the UAA can also be extracted out into a completely separate server component, so it can be independently developed and styled. There is a sample login server which allows authentication either with OpenID (Google, Yahoo, etc.) or using an HTML form and a password in the UAA database. Modifying it to support a SAML IDP such as VMware Horizon or an AD back end for the user account data is trivial because both are well supported by the underlying Spring Security stack.

Alternatives

We have seen that there is a trade-off between the complexity of OAuth2 and a simpler system based purely on shared secrets and a shared user account database. There are other alternatives that are equally or more complex than OAuth2, the main example being SAML. SAML and OAuth2 share a lot of aims, but the implementation is different, and the touch points for developers are very different. SAML is extremely rich and has many features and offshoots that users find indispensable, and it is by default very secure. It is also hard to set up and configure, and often leads to very large amounts of XML being sent with each HTTP message (often so much that it would dwarf the average JSON payload). This last point is probably the thing that puts people off SAML for the sort of lightweight services discussed here, but there is no reason, in principle, why it could not be used.

In environments where SAML is already well established there are also ways to combine it with OAuth2. For example, the specifications allow for a SAML grant in which a client exchanges a SAML assertion for an OAuth2 token, which can then be used as if it came from any other source.

Another alternative to OAuth2 is to write your own system with the same or a subset of features. Many systems have been forced to adopt this approach just because the specifications have been rather long in gestation. There’s nothing intrinsically wrong with doing it yourself, especially if your needs are simple, but many people feel that with security especially there is value in using a standard protocol. Security is very high profile these days and any mistake you make with a non-standard implementation is going to reflect badly if (or rather when) things go wrong.

In Conclusion

So what’s the punchline? Why would you use OAuth2 to secure a lightweight service? It certainly isn’t the simplest approach you could possibly adopt, so you have to be aware of the benefits that would lead you to accept the extra complexity. OAuth2 is well suited to lightweight services in medium to large distributed systems, where User and Client account data need to be centralized, and where you need some control over the expiry and validity of access tokens. The fine-grained information that can be packed into an access token (and which is derived from the centrally managed account data) provides significant benefits over a simple protocol like HTTP Basic authentication. Fortunately, the complexity can also be hidden by client libraries (e.g., Spring Security OAuth2 for Java, or the UAA Gem or Signet for Ruby, or JSO for jQuery). In addition, since OAuth2 is a commonly used standard these days, the number and quality of client libraries is increasing steadily. To put things round the other way, I wouldn’t bother with my own OAuth2 provider if my system only had one or two Resource Servers, or if Client applications never had to act on behalf of a User. I would consider using something simple like Basic authentication, and when I needed more I might switch to someone else’s OAuth2 provider (e.g., Cloud Foundry) if it supported the features I needed.

Naturally, this short article is not going to answer all the questions that arise about OAuth2 and identity management. If you want more information, join the community: check out the source code of the UAA and hang out on the Spring Security OAuth Forum or the Cloud Foundry Developers Group.

Polishing Cloud Foundry’s Ruby Gem Support

The Cloud Foundry team has released new features to improve management of Ruby gems in apps running on CloudFoundry.com. These features include support for using Git URLs in Gemfiles, handling of the BUNDLE_WITHOUT environment variable, and platform specification to control the gem installation process. With these improvements, it is now easier than ever to get your existing Ruby projects up and running on CloudFoundry.com.

History

In 2003, RubyGems was launched as Ruby’s package manager. Six years later, Rubyists began using Bundler, a means for managing and installing gem dependencies in the context of an application. The combination of these two technologies has given developers the ability to run Ruby applications without having to worry about the specific gem version, gem source, or the platform that is available on the server.

In this post, we will review the changes in the Cloud Foundry code that provide better support for gems. We will discuss using Git URLs as a gem source, how you can use BUNDLE_WITHOUT to manage gem groups, and how Cloud Foundry installs only the platform-specific gems that make sense.

Using Git URLs as a Gem Source

Most of the time, developers use the default “rubygems” source to fetch gems from the official RubyGems repository. Alternatively, Bundler supports Git source URLs in order to associate a gem name and version with a certain Git repository. In this latter scenario, Bundler will automatically clone the latest version of a gem and install it. As of today, CloudFoundry.com fully supports using these Git URLs.

How does it work?

In the same way that Bundler installs gems from a Git source URL, resolving Git branches and references, Cloud Foundry locates Git dependencies in the Gemfile.lock, fetches the source code, and checks out the specified revision. Next, Cloud Foundry will find all of the gemspecs, build the gems and inject them into the application exactly where Bundler expects to see them. When the application gets started via “bundle exec,” Bundler picks up all installed dependencies as usual.
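To make the first step concrete, here is a minimal Ruby sketch of how Git dependencies could be located in a Gemfile.lock. It is only an illustration of the idea, not the actual Cloud Foundry staging code: the `GitSource` struct and the `git_sources` helper are hypothetical names, and the parser assumes the standard two-space and four-space indentation that Bundler writes.

```ruby
# Hypothetical helper (not Cloud Foundry's staging code): collect the GIT
# sections of a Gemfile.lock so each repository and revision can be fetched.
GitSource = Struct.new(:remote, :revision, :gems)

def git_sources(lockfile_text)
  sources = []
  current = nil
  lockfile_text.each_line do |raw|
    line = raw.chomp
    if line =~ /\A\S/
      # An unindented line (GIT, GEM, PLATFORMS, ...) starts a new section.
      current = nil
      if line == "GIT"
        current = GitSource.new(nil, nil, [])
        sources << current
      end
    elsif current
      case line
      when /\A  remote: (.+)\z/     then current.remote = $1
      when /\A  revision: (\h+)\z/  then current.revision = $1
      when /\A    (\S+) \((.+)\)\z/ then current.gems << [$1, $2]
      end
    end
  end
  sources
end

lock = <<~LOCK
  GIT
    remote: git://github.com/padrino/padrino-framework.git
    revision: 17c748f8173185e57f9254829f53ee34327fa90d
    specs:
      padrino (0.10.1)

  GEM
    remote: http://rubygems.org/
    specs:
      rack (1.3.2)
LOCK

git_sources(lock).each do |source|
  puts "#{source.remote} @ #{source.revision}: #{source.gems.inspect}"
end
```

Only gems listed under a GIT section are returned; gems in the GEM section would still come from RubyGems as usual.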

To optimize the staging process, CloudFoundry.com also caches fetched Git sources. For example, if you reference the Rails Git URL, CloudFoundry.com clones the repository and caches it in the local filesystem, so the next request for Rails will use the cache. If a requested revision is missing from the cache, there is no need to clone from scratch because only the missing objects will get downloaded from the original repo.
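The caching decision described above can be sketched roughly as follows. This is an assumption about how such a cache might be organized, not a description of the real implementation: the `fetch_plan` helper, the SHA1-named cache directory, and the two boolean inputs are all hypothetical. The sketch only builds the git commands and does not run them.

```ruby
require "digest/sha1"

# Hypothetical sketch: decide which git command (if any) is needed for a
# repository, given what is already in the local cache.
def fetch_plan(url, cache_root, repo_cached:, revision_cached:)
  # Assumed layout: one mirror directory per repository URL.
  dir = File.join(cache_root, Digest::SHA1.hexdigest(url))
  if !repo_cached
    # Nothing cached yet: clone the whole repository once.
    [["git", "clone", "--mirror", url, dir]]
  elsif !revision_cached
    # Repository cached but revision missing: download only the new objects.
    [["git", "--git-dir=#{dir}", "fetch", "origin"]]
  else
    # Revision already present: reuse the cache without touching the network.
    []
  end
end

url = "git://github.com/padrino/padrino-framework.git"
plan = fetch_plan(url, "/tmp/git-cache", repo_cached: true, revision_cached: false)
plan.each { |command| puts command.join(" ") }
```

In a real staging process the two flags would come from checking the cache directory and asking git whether the locked revision exists there.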

Example

Let’s take a look at the padrino shortener-demo application. This demo uses the latest version of Padrino (a Sinatra-based web framework). As we can see in the application’s Gemfile, the padrino gem is fetched from GitHub:

...
gem 'padrino', :git => 'git://github.com/padrino/padrino-framework.git'
...

The Git revision of the padrino gem is locked in Gemfile.lock:

...
GIT
  remote: git://github.com/padrino/padrino-framework.git
  revision: 17c748f8173185e57f9254829f53ee34327fa90d
  specs:
    padrino (0.10.1)
...

As we push this application, we provide the MongoDB service.

$ vmc push shortener
Would you like to deploy from the current directory? [Yn]:
Detected a Rack Application, is this correct? [Yn]:
Application Deployed URL [shortener.cloudfoundry.com]:
Memory reservation (128M, 256M, 512M, 1G, 2G) [128M]:
How many instances? [1]:
Bind existing services to 'shortener'? [yN]:
Create services to bind to 'shortener'? [yN]: y
1: mongodb
2: mysql
3: postgresql
4: rabbitmq
5: redis
What kind of service?: 1
Specify the name of the service [mongodb-9e641]: mongodb-shortener
Create another? [yN]:
Would you like to save this configuration? [yN]:
Creating Application: OK
Creating Service [mongodb-shortener]: OK
Binding Service [mongodb-shortener]: OK
Uploading Application:
  Checking for available resources: OK
  Processing resources: OK
  Packing application: OK
  Uploading (38K): OK
Push Status: OK
Staging Application 'shortener': OK
Starting Application 'shortener': OK

Now, if we check application logs, we can see that padrino was provided to the application among other gems:

$ vmc logs shortener
====> /logs/staging.log 3.2.3 <====
….
Need to fetch mongo-1.3.1.gem from RubyGems
Adding mongo-1.3.1.gem to app...
Need to fetch bson-1.3.1.gem from RubyGems
Adding bson-1.3.1.gem to app...
Need to fetch tzinfo-0.3.29.gem from RubyGems
Adding tzinfo-0.3.29.gem to app...
Need to fetch padrino-0.10.1.gem from Git source
Adding padrino-0.10.1.gem to app...
Need to fetch http_router-0.10.2.gem from RubyGems
Adding http_router-0.10.2.gem to app...
Need to fetch rack-1.3.2.gem from RubyGems
Adding rack-1.3.2.gem to app...
...

And now we can generate a shortened URL and track its visitors.


There may be situations where the officially published gem versions are not enough, for example when you need the current HEAD of a project, or a specific branch, tag or fork. Cloud Foundry supports Git URLs in the Gemfile, so it’s easy to point to a Git repo with the exact version of the library you need, and it will be downloaded and packaged as a part of your Cloud Foundry app.
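As an illustration, Bundler’s :branch, :tag and :ref options select what gets checked out from a Git source. The branch and tag names below are placeholders rather than names taken from the Padrino repository, and a real Gemfile would contain only one of these declarations for a given gem.

```ruby
# Gemfile (illustrative; keep only one declaration per gem)

# Current HEAD of the repository's default branch:
gem 'padrino', :git => 'git://github.com/padrino/padrino-framework.git'

# A specific branch (placeholder name):
# gem 'padrino', :git => 'git://github.com/padrino/padrino-framework.git',
#                :branch => 'some-branch'

# A specific tag (placeholder name):
# gem 'padrino', :git => 'git://github.com/padrino/padrino-framework.git',
#                :tag => 'some-tag'

# An exact commit, such as the revision locked in the example above:
# gem 'padrino', :git => 'git://github.com/padrino/padrino-framework.git',
#                :ref => '17c748f8173185e57f9254829f53ee34327fa90d'
```

To use a fork, point :git at the fork’s URL instead.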

Using BUNDLE_WITHOUT to Manage Gem Groups

Gemfiles support the grouping of gems so that a test server, for example, can get a different group of gems than a production one. The second feature we are announcing is that Cloud Foundry allows developers to take advantage of these groups by using the BUNDLE_WITHOUT environment variable, just as you would locally. Setting BUNDLE_WITHOUT causes Cloud Foundry to skip installation of gems in excluded groups.

Example

BUNDLE_WITHOUT is particularly useful for Rails applications, where there are typically “assets” and “development” gem groups containing gems that are not needed when the app runs in production.

Let’s take a look at an example. Spacely is a Rails 3.2 application that provides image upload via drag and drop. The Gemfile contains several gems in a group called “assets.”

...
group :assets do
  gem 'sass-rails',   '~> 3.2.3'
  gem 'coffee-rails', '~> 3.2.1'

  # See https://github.com/sstephenson/execjs#readme for more supported runtimes
  # gem 'therubyracer', :platform => :ruby

  gem 'uglifier', '>= 1.0.3'
end
...

Let’s push the application to CloudFoundry.com without the gems in the “assets” group. We need to run “bundle install” first to generate a Gemfile.lock, which Cloud Foundry requires. Spacely includes a VMC manifest.yml file, so we can easily push the app without the full interaction. Notice that we push the app with the “--no-start” flag, so we can set the BUNDLE_WITHOUT environment variable before starting the application. We will make this step easier in the new version of VMC by enhancing the manifest support.

$ bundle install
$ vmc push --no-start
Would you like to deploy from the current directory? [Yn]: 
Pushing application 'spacely'...
Creating Application: OK
Creating Service [spaceltdb]: OK
Binding Service [spaceltdb]: OK
Uploading Application:
  Checking for available resources: OK
  Processing resources: OK
  Packing application: OK
  Uploading (6M): OK   
Push Status: OK

Now let’s set BUNDLE_WITHOUT and start the application:

$ vmc env-add spacely BUNDLE_WITHOUT=assets
Adding Environment Variable [BUNDLE_WITHOUT=assets]: OK
$ vmc start spacely
Staging Application 'spacely': OK
Starting Application 'spacely': OK

Now, if we check application logs, we can see that gems such as “sass-rails” are not installed.

$ vmc logs spacely
====> /logs/staging.log <====
….
Adding carrierwave-0.6.1.gem to app...
Adding activesupport-3.2.6.gem to app...
Adding i18n-0.6.0.gem to app...
Adding multi_json-1.3.6.gem to app...
Adding activemodel-3.2.6.gem to app...
Adding builder-3.0.0.gem to app...
Adding fog-1.5.0.gem to app...
...

Cloud Foundry also supports the exclusion of multiple groups, separated by colons. For example, if Spacely included a “test” group, we could have excluded gems in both the “assets” and “test” groups with “vmc env-add spacely BUNDLE_WITHOUT=assets:test”.
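The effect of BUNDLE_WITHOUT can be sketched in a few lines of Ruby. This is a simplified model rather than Bundler’s or Cloud Foundry’s actual code: the helper names are hypothetical, the gem-to-group table is written by hand, and the “rspec” entry in a “test” group is invented for the example. It follows Bundler’s rule that a gem is skipped only when every group it belongs to is excluded.

```ruby
# Hypothetical sketch of group exclusion driven by BUNDLE_WITHOUT.
def excluded_groups(bundle_without)
  # "assets:test" becomes [:assets, :test]; nil or "" excludes nothing.
  bundle_without.to_s.split(":").map(&:strip).reject(&:empty?).map(&:to_sym)
end

def gems_to_install(gem_groups, bundle_without)
  excluded = excluded_groups(bundle_without)
  # Keep a gem if at least one of its groups is not excluded.
  gem_groups.reject { |_name, groups| (groups - excluded).empty? }.keys
end

# Hand-written table for illustration; gems outside any group block
# belong to Bundler's :default group.
gems = {
  "carrierwave" => [:default],
  "sass-rails"  => [:assets],
  "uglifier"    => [:assets],
  "rspec"       => [:test]
}

p gems_to_install(gems, "assets")
p gems_to_install(gems, "assets:test")
```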

Excluding Gems by Platform

Bundler allows developers to use the “platforms” method in Gemfiles to specify that a gem be used only on particular platforms. This is the final piece of polish for gems that we are adding today. Cloud Foundry will skip the installation of gems intended for other platforms, as it should.

The following Gemfile specifies that the rb-inotify gem should be used in non-Windows environments, while three other gems are for Windows only. When this app is deployed to CloudFoundry.com, only the rb-inotify gem will be installed.

# Unix Rubies (OSX, Linux)
platform :ruby do
  gem 'rb-inotify'
end

# Windows Rubies (RubyInstaller)
platforms :mswin, :mingw do
  gem 'eventmachine-win32'
  gem 'win32-changenotify'
  gem 'win32-event'
end

The “platforms” designation can also be used to selectively install gems based on Ruby versions. For example, certain gems can be excluded when switching between Ruby 1.8 and Ruby 1.9.
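For instance, a Gemfile along these lines uses Bundler’s :ruby_18 and :ruby_19 platform names to pick a different debugger gem for each Ruby version. The choice of gems is our own illustration and does not come from a particular application.

```ruby
# Gemfile (illustrative)

# Installed only when running on Ruby 1.8
platforms :ruby_18 do
  gem 'ruby-debug'
end

# Installed only when running on Ruby 1.9
platforms :ruby_19 do
  gem 'ruby-debug19'
end
```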

Support for Windows Gemfiles

When a Gemfile.lock is generated on a Windows machine, it often contains gems with Windows-specific versions. This results in versions of gems such as mysql2, thin, and pg containing “-x86-mingw32.”

For example, consider a Gemfile that contains the following:

gem 'sinatra'
gem 'mysql2'
gem 'json'

Running “bundle install” with the above Gemfile on a Windows machine results in the following Gemfile.lock:

GEM
  remote: http://rubygems.org/
  specs:
    json (1.7.3)
    mysql2 (0.3.11-x86-mingw32)
    rack (1.4.1)
    rack-protection (1.2.0)
      rack
    sinatra (1.3.2)
      rack (~> 1.3, >= 1.3.6)
      rack-protection (~> 1.2)
      tilt (~> 1.3, >= 1.3.3)
    tilt (1.3.3)

PLATFORMS
  x86-mingw32

DEPENDENCIES
  json
  mysql2
  sinatra

Notice the “-x86-mingw32” in mysql2's version number. Previously this would cause a failure on deployment, as Cloud Foundry would attempt to install the Windows-specific gem. However, now Cloud Foundry will install the Ubuntu-friendly version of these gems without requiring any modification to Gemfile.lock. Developers can seamlessly migrate their app between local Windows machines and CloudFoundry.com.
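One way to picture this is as a normalization of the locked version strings. The sketch below shows that idea only; it does not describe how Cloud Foundry actually resolves the gems, and the `portable_spec_line` helper is a hypothetical name. It drops a platform suffix such as “-x86-mingw32” from a locked spec line and leaves dependency constraints untouched.

```ruby
# Hypothetical sketch: strip a platform suffix from a Gemfile.lock spec line,
# so "mysql2 (0.3.11-x86-mingw32)" is treated as "mysql2 (0.3.11)".
def portable_spec_line(line)
  # Match "(<version>-<platform>)" where the version starts with a digit;
  # constraint lines such as "rack (~> 1.3, >= 1.3.6)" do not match.
  line.sub(/\((\d[^\s\-)]*)-[^)]+\)/, '(\1)')
end

[
  "    json (1.7.3)",
  "    mysql2 (0.3.11-x86-mingw32)",
  "      rack (~> 1.3, >= 1.3.6)"
].each { |line| puts portable_spec_line(line) }
```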

Conclusion

The ability to use Git URLs as a dependency source, to exclude gem groups with BUNDLE_WITHOUT, and to install only the gems that match the target platform provides greater flexibility for Ruby developers. The work we have done to enhance gem and Bundler support is part of our commitment to providing a Platform as a Service that meets the real needs of Rubyists building and deploying applications in the cloud. Follow us on Twitter at @cloudfoundry and let us know how these new features are working for you!

- Jennifer Hickey and Maria Shaldibina
The Cloud Foundry Team
