Rails and the Amazonian Beanstalk

Yo, Christian here. One of the ways I try to keep in touch with the development community is, of course, by developing software. For those playing along at home that won’t come as much of a surprise: you can see a few previous posts tagged with ruby, and especially our interest in developing software that may help secure either your apps or your processes (watch this space!).

Anyway, in addition to my interest in development, I’m also interested in operating these applications, especially leveraging the power of ‘the cloud’. Some people refer to these principles as DevOps (Development Operations), or some such. Heroku is one of the more popular Platform-as-a-Service operators, and their model is pretty slick: sign up, git commit your code, then just git push it and away you go. During my experiments with them I was also interested in leveraging Amazon’s CDN platform, CloudFront, which, with the help of the asset_sync gem, was relatively simple. At a high level the steps are:

  1. Code up your app
  2. Git commit your code
  3. Git push your code to Heroku
  4. During deployment your static assets are compiled, compressed and mashed together
  5. asset_sync then pushes these up to your nominated Amazon S3 bucket
  6. The bucket is in turn published through the CDN (CloudFront)

Somewhere along the line, though, this stopped playing friendly, and after a few rounds of frustration I decided to jump ship. Heroku, whilst offering some great benefits and simplicity to the whole continuous delivery process, also hides a lot of the gritty details from you. Heroku, of course, leverages Amazon’s EC2. So why not go straight to the source?

Amazon’s approach to Platform-as-a-Service, Elastic Beanstalk (EB), was always a little bit daunting, and when I first heard about it, and its lack of support for Ruby (and Rails), I wasn’t all that interested. Well, those days are over: their model now supports Ruby 1.9.3, and of course Rails on top of that. Simply put, EB wraps up a fairly automatic approach to managing applications on top of their other services, namely:

  1. EC2 – Elastic Compute Cloud – scalable web app servers for the controllers and view handling
  2. RDS – Relational Database Service – for the backend model handling
  3. S3 – Simple Storage Service – for handling code distribution and log file management (so you don’t have to interact directly with your EC2 or RDS instances)
  4. SNS – Simple Notification Service – for handling email alerting, health checks etc
  5. CloudWatch – for monitoring the health of your app, and automatically scaling those EC2 instances
  6. ELB – Elastic Load Balancing – to present a single DNS entry, which hides those EC2 nodes behind it

The (I believe) official AWS blog on Elastic Beanstalk offers a lot of interesting insight into how to run up these environments, but I thought I should quickly dump out a few things that were causing me issues.

Firstly, the command line tools for starting, stopping and initialising your EB workload have a few problems. I don’t believe they’re hosted on GitHub, so I can’t send them a pull request. One problem that was causing me a bit of grief was the functionality that automatically adds the application-local ‘.elasticbeanstalk’ folder to your ‘.gitignore’ file. This happens on ‘eb init’, ‘eb start’ and, in fact, many of the eb functions. Firstly, the wrong entry was being added (at least on my OS X with Python 2.7 setup), and secondly, it wasn’t being written correctly, so it got appended again every single time I ran any of these commands. My fix was simple.

In “AWS-ElasticBeanstalk-CLI-2.2/eb/macosx/python2.7/scli/constants.py”, change lines 466–467 from:


    Name = Path + u'/'
    NameRe = Path + u'/'

To:


    Name = u'/' + Path
    NameRe = u'/' + Path

This ensures that the correct entry is searched for in the .gitignore file, and added as well.

Then, in “AWS-ElasticBeanstalk-CLI-2.2/eb/macosx/python2.7/scli/config_file.py”, change line 152 from:


    f.write(u'{0}'.format(name))

To:


    f.write(u'\n{0}'.format(name))

This ensures that the entries are added on new lines at the bottom of the .gitignore file.
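The two patches above boil down to one idea: look for the exact entry before writing it, and always write it on its own line. Here is a minimal standalone sketch of that idea, not the CLI’s actual code. It is written for Python 3 rather than the Python 2.7 the CLI shipped with, and the function name `ensure_gitignore_entry` and the entry string `/.elasticbeanstalk/` used in the usage note are my own assumptions for illustration.

```python
import os


def ensure_gitignore_entry(gitignore_path, entry):
    """Append `entry` to a .gitignore file unless it is already listed.

    Hypothetical helper (not part of the EB CLI). Returns True if the
    file was modified, False if the entry was already present.
    """
    content = ""
    if os.path.exists(gitignore_path):
        with open(gitignore_path, "r", encoding="utf-8") as f:
            content = f.read()

    # Compare whole lines so a second run finds the existing entry
    # instead of appending a duplicate.
    existing = {line.strip() for line in content.splitlines()}
    if entry in existing:
        return False

    # Only add a leading newline when the file has content that does not
    # already end with one, so the entry never gets glued onto the
    # previous line.
    separator = "" if content == "" or content.endswith("\n") else "\n"
    with open(gitignore_path, "a", encoding="utf-8") as f:
        f.write(separator + entry + "\n")
    return True
```

Calling `ensure_gitignore_entry(".gitignore", "/.elasticbeanstalk/")` twice in a row should only change the file the first time. The leading slash anchors the pattern to the directory that holds the .gitignore file, which is why the order of the slash and the path matters in the first patch.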

The other thing I found helpful during testing was jumping directly onto the EC2 nodes and tailing various log files. By default, your EB-deployed EC2 instances don’t have SSH public keys set, nor does the security group permit SSH access to them. Setting up your SSH keys is simple: make sure you’ve already created a keypair for the region where your app is, then either re-run ‘eb init’ and specify the keypair, or edit your ‘.elasticbeanstalk/optionsettings’ file and update the ‘EC2KeyName=’ setting to the name of your keypair. After that’s done and you’ve restarted your environment (‘eb stop; eb start’), you will then have to jump into your EC2 console and modify the security group to permit SSH.
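If you would rather script the security group change than click through the console, something like the following sketch could do it. I made the change in the console myself, so treat this as an untested alternative: it assumes the boto3 library is installed and your AWS credentials are configured, and the function names, the group ID and the CIDR range in the usage note are all hypothetical placeholders.

```python
def ssh_ingress_rule(cidr):
    """Build the rule that allows inbound SSH (TCP port 22) from `cidr`."""
    return {
        "IpProtocol": "tcp",
        "FromPort": 22,
        "ToPort": 22,
        "IpRanges": [{"CidrIp": cidr, "Description": "SSH for log tailing"}],
    }


def allow_ssh(group_id, cidr, region):
    """Add the SSH rule to an existing security group.

    Assumes boto3 is installed and credentials are available; the import
    is done here so the helper above can be used without boto3.
    """
    import boto3

    ec2 = boto3.client("ec2", region_name=region)
    ec2.authorize_security_group_ingress(
        GroupId=group_id,
        IpPermissions=[ssh_ingress_rule(cidr)],
    )
```

A call would look like `allow_ssh("sg-0123456789abcdef0", "203.0.113.7/32", "us-east-1")`, with the group ID swapped for the one EB created for your environment. Restricting the rule to your own IP address rather than 0.0.0.0/0 keeps the nodes from accepting SSH connections from the whole internet.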

Voila! You can now SSH directly onto your EC2 nodes and tail some interesting log files. I found the following of particular interest:

  1. /var/log/eb-tools.log – shows you activity when you push new code, asset compilation, bundler runs etc
  2. /var/app/support/logs/production.log – this is the actual Rails log – see web requests etc

All in all, I’m really enjoying what I’m seeing. Sure, it may cost a bit more than running a single Dyno on Heroku, but it certainly gives you a lot more control and visibility into exactly what’s happening. Plus, it’s simple enough to jump straight into the deep end and see exactly what your servers are doing.