Building an Internal Gem Infrastructure: A Crash Course

I gave this short talk as part of RailsConf 2014. LivingSocial graciously sponsored my trip. We’re hiring awesome engineers, if you didn’t already know. Hit me up on Twitter @ubermajestix if you’re interested in working with a great team.

Being able to build, release, and host your own gems is really helpful. This post introduces the basics of all three. It is more of a motivation post, as there are tutorials out there with more depth and step-by-step instruction than I will provide here.

Why

Moving code into gems really helps when you’re splitting up monolithic apps or building a service-oriented architecture. Sharing code through gems reduces copy/paste, which in turn reduces the mayhem and maintenance costs that copy/paste imposes.

Building a gem infrastructure can help foster an “internal open source” culture where anyone can contribute. Gems get their own repo and readme. You get to reap all the benefits of pull requests and open contribution within your company.

We’ve heard experts say “go ahead, make a mess” but it’s often hard to figure out where to put your mess in a Rails project. A gem is a great place to make a mess of small classes with single responsibilities. Putting your code into a gem forces you to draw boundaries around your code. You get your own tests! You can separate concerns.

There are a bunch of benefits to isolating your code. First, your code is really easy to delete when it’s no longer needed. You don’t even have to delete your gem; you can just remove the calls to it from the other gems or apps that use it, and its documentation and tests live on. It might even be resurrected in the future. This is so much easier than searching through old commits for “that code that did that one thing”.

Also, you have to think about how your code will interact with Rails. Does it need database access? Where should logs go? Do you even need Rails for this code to do its job? This has helped me constrain my code to its simplest form, use the smallest possible pieces of Rails, mock and stub boundaries in tests, and write good docs.

An Example

I recently was tasked with automating the generation of a daily report which needed to be uploaded to a third party’s FTP. The processing results are then emailed back to us as a plain text file. Sometimes I need to parse the email if anything went wrong. As a coworker said, 1997 called, it wants its architecture back. But those are the constraints we have to work in.

I could have plunked down the CSV and FTP code right in the database model, maybe thrown the email parsing code in a helper, and called it a day. We all know that’s gross. We’ve all done it and justified it: “Gets the job done, right? Fat models bro!”

But you’ve also been on the other side a week later, a month later, or even years later, “Ah man, why is all this code here?” and “Do we even use this anymore?”, “What does this even do?”

So I decided to cut a gem and put all this domain-specific code in one place. It’s a pretty simple gem, and that’s ok. It’s easy to describe and document, and it also let us have the conversation about where the code should be run from, instead of just throwing it in the monolith and moving on.

Building A Gem

Ok, this is a crash course inside the crash course on building your own gem. This may be review for some of you or it may be new to you. If you haven’t written a gem, I highly recommend trying, as it’s a great addition to your Ruby tool belt.

There are a bunch of ways to build a gem, but the easiest one I’ve used is Bundler’s gem command. The best part is it’s probably already installed on your machine. Just run:

bundle gem ls-super_awesomeness

Here’s what it creates for you:

  • The most important file is the gemspec. Without that file this isn’t a gem. You can read up on the details online, but the version Bundler generates is pretty straightforward.
  • A Gemfile
  • A License (MIT)
  • A README
  • A Rakefile which will help you build, install, and release
  • Your code goes in lib
  • If you’re namespacing, you’ll have a namespace directory with your files in it
  • Finally your version file deep down in there.
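
To give a feel for what the gemspec holds, here is a minimal sketch in roughly the shape Bundler generates. The author, email, summary, and server host are placeholders, and the version is hard-coded only so the snippet runs on its own (the generated file reads it from your version file). The `allowed_push_host` metadata key is a RubyGems feature that makes `gem push` refuse any other host, which is a cheap extra guard for internal gems.

```ruby
# ls-super_awesomeness.gemspec (sketch; all values below are placeholders)
spec = Gem::Specification.new do |s|
  s.name          = "ls-super_awesomeness"
  # The generated gemspec reads the VERSION constant from
  # lib/ls/super_awesomeness/version.rb; hard-coded so this runs standalone.
  s.version       = "0.1.0"
  s.authors       = ["Your Name"]
  s.email         = ["you@example.com"]
  s.summary       = "Shared helpers for super awesomeness"
  s.license       = "MIT"

  # Ship everything under lib/ and make it loadable with `require`.
  s.files         = Dir["lib/**/*.rb"]
  s.require_paths = ["lib"]

  # RubyGems refuses to `gem push` this gem to any host other than this one.
  # The URL is a hypothetical internal server.
  s.metadata["allowed_push_host"] = "https://gems.example.internal"

  s.add_development_dependency "rake"
end

puts "#{spec.name} #{spec.version}"
```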

Namespacing is something I think is really important, and Bundler will help you do it. Here’s how it works: you shorten your organization’s name to a couple of letters and use that as a prefix. It can be more or fewer, but I’ve always seen two characters in the wild. It’s important because it tells your reader that this is internal code and not an open source implementation.
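
As a sketch of what that looks like in practice: the dash in `ls-super_awesomeness` is what tells Bundler to nest the gem under a namespace, so the generated code lives under `lib/ls/` inside an `Ls::SuperAwesomeness` module. The `shout` method and the version number below are made-up examples, not something Bundler generates.

```ruby
# lib/ls/super_awesomeness/version.rb
module Ls
  module SuperAwesomeness
    VERSION = "0.0.1"
  end
end

# lib/ls/super_awesomeness.rb
# (the real file starts with: require "ls/super_awesomeness/version")
module Ls
  module SuperAwesomeness
    # Hypothetical example method; your gem's code goes here.
    def self.shout(text)
      "#{text.upcase}!"
    end
  end
end

puts Ls::SuperAwesomeness.shout("internal gems")
puts Ls::SuperAwesomeness::VERSION
```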

Ok, so now a little bit about versioning. Again, this might be a refresher, but I feel like I explain this more than I should have to. We liberally use the semantic versioning spec to control version numbers at LivingSocial. Check out semver.org for the full spec.

The bugfix number is the number furthest to the right. It should be incremented when bugs are fixed or really small changes happen that don’t affect your public API. The minor version number is the middle number and should be incremented when you’ve added a new feature or additional functionality but you haven’t removed or broken anything in the public API. Consumers of your code will not be negatively affected by upgrading. The major version number is the number furthest to the left and should be incremented when a breaking change is made. This usually means you’ve removed some functionality, extracted it to another gem, or changed the public API. You should have a very good reason for breaking the public API, as all consumers are affected and will have to upgrade their code. Think about how hard it is to upgrade from Rails 2 to 3 and from 3 to 4.
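
To make those rules concrete, here is a small sketch. `next_version` is a hypothetical helper, not part of RubyGems; `Gem::Version` and `Gem::Requirement` are the real RubyGems classes. The last lines show why the scheme matters to consumers: a pessimistic constraint like `~> 1.4` accepts new minor and bugfix releases but refuses the next major version.

```ruby
require "rubygems"

# Hypothetical helper: returns the next version string for a kind of change.
def next_version(current, change)
  # Pad the segments so that "1.4" is treated as 1.4.0.
  major, minor, patch = (Gem::Version.new(current).segments + [0, 0]).first(3)
  case change
  when :major  then "#{major + 1}.0.0"                # breaking change to the public API
  when :minor  then "#{major}.#{minor + 1}.0"         # new, backwards-compatible feature
  when :bugfix then "#{major}.#{minor}.#{patch + 1}"  # fix with no API change
  else raise ArgumentError, "unknown change type: #{change.inspect}"
  end
end

puts next_version("1.4.2", :bugfix) # 1.4.3
puts next_version("1.4.2", :minor)  # 1.5.0
puts next_version("1.4.2", :major)  # 2.0.0

# What a consumer who depends on "~> 1.4" will and will not pick up.
constraint = Gem::Requirement.new("~> 1.4")
puts constraint.satisfied_by?(Gem::Version.new("1.5.0")) # true
puts constraint.satisfied_by?(Gem::Version.new("2.0.0")) # false
```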

If a gem is used in production, is stable and reliable, then it should be at 1.0.0. I’m wary of deploying gems that aren’t at 1.0.0. If you’re in production, the API is stable, and your version is 0.0.1237, you’re doing it wrong.

Letting The World Behold Your Awesomeness

Ok so you’ve built your gem and got your version number figured out. Now, we need to let the world behold your awesomeness, or at least others in your company.

It should be EASY for anyone to release their code. Having a bunch of commands to release code is messy and error prone, even if it’s just three commands. If you use Bundler to create your gem, `rake release` will do all this for you, but we’ll need to modify its behavior so you…

But be careful! You can accidentally open source your code. It happens! You might have sensitive information in your gem and there are bots that mirror gems as they hit rubygems.org, so once that code is out there, it’s not coming back.

It should also be easy to add a version tag in git. This helps people figure out which commits went into a release, and GitHub makes version tags really easy to use. Bundler’s `rake release` task will use the version number from the VERSION constant in your version.rb file, so make sure you update your version number before you release.

So how do we get `rake release` to not open source your code?

We monkeypatch, er, subclass Bundler’s gem_tasks. If you look in the Rakefile Bundler generated for you, you’ll see `require 'bundler/gem_tasks'`. This is how you get the build, install, and release rake tasks. At LivingSocial we built our own gem that all other gems use to get their default rake tasks. If you subclass Bundler’s tasks, you can add your own rake tasks and, most importantly, keep developers from pushing their code to the wrong place while helping them push it to the right place.

Here’s the magic code. Notice how we disable the rubygems push and raise an error just in case. Then, we add a method that pushes to our geminabox server. From here we just replace bundler/gem_tasks with our customized gem_tasks in each gem’s Rakefile. Boom, releasing and tagging made easy.
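
A minimal sketch of the idea follows. It is a reconstruction, not LivingSocial’s actual code: the `Ls::GemTasks` name, the `geminabox_push` and `inabox_command` methods, and the server URL are all made up for illustration. It also takes a slightly different route than the one described above: instead of raising inside `rubygem_push`, it overrides that method so the upload step of `rake release` goes to the internal server. `rubygem_push` is a protected, internal method of `Bundler::GemHelper`, so check it against the Bundler version you run. `gem inabox` with `--host` is the upload command the geminabox gem adds.

```ruby
require "bundler/gem_helper"

module Ls
  # Shared release tasks for internal gems (hypothetical class name).
  class GemTasks < Bundler::GemHelper
    # Placeholder address for an internal geminabox server.
    GEM_SERVER = "https://gems.example.internal"

    protected

    # Bundler's release flow calls rubygem_push(path) to upload the built gem.
    # Overriding it means `rake release` never runs `gem push` against
    # rubygems.org; the upload is redirected to the internal server instead.
    def rubygem_push(path)
      geminabox_push(path)
    end

    # Upload the built .gem file and fail loudly if the command fails.
    def geminabox_push(path)
      system(*inabox_command(path)) or raise "gem inabox failed for #{path}"
    end

    # The command line used for the upload, kept separate so it is easy to inspect.
    def inabox_command(path)
      ["gem", "inabox", path.to_s, "--host", GEM_SERVER]
    end
  end
end
```

In each gem’s Rakefile you would then swap `require 'bundler/gem_tasks'` for something like `require 'ls/gem_tasks'` followed by `Ls::GemTasks.install_tasks` (the file name is again an assumption).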

Hosting Your Gems

Now you’ll need a place to store all these gems. Here are the options that I would recommend you use.

We use geminabox at LivingSocial. It gets the job done. It hosts our gems and provides a web interface to manage and view gems. It supports authentication through custom middleware (their wiki page has examples). It can be set up to do pull-through mirroring of rubygems.org. Mirroring is useful when you don’t want to depend on rubygems.org when you deploy and you don’t want to vendor all your gems. It also provides a CLI by adding an inabox command to your system gem command. You can use that command to push gems and configure your machine to talk to your geminabox instance.
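
For reference, a geminabox server is a small Rack app. The sketch below follows the shape of the geminabox README; the data directory, the environment variable names, and the choice of HTTP basic auth are assumptions to adapt, not a description of our setup.

```ruby
# config.ru (sketch)
require "rubygems"
require "geminabox"

# Where uploaded gems are stored on disk (placeholder path).
Geminabox.data = "/var/geminabox-data"

# Pull-through mirroring: gems the server does not have are fetched
# from rubygems.org on demand.
Geminabox.rubygems_proxy = true

# One simple way to add authentication with Rack middleware.
# GEMINABOX_USER and GEMINABOX_PASSWORD are hypothetical variable names.
use Rack::Auth::Basic, "Internal gems" do |user, password|
  Rack::Utils.secure_compare(user, ENV.fetch("GEMINABOX_USER")) &&
    Rack::Utils.secure_compare(password, ENV.fetch("GEMINABOX_PASSWORD"))
end

run Geminabox::Server
```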

I’ve also used stickler at a previous company. It supports hosting and has a nice web interface. It supports auth in a similar way to geminabox. It will mirror gems from rubygems.org, but you have to tell it which gems to mirror; it does support mirroring everything in your Gemfile.lock. It has a really nice CLI, and the author wanted me to mention that it will soon support the new Bundler API. There’s a great blog post about setting it up over at copiousfreetime’s site.

Both stickler and geminabox are pretty easy to deploy and have lots of documentation online to help you out. In my experience stickler is faster and more reliable than geminabox, but your mileage may vary.
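
Once a server is up, apps pull internal gems through their Gemfile. A sketch, assuming a hypothetical server at gems.example.internal and the example gem from earlier:

```ruby
# Gemfile (sketch)
source "https://rubygems.org"

gem "rails"

# Gems declared inside this block are fetched from the internal server.
source "https://gems.example.internal" do
  gem "ls-super_awesomeness", "~> 1.0"
end
```

If your internal server also mirrors rubygems.org, the top-level source can point at it instead.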

If you want to go the hosted route, Gemfury.com appears to be the only solution out there. It supports auth, but doesn’t appear to support mirroring. It comes with a CLI and also supports Node.js and Python packages in addition to Ruby gems.

There you have it, a crash course with almost everything you need to set up a gem infrastructure for your company.