Tuesday, June 22, 2010

The opposite of DevOps

I thought about whether or not to write this post but I think it's an interesting example of the kinds of problems that the DevOps methodology is trying to solve.

I turned in my two weeks' notice with the paper on Friday. There were several reasons but none of them are a negative reflection on my employer. The role I was originally hired for was no longer valid and there wasn't a transition path because of platform changes in the back office. I could have stayed (and was asked to stay) but there wasn't anything long term where my skill set was useful. As with the departure of any key team member, meetings were called to discuss outstanding issues, transition responsibilities and the like.

In preparation for the meeting today, I drew up a list of my responsibilities at a macro level and then broke that down by task. As it's always been for me, that list spanned several "silos" in the traditional IT organization model. Here's a sample snippet:

Subversion: User Management, Repository Management
MySQL: User Management, Database Management, Performance Management
Linux: User Management, OS Configuration, Application Management, Code Management
Puppet: Configuration Management, Recipe Management
Ruby: VM Management, Gem Management

There were also entries for various internal applications that I've worked on (including some development) and supported from an operational perspective. Mind you, I was embedded as an operations guy with a specific development group in the organization. Sort of a DevOps-lite role.

What was really interesting about the meeting was how the lines were broken down. Literally lines were drawn as to where operations would stop supporting something and the group I was leaving would take over. Take the Linux example:

Application Management was intended to refer to software that was "standard" as a part of the distro. Things like Apache or MySQL. Code Management refers to our internally developed code (all Rails applications except for the sexy Sinatra webservice I wrote). In the end, the responsibilities were divided like so:

- Operations Team supported up to the installation and configuration of Apache (including vhosts) with the exclusion of Passenger configuration.
- Passenger configuration and internal code would now be managed by the Development group. They would handle deployments themselves via Webistrano.
- MySQL? Passed on to those currently managing the MSSQL database servers.

That's the very definition of a silo'd IT infrastructure. Everything is thrown over the wall. At least the deploys remain with the development team. There is nothing "wrong" with this model of IT governance. It's not agile but many companies use it. Contrast that to the position I'll be starting on the 6th of July.

An explicit group is being formed inside the company called "DevOps". I know that at this point everyone is incredulous. I can hear it now: "You're doing it wrong!". Interestingly enough, I asked the same question during the interview process. We all know that DevOps is a model and philosophy and not a title or department. You don't have two development teams - Agile and Waterfall. You have Developers and they use one methodology over the other. The same goes for Operations. The people I was interviewing with were cognizant of this fact as well. The reason the group is being called "DevOps" is strictly for organizational and political reasons. The goal of the team is to actually develop a set of operational and developmental processes, tools and guidelines that embody everything that DevOps represents.

This is being done inside of one division of the company for now with the Director of Development and Director of Operations essentially sharing a brain and heading things up. The work that this group does will establish and codify something that will be used throughout the rest of the company. We'll be doing development of tools, architecting systems and recommending/implementing solutions that will, among other things, define how the organization operates itself from an IT perspective. Additionally we'll be supporting this as any traditional operations team would but, for now, the group has a distinct title. My title is "Systems Architect" which is nice and generic enough to apply to both traditional groups ;)

The only remotely distressing part of the whole thing is that I'll probably have to learn some Python. (tongue firmly in cheek). There's a point to be made that distribution vendors have settled on Python as the Lingua Franca of OS management - excluding things like Chef, Puppet and Nagios which have their own respective DSLs that abstract away much of the language they were written in. Yes, my precious Ruby will still be there when it comes to extending, say, Puppet. I'll probably still write quite a few service checks for Nagios in shell (should Nagios be the best fit).

But language wars aside, I think this gives a clear picture of exactly the types of problems the DevOps methodology is trying to solve. Agility and flexibility across all branches of IT produce a leaner, stronger and faster organization that can spend less time "in the muck" (as John Willis likes to say) and start making the company more money.

Monday, June 21, 2010

Status of Riak support in Padrino

Back story
(skip to the bottom if you want the nitty gritty status)
So I got this wild hair up my butt a week or so ago. I wanted to get into the Padrino internals and actually contribute to an open source project. I've been using open source software for years. I've made a good living from it. I owe a lot to it.

My problem has always been the fact that I'm not a programmer by education nor by trade. I've always been a systems/architecture guy. Of course, a systems guy is a programmer at some level thanks to shell scripting, administrative scripts and the like. The DevOps philosophy makes that even more tangible by treating your systems in the same way a developer would treat his code. The most I've ever done previously in terms of contribution has been a bug fix here and there or documentation. All very valuable but in some capacity, not as rewarding.

So I've started mucking about with Sinatra. I used it to build some web service gateways for our production environment. That led me to Padrino. I noticed that Padrino didn't have ORM support for a few of the other schemaless/NoSQL databases so I figured it was a good way to contribute. I picked Riak out of the blue because the first episode of the ChangeLog Show I listened to had the Riak guys on there. Little did I know ;)

Riak/Ripple Status
Once I got a handle on the Ruby driver that Sean Cribbs is writing for Riak, I dove right in and forked the Padrino code base. Github makes it EXCEEDINGLY easy to be a contributor and I really think they've ushered in a new wave of open source development.

The Riak ruby driver comes in two flavors - riak-client and ripple. Riak-client is a more "basic" wrapper around riak operations - CRUD, link-walking, map/reduce. Ripple is the "next-gen" (imho) driver that borrows design from ActiveRecord, MongoMapper and DataMapper.

After poking a bit with riak-client, I decided that the ripple driver was much more "in-line" with the other ORMs supported in Padrino. I started to go whole hog before I realized that ripple had some missing functionality I was expecting. This was not anyone's fault but my own. Essentially this prevented me from using the ripple ORM in Padrino's admin interface. Not a big deal but it would have been nice to have.

Sean Cribbs was VERY responsive over twitter and let me know that the features I was looking for would be added. He opened an issue and last night, support for update_attribute/update_attributes was added. I grabbed the latest build and went to town on my local padrino-framework fork.

This is where I ran into another issue. Essentially, the other NoSQL ORMs that padrino supports use some tricks for handling unique keys. I know this is trying to shoehorn RDBMS ideas on top of schema-less databases.

So how do the other ORMs handle it?

  • mongomapper/mongoid - supports 'validates_uniqueness_of' on model definitions
  • couchrest - supports unique records via a map/reduce job
Nothing similar exists in the ripple driver "yet". I say yet because I fired off an email to Sean and got a very well-thought out response.

As a side note, Basho, you hired a good man to bear the title of "Developer Advocate".

Essentially Sean brought up a good point. Since you don't have "transactions or global consistency", there's no good way to guarantee that a key is unique.

I've essentially got two options if I want to continue down the path of supporting Padrino's admin with Riak - do the map/reduce route or do a check myself before calling save. I'm still deciding the best route to take.
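
The second option, checking before calling save, might look roughly like the sketch below. FakeBucket, exists?, store and save_unique are all hypothetical names for an in-memory stand-in; none of this is the ripple or riak-client API. It also shows why Sean's point still applies: the check and the write are two separate steps.

```ruby
# In-memory stand-in for a bucket. Hypothetical, for illustration only;
# this is NOT the ripple or riak-client API.
class FakeBucket
  def initialize
    @records = {}
  end

  def exists?(key)
    @records.key?(key)
  end

  def store(key, value)
    @records[key] = value
  end
end

# Refuse to save when the key is already taken. Another writer could still
# claim the key between exists? and store, so without transactions this
# narrows the window for duplicates but cannot close it.
def save_unique(bucket, key, value)
  return false if bucket.exists?(key)

  bucket.store(key, value)
  true
end

bucket = FakeBucket.new
first  = save_unique(bucket, 'person@example.com', { name: 'Person' })
second = save_unique(bucket, 'person@example.com', { name: 'Other' })
puts "first save: #{first}, second save: #{second}"
```

That is good enough for something like an admin screen used by one person at a time, which is the only place I'd need it.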

Alternatively, I could simply not worry about it and forgo admin support in the ORM. It's not a deal breaker and it's not a requirement per the Padrino folks. It just would have been nice.

So what's the status?
Right now, using ActiveModel/ActiveSupport 3.0.0.beta4, Ripple from ripple/master and my fork of padrino-framework, I can create models and things work as expected.

The biggest headache for me is having to edit the Gemfile and append the versions for ripple and activesupport. I'm also working on adding gem version support to require_dependencies.

So you wanna try it yourself? Go right ahead. You'll need to grab my fork of the framework (I keep it up to date with master) and seancribbs/ripple. There are rake tasks to build and install those locally. I would HIGHLY suggest you create an RVM gemset. Additionally you'll need to grab ActiveSupport 3.0.0.beta4.

After that:
  • padrino-gen project test -d ripple
  • cd test
  • edit Gemfile and append version 0.7.1 to the ripple line.
  • bundle install
  • padrino-gen model person name:string email:string phone:string

You should now have a working model using riak via ripple. Here's a Gist of me doing exactly that via padrino-console:

Anyway, I'm intent on getting ripple support back into upstream but I'm not going to make a pull request until ActiveSupport reaches 3.0. It feels half baked to have these extra steps to get it working and I'd really like to have admin support done before I do if possible.

Thursday, June 10, 2010

Parsing Nagios Objects in Ruby Redux

When I was with MediaOcean, one of the projects I was working on was a stack of ruby code for parsing/writing nagios configs. This was in parallel with setting up puppet. I actually got pretty far along but then DDS had a RIF and I was back on the market.

I had to put the code aside and just recently pulled it out again. Some/most of it is pretty ugly. I had quite a bit done though.

So here I am "refactoring" (and I use that term VERY VERY loosely) the code base. So I'm happily parsing away - iterating over the cfg_file/cfg_dir entries and building a nice big hash to hold all the object definitions. I dump it to a YAML file for validation when I notice this nice bit of junk in my timeperiod dump:

Oh yeah...that looks right =/

For those who don't know, nagios object definitions are pretty straightforward. Here's an example:

Easy enough to parse, right? Object type named (host). A clear beginning and end denoted by curly braces. Object attributes in a seemingly key/value type markup. Some people smash the opening brace up against the object type definition but that's easily caught.

Iterating over the definition in Ruby is pretty straightforward (assuming a perfect example):
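
A minimal sketch of that kind of loop, assuming a single well-formed definition. The sample host (web01 and its values) and the helper name parse_nagios_object are made up for illustration and are not taken from my code base:

```ruby
# Hypothetical sample; the host name, alias and address are made up.
SAMPLE_DEFINITION = <<CFG
define host {
    host_name   web01
    alias       Primary web server
    address     192.168.1.10
}
CFG

# Parse one well-formed object definition into [type, attributes].
def parse_nagios_object(text)
  type = nil
  attributes = {}

  text.each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?('#')

    if line =~ /\Adefine\s+(\w+)\s*\{?\z/
      # Matches both "define host {" and "define host{".
      type = Regexp.last_match(1)
    elsif line == '{' || line == '}'
      next # a brace sitting on its own line
    else
      # Split on the first run of whitespace only, so values such as
      # "Primary web server" keep their internal spaces.
      key, value = line.split(/\s+/, 2)
      attributes[key] = value
    end
  end

  [type, attributes]
end

type, attributes = parse_nagios_object(SAMPLE_DEFINITION)
puts type
attributes.each { |key, value| puts "#{key} => #{value}" }
```

Splitting with a limit of 2 from the left is what keeps multi-word values intact here.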

What becomes a problem is timeperiod definitions. The syntax for those is somewhat complicated. Looking back at the YAML fragment above, you can see why. For timeperiods, while the rightmost value is the actual time frame, the left part (usually a day of the week) can actually be a specific day. So if I wanted to create an entry for Christmas in the U.S., I would define it like so:

december 25 00:00-00:00

Things can be even MORE complicated if I wanted to handle non-date holidays:

monday 1 september 00:00-00:00 ; Labor Day (first Monday in September)
thursday -1 november 00:00-00:00 ; Thanksgiving (last Thursday in November)

Ugly, no? I searched high and low for the ability to split starting from the right and didn't find anything native. I had to resort to implementing my own rsplit method for String:
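
Something along these lines, as a sketch rather than the exact method. This version leans on String#rpartition to find the last whitespace character, which is an assumption about how to implement it and not necessarily how my original did it:

```ruby
class String
  # Split once on the last run of whitespace, working from the right.
  # Returns a one-element array when the string has no whitespace.
  def rsplit
    head, separator, tail = strip.rpartition(/\s/)
    return [tail] if separator.empty?

    [head.rstrip, tail]
  end
end

"december 25 00:00-00:00".rsplit
# => ["december 25", "00:00-00:00"]

"monday 1 september 00:00-00:00".rsplit
# => ["monday 1 september", "00:00-00:00"]

"alias Primary web server".rsplit
# => ["alias Primary web", "server"]
```

The last call shows the catch: anything whose value contains a space gets cut in the wrong place.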

It works but I also can't use it across the board. If I do, I break things that were working (alias, service_description) that can have a space in the value.
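
One way around that, sketched under assumptions: split from the right only for time ranges inside timeperiod objects and from the left everywhere else. The helper name split_attribute and the list of "plain" timeperiod directives are mine for illustration; check the list against your Nagios version before trusting it.

```ruby
# Directives inside a timeperiod that are ordinary key/value pairs.
# Assumed list for illustration only.
PLAIN_TIMEPERIOD_KEYS = %w[timeperiod_name alias exclude].freeze

# Return [key, value] for one attribute line of the given object type.
def split_attribute(object_type, line)
  line = line.sub(/\s*;.*\z/, '').strip # drop a trailing ";" comment
  first_word = line[/\A\S+/]

  if object_type == 'timeperiod' && !PLAIN_TIMEPERIOD_KEYS.include?(first_word)
    # The time range is the rightmost token; everything before it is the key.
    head, _separator, tail = line.rpartition(/\s/)
    [head.rstrip, tail]
  else
    # Ordinary attribute: the key is the first token, the rest is the value.
    line.split(/\s+/, 2)
  end
end

p split_attribute('timeperiod', 'thursday -1 november 00:00-00:00 ; Thanksgiving')
p split_attribute('timeperiod', 'alias 24 Hours A Day')
p split_attribute('host', 'alias Primary web server')
```

It is still a special case bolted onto the parser, but it keeps alias and service_description working.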

I'm not even going to get into trying to convert timeperiod definitions into Date objects yet. That gives me cold sweats.