Friday, September 16, 2011

How Ruby is beating Python in the battle for the Soul of System Administration

Roughly three years ago I purchased the book "Python for Unix and Linux System Administration" and it convinced me that Python would be the default scripting language for linux system administrators for the foreseeable future. I was working on the One Laptop Per Child project at the time, where Python was the lingua franca. It seemed that Red Hat was using Python for almost every new project. Further, Unladen Swallow was making rapid progress. I couldn't have been more wrong.

Through a combination of historical accident and some seemingly minor feature differences, ruby is inexorably becoming the dominant scripting language for linux system administration.

Before I go any further, you die-hard sysadmins are probably screaming "We already have a scripting language, Bash!" While we love bash and commandlinefu, Bash becomes a liability rather than an asset once your script exceeds 100 lines, and a total nightmare if you need to parse or output HTML, CSV, XML, JSON, etc.
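To make the structured-data point concrete, here is a minimal ruby sketch that parses a JSON document and picks out some hosts, something that gets painful quickly in Bash. The payload and its field names (hosts, name, load) and the busy_hosts helper are invented for illustration; only the standard json library is assumed.

```ruby
require 'json'

# Return the names of hosts whose load exceeds the threshold.
# The "hosts", "name" and "load" keys are invented for this example.
def busy_hosts(json_text, threshold)
  inventory = JSON.parse(json_text)
  inventory['hosts']
    .select { |host| host['load'] > threshold }
    .map { |host| host['name'] }
end

raw = '{"hosts": [{"name": "web1", "load": 0.42}, {"name": "db1", "load": 1.7}]}'
puts "busy hosts: #{busy_hosts(raw, 1.0).join(', ')}"
```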

Historical Accidents

In 2004, Luke Kanies set about building Puppet, a configuration management system to improve upon the deficiencies he saw in CFEngine. From the excellent On Ruby blog:

I tried to implement my idea in perl, but I just couldn’t get the class relationships to work (the attributes and resource types each needed to be classes, according to the design in my head). This was back when Python was the shiznit, so I naturally tried it, but Python just makes my eyes bleed (and no, it wasn’t the whitespace, it was things like the fact that ‘print’ was a statement instead of a function, and ‘len’ was a function instead of a method).

I had a friend who had heard Ruby was cool but hadn’t actually tried it himself. Since I was just messing around at the time, I figured I’d give it a go. Four hours in, never having seen a line of Ruby previously, I had a functional prototype.
A couple of years later, a number of Puppet developers felt that Puppet didn't meet their needs. They started their own configuration management tool, Chef, heavily inspired by Puppet. The biggest outward difference between Puppet and Chef is that Chef uses pure ruby for its "recipes" whereas Puppet uses its own configuration language based on ruby.

Both Puppet and Chef are seeing rapid adoption by big companies according to BusinessWeek. If you aren't yet using Puppet or Chef, you should be planning on it in the near future. Whether you choose Chef or Puppet, you are effectively scripting your infrastructure with ruby. After spending 25% of your time working with Puppet, you will be much more likely to reach for ruby for your next scripting task.

Popular Projects

Off-hand here is a non-definitive list of sysadmin/devops related projects in ruby
  • puppet
  • chef
  • vagrant
  • mcollective
  • cucumber (behavior-driven testing)
  • capistrano
  • rake (ruby Make)
  • aeolus project/openshift
  • cloud foundry
  • graylog2
  • logstash
  • travis-ci
and in Python:
What's important here isn't the length of the respective lists but the importance of the individual projects. Chef, Puppet, and Vagrant are the new hammers and screwdrivers of system administration. If you are a sysadmin and aren't yet using these tools, don't worry, you will be sooner or later.

Openstack deserves special attention as it is a very exciting but early stage project. It could play a big role in your future as a system administrator.

Please notify me of significant devops projects that are missing from this list.

What a Girl Sysadmin Wants

Here are the features in a scripting language that a sysadmin wants
  • A DSL for the problem domain
  • High productivity, i.e. concise and expressive syntax
  • Easy interaction with shell commands
  • Regular Expressions
  • powerful one-liners
Notice that performance is not on this list of requirements. Ruby is significantly slower than Python and this was particularly true for the 1.8.* series of Matz's Ruby Interpreter (MRI). However, performance just isn't critical for 90% of our work as sysadmins. We care about productivity more than we care about performance. Readability is nice but it is a distant second to productivity.

Python doesn't have regular expressions embedded in the language, probably to improve readability. It also limits the number of top-level built-in global variables. From a language design perspective, this is much cleaner. From the perspective of your average street-trained, stress-laden, vi-addicted sysadmin, it is annoying.

These top-level variables also make one-liners more concise. Here are just a few


$_ last line read from STDIN
$? last exit code from a child process
$stdin reference to stdin
$stderr reference to stderr
$stdout reference to stdout

Here is an irb session to show you these values in action


hitman@hiroko:~/pr$ irb
irb(main):001:0> %x[ ls -a ]
=> ".\n..\nfoo1\nfoo2\nfoo3\n"
irb(main):002:0> puts $?
0
=> nil
irb(main):003:0> %x[ ls xys ]
ls: cannot access xys: No such file or directory
=> ""
irb(main):004:0> puts $?
512
=> nil
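The 512 in that session appears to be the raw wait status that 1.8-era interpreters print for $? (an exit code of 2 shifted left by eight bits), not the exit code itself. Here is a minimal sketch of pulling out the actual exit code with Process::Status#exitstatus and #success?. The run helper and the sample command are only for illustration.

```ruby
# Run a shell command and return [output, exit code, success flag].
def run(cmd)
  output = `#{cmd}`   # backticks behave like %x[ ]
  status = $?         # Process::Status of the child that just finished
  [output, status.exitstatus, status.success?]
end

_out, code, ok = run('sh -c "exit 3"')
puts "exit code: #{code}, success: #{ok}"
```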


"IRB?" you scoff "You will have to tear IPython from my cold dead hands."

I love IPython. I have used IPython for 4+ years and IRB doesn't hold a candle to it. That said, ruby's shortcuts are really useful, enough to compensate for IRB's shortcomings. There is something called wirble that I haven't tried yet that may make irb a lot more productive.


Here is some python code to detect whether a machine is a VMware VM.



# edit: fixed python code tks to kstrauser
import os

# popen() returns a file object, so read it before lowercasing the text
if 'vmware' in os.popen('dmidecode').read().lower():
    print('this is a vmware vm')
else:
    print('this is not a vmware vm')



Here is the same code in ruby

# capture the command output in $_ so the regex below has something to match
$_ = `dmidecode`
if $_ =~ /vmware/i
    puts 'this is a vmware vm'
else
    puts 'this is not a vmware vm'
end


Frankly, this kind of code makes my eyes bleed. The python example is far more readable and maintainable. That said, a lot of sysadmins would appreciate the terseness of the ruby example.

Let's look at writing one-liners in both languages.

This code prints out the first 10 lines in a file using Python

python -c "import sys; sys.stdout.write(''.join(sys.stdin.readlines()[:10]))" < /path/to/your/file

Here is the same in ruby

ruby -ne 'puts $_ if $. <= 10 ' < /path/to/your/file
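For readers new to these switches: -n wraps the script in an implicit "while gets ... end" loop, $_ holds the line just read, and $. holds its line number. A rough long-hand equivalent might look like the sketch below. The first_lines helper and the sample data are made up for illustration, and unlike the one-liner it stops reading once it has enough lines.

```ruby
require 'stringio'

# Return the first `count` lines of an IO-like object, stopping early
# instead of reading the whole input.
def first_lines(io, count)
  lines = []
  io.each_line do |line|
    break if io.lineno > count   # io.lineno plays the role of $.
    lines << line                # line plays the role of $_
  end
  lines
end

# StringIO stands in for STDIN or an opened file here.
sample = StringIO.new((1..15).map { |i| "line #{i}\n" }.join)
puts first_lines(sample, 10)
```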


Compare for yourself between these collections of one-liners in python and ruby. If ruby reminds you of perl, your eyes do not deceive you. In many ways it is the love child of perl and smalltalk.


DSLs FTW

Some time ago, I tried to watch all the Structure and Interpretation of Computer Programs lectures. This endeavor failed miserably but I recall Harold Abelson saying that every large program should have its own internal DSL suited to the problem space. This is debatable in regards to the larger world of programming but I think it is very apt for system administration. We spend our entire careers with n different DSLs. Each different configuration file format is its own DSL.

If you compare rake (Ruby Make) to rails code, they look quite different, almost like different languages. If you compare fabric code to a django class, they look quite similar. This is both a strength and a liability. I am not a language lawyer but it seems much easier to create DSLs (Domain Specific Languages) in ruby. Ruby certainly spawns DSLs with much greater frequency than python. No single pythonic build tool dominates the problem space like rake does in the ruby community. Most python projects seem to use setup.py for administrative tasks even though that is not its explicit purpose.

Both puppet and chef are DSLs for system administration. Capistrano is a DSL for application deployment. Ruby's operator overloading and blocks lend themselves to the easy creation of DSLs.
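As a rough illustration of how blocks make this cheap, here is a toy DSL built with instance_eval. It is only a sketch: the Recipe class and the recipe, package and service keywords are invented for this example and are not Puppet's or Chef's real API.

```ruby
# A toy "package/service" DSL. All names here are invented for illustration.
class Recipe
  attr_reader :resources

  def initialize
    @resources = []
  end

  # Each DSL keyword is just a method that records a resource.
  def package(name)
    @resources << [:package, name]
  end

  def service(name, action: :start)
    @resources << [:service, name, action]
  end
end

# Evaluate the block with a Recipe as self, so bare `package` and
# `service` calls inside it resolve to the methods above.
def recipe(&block)
  r = Recipe.new
  r.instance_eval(&block)
  r.resources
end

plan = recipe do
  package 'nginx'
  service 'nginx', action: :restart
end

plan.each { |resource| puts resource.inspect }
```

The block reads almost like a config file, yet it is plain ruby, so loops, conditionals and variables are available inside it for free.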

In Summary

Ruby's greatest strength is its amazing flexibility. There is a lot of "magic" in ruby and sometimes it is dark magic. Python intentionally has minimal magic. Its greatest strengths are the best practices it enforces across its community. These practices make Python very readable across different projects; they ensure high quality documentation; they make the standard library kick ass. But the fact is that we sysadmins need flexibility more than we need raw power or consistency. Still, these are not the real reasons that ruby is overtaking python.

Ruby is fast becoming the default scripting language for sysadmins because back in 2004, Luke Kanies looked at Python and felt ill (I had the opposite reaction). As a sysadmin you either currently are or soon will be using Puppet or Chef on a daily basis, spending a heck of a lot of time essentially coding in ruby. Personally, I much prefer writing python but am shifting to writing my scripts in ruby because I spend so much time with Chef.

32 comments:

  1. Pry is worth a look for an IRB alternative -- http://pry.github.com/

    It still isn't IPython, but it takes some inspiration from it.

  2. For your VMware/dmidecode example, as always, there is more than one way to write that code.

    output = IO.popen('dmidecode').read
    if output.match(/vmware/i)
      puts 'this is a vmware vm'
    else
      puts 'this is not a vmware vm'
    end

  3. I'm a system engineer and I have nothing against ruby, python or any other language. I actually use python, ruby, bash and perl depending on criteria.

    I seriously doubt any serious sysadmin, system engineer or devops engineer (whatever that may be) will stand behind this particular gem in your blogpost:

    "But the fact is that we sysadmins need flexibility more than we need raw power or consistency. "

    What a load of bull.

    I really really need raw power and consistency and flexibility is a nice to have...

    On top of that the post is littered with statements which seem to originate from your current working environment which are presented as facts. In many other environments those statements are false, debatable or plain wrong....

    puppet != writing scripts in ruby, this is one of the nice things of the dsl that puppet exposes, you can of course but there is no-one forcing you...

    Your ruby example is written in less lines if you write it in bash and I fail to see how that shows off the flexibility of ruby.

    And the last time I looked, capistrano was a dsl for deploying ruby applications and a real pain in the ass to use if you wanted to deploy anything else.......

  4. For the record, here's the ruby example in bash:

    sudo dmidecode -t1 | grep -qiE 'manufacturer:.*vmware' && echo "this is a vmwarehost" || echo "This is not a vmwarehost"

    With the additional benefit that it actually checks for the correct field, instead of relying on the word vmware anywhere in the dmidecode output.....

    Please point out where the ruby example adds flexibility...

  5. @Ramon

    ruby gives you the flexibility to write a DSL for system administration, which is exactly what the chef and puppet folks have done.

    "I actually use python, ruby, bash and perl depending on criteria."

    until recently, that is what I did as well and the result for most of us is a needlessly complex infrastructure that is overly customized. 90% of sysadmin scripts can and should be replaced by puppet or chef.

    "puppet != writing scripts in ruby, this is one of the nice things of the dsl that puppet exposes, you can of course but there is no-one forcing you..."

    the deeper you go w/ puppet the more ruby you end up learning

    i really need to post about how important it is to use puppet or chef, even if you have a small operation.

  6. The use of $_ in your VMWare snippet is unnecessary:
    if `dmidecode` =~ /vmware/i
      puts 'this is a vmware vm'
    else
      puts 'this is not a vmware vm'
    end

  7. This comment has been removed by the author.

  8. Your Python example is completely broken. It'd be something like:

    if 'vmware' in os.popen('dmidecode').read().lower(): print('this is a vmware vm')

    os.system() just returns the integer exit status of the command, and Not Everything Needs A Regexp.

  9. graylog2 is written mostly in Java.

    For CM, you missed bcfg2 which is written in Python.

  10. I am a sysadmin and I use puppet. Life is much easier and we have only scratched the surface of puppet.

    Do note, puppet sucks for user management. Not because it cannot do it but because customers ask for crazy things when it comes to user access and LDAP with matching user-groups makes life more sane.

  11. Okay... so what about my comment was so horrifyingly objectionable that you had to remove it in its entirety...? You didn't even leave the additional example...

  12. @bryan: We were probably one of the first large scale adopters of puppet in europe, we've been managing our 3000+ node serverpark with puppet since 2005/2006...

    I have no beef with the fact that you think that people should use puppet or chef even in small scale operations/shops..

    I just don't think that ruby is a magic bullet that should be the new default scripting language for system engineers, devops engineers or sysadmins. I think that depends on the situation you are in, if you're in a greenfield cloud deployment you might find ruby attractive to start with because of the ease of integration with your chosen config management tool.

    In addition I don't think that flexibility is the main criterion for selection of a scripting language to develop scripts in, which you were stating in your blogpost.

    We're actually trying to replace some of our scripted setups with api's so we can use them regardless of the calling language. For some stuff C or python is more suited than ruby. If it has an api, we can use any scripting
    language to operate/query/change it.

    For some stuff we would like to start using ruby, but nobody in his right mind starts rewriting thousands lines of code in language X to ruby, it's not worth it and we have far better things to do.

    And I think there is a legitimate distinction between config management, which we do with puppet
    and operational scripts which we write in a variety of languages depending on criteria

    deployscripting would be a nice example, check scripts for our monitoring system would be another
    On top of those examples, there is scripts for ldap-management, mysql replication changes
    etc. etc. etc. None of which have anything to do with configmanagement.

    As a last comment on your post, I agree that productivity is one of the things that determines which language to use for a given task, however productivity is highly dependent on someone's familiarity with a given tool. I consider myself a ruby apprentice at best, so my productivity drops about 100% when I'm trying to code something up in ruby, whereas in python or bash I'd be far more productive simply because I'm more familiar with the programming environment.

    This is a chicken-egg argument, if you never take the productivity hit associated with learning a new language, you will never gain possible productivity gains exposed by a new language...

    Which is a great argument to allow enough slack in the deadline pressure on dev/ops/sysadmin/developers to allow them to learn a new tool / language every now and then. Hopefully it'll lead to much productivity gains in the future :)

  13. @Thomas, I didn't delete your comment. I apologize if I somehow did w/out knowing it. Maybe blogger filtered it?

    @Ramon
    First, thanks for your helpful comment

    My post doesn't argue that everyone should drop what they are doing and learn ruby. I am arguing that over time ruby will become the default scripting language for sysadmins based on the fact that puppet and chef will become the primary configuration management tools for sysadmins.

    I don't plan on rewriting my nagios plugins in ruby, but I may choose to write future ones in ruby.

    I also don't think that ruby is a silver bullet, nor any other programming language or framework.

  14. Actually, as a hybrid developer/sys engineer, I think that ruby is the biggest weakness of puppet/chef. The language has some very nice features but it's inheriting its "beauty" from perl, and the community seems to be so in love with shiny new features/tech that you can find half finished libraries and tools everywhere, which makes mature software written in ruby hard to find. Even puppet (which would be one of the most mature ruby projects around) introduces backwards incompatible changes without obvious reasons and that damages one thing that sysadmins like most of all: stability.

    So yeah, ruby is becoming very popular because of puppet/chef but if a python based configuration engine comes along, I'll be more than willing to try it and contribute to it.

  15. @ainmosni

    yeah, I was also surprised by the instability you find in the ruby interpreters and standard library compared to the Python world.

    I highly doubt a python-based configuration engine will become dominant as chef and puppet have serious critical mass and significant funding. We might have to wait for another paradigm shift for that to take place but maybe by that time we will be scripting with Google Go ;)

    on the bright side, openstack is largely coded in python

  16. The Foreman project from RedHat (similar to Aeolus) is also in Rails.

    http://theforeman.org/projects/foreman/wiki/ScreenShots

  17. @Bryan

    I know what you mean but I know of many sysops who would embrace a 'pyppet' with open arms if it did what they wanted, loyalty doesn't seem too high.

  18. I find it hard to buy your arguments. It seems to me that "your" world is shifting to ruby, not everybody's.

  19. @cowmix: I thought that bcfg2 was written in perl actually

  20. I'm a ruby programmer and I would never code ruby as you described.
    ie. this
    ruby -ne 'puts $_ if $. <= 10 ' < /path/to/your/file

    I would have written that way:
    ruby -e "puts STDIN.readlines.first(10)" <

    Which is far more readable than the Python version

  21. @Ramon

    bcfg2 is written in python...

  22. Ruby and Python break backward compatibility too often even with minor language version upgrade.
    It's harder to maintain working language version across servers than do real sysadmin tasks.

    Perl rocks and robust to that.

    See Perl equivalent to Ruby's Capistrano/Puppet/Chef.. or Python's Fabric/func..

    Rex - http://rexify.org/

  23. Anything that disses Python makes me happy, Perl still rules the sysadmin world, and everyone just needs to grow up and deal with that fact.

  24. ruby -ne 'puts $_ if $. <= 10 ' < /path/to/your/file
    ->
    ruby -pe 'exit if $. > 10' /path/to/your/file

    But of course we would use "head -10", wouldn't we?
    :-)

    Replies
    1. Just "head" would suffice here, please. 10 lines are default for head and tail.

  25. One minor note - I think the python example for reading the first 10 lines of a file could be a bit simpler:

    python -c "import sys; sys.stdout.writelines(sys.stdin.readlines()[:10])" < /path/to/your/file

  26. Your ruby code looks like like bad perl code because you are using the ugliest ruby code I have ever seen.

    Comparing bad ruby code to idiomatic python code is just an unfair comparison.

    I have been a professional ruby programmer for five years, and I don't even know what any of those "$-symbol" variables even mean. I mean this is just hideous, and I had to look up what $_ meant to even figure out what it's doing:

    `dmidecode`
    if $_ ~= /vmware/i
    puts 'this is a vmware vm'
    else
    puts 'this is not a vmware vm'

    I've never seen any ruby programmer write anything like that. It should look something like this:

    puts `dmidecode` =~ /vmware/i ? 'this is a vmware vm' : 'this is not a vmware vm'

    or maybe

    vmware_vm = `dmidecode` =~ /vmware/i
    puts "this is #{'not ' unless vmware_vm}a vmware vm"

    Both considerably easier to understand than the python, IMO. I could go write readable alternatives for your other ruby nightmares, but all those $ symbols hurt my eyes. A good programmer would not have written them like that to begin with.

    Again, I'm a professional ruby programmer and can't even remember the last time I've used a $ in my code. Come to think of it, I don't think I ever have.

    Replies
    1. > Your ruby code looks like like bad *ruby* ...

      FTFY.

  27. Hey man, you forget about OpenNebula IaaS stack, that is written in Ruby.

    It is better than OpenStack and used in real projects by serious organizations. Also OpenNebula is supported by some international grant programs.

    Ruby wins. :)

  28. @openid

    Are you serious? OpenStack has way more supporters than OpenNebula.

    http://www.openstack.org/community/companies/

    Some additional tools in the python toolbox:

    Graphite
    Ganglia
    Zenoss
    Lettuce (Python's answer to cucumber) - http://lettuce.it/

  29. Hi, Bryan!

    I’m the web editor at iMasters, one of the largest developer communities in Brazil. I´d like to talk to you about republishing your article at our site.

    Can you contact me at rina.noronha@imasters.com.br?


    Bests,
    Rina Noronha
    Journalist – web editor
    www.imasters.com.br
    redacao@imasters.com.br
    rina.noronha@imasters.com.br
    +55 27 3327-0320 / +55 27 9973-0700
