Sunday, March 9, 2008

Simple Metaprogramming Logger With Graph Generation

Avdi Grimm's objections to monkeypatching overkill, which I've blogged about before, ultimately inspired this code.

Monkey-patching is the new black. It's what all the hip kids are doing, to the point that smart, experienced hackers reach for a monkey patch as their tool of first resort, even when simpler, more traditional solutions are possible.

I don’t believe this situation to be sustainable. Where I work, we are already seeing subtle, difficult-to-debug problems crop up as the result of monkey-patching in plugins. Patches interact in unpredictable, combinatoric ways. And by their nature, bugs caused by monkey patches are more difficult to track down than those introduced by more traditional classes and methods.

Pat Maddox responded to Avdi's post with a self-logging alias_method_chain. I decided to follow Pat's line of reasoning into the wild wild West of Rubinius, and create self-logging versions of Module#include and Object#new. I figured the best way to resolve bad monkey-patches would be to monkey-patch monkey-patching itself.

To run this code, you'll need a copy of Rubinius. I developed it against Rubinius as of March 4th, so it's a few days out of date, but that probably won't matter. It functions as a very simple proof-of-concept for metaprogramming loggers: it logs to STDOUT when modules get new methods, or objects get instantiated. (It should also log when modules are included, but I forgot to do that part.)

Here's some of the code. (Grab the Pastie for all of it.)

This code includes a module on a class, and then calls the mixed-in method. That part's obvious. It shows up on your screen when Rubinius logs it, via a modified version of __add_method__, which is an internals method on Module.

I got the first method, Class#new, from the comments/RDoc in the Rubinius source. I think they actually included it as a joke, or at most a simple demo, but it's fun to actually turn it on and see what it does. The other methods in my logger are equally obvious.

Here's the output.

minimal output, from the command line

verbose output, from within irb

So far, so good. It's pretty trivial to modify Rubinius in such a way that Ruby automatically logs the ongoing internal activities of its class and module system. That's nifty.

Logging really isn't enough, though. These are toy methods, and logs are for computers to read. If we write a slightly more sophisticated system, we can generate object/method graphs semi-automatically.

The first step is to target the right grapher. I'm feeling some peer pressure to use Gruff or Shoes, but the simplest option here, Nodebox, combines Photoshop with Processing, and adds a Python scripting interface.

We're going to need Python output which Nodebox can use to generate a graph. We've already got text output. First, we modify our Rubinius code to change the text output into Ruby output.

Capturing this in an actionable format is trivial.

./shotgun/rubinius nodeboxr.rb > objex.rb

Next we write some additional Ruby code to eval this Ruby code.

This generates Ruby objects. With ERB, it's easy to transform those objects into Python code. Here's the Python to create an utterly trivial graph:

A fully robust solution would include some code to arrange the nodes around the screen. I've played with that kind of stuff in the past, and it's satisfying when it works. Here, though, I'm going to skip that and just target the simplest use case.

Here's the ERB Python template.

Here's the Ruby which populates it.

Calling it, of course, is trivial.

./shotgun/rubinius templater.rb >

Next step: go into Nodebox, open the Python file, and hit command-R to run the code. We've already seen what that looks like; the only difference here is the name of the file in the window's title bar.

It's probably worth mentioning that Nodebox can generate graphs dramatically more complex than this.

superfolia, written in nodebox python

Obviously, our object graph is primitive by comparison. That's fine. Pretty much every step of this post addresses simplified versions of the real problem space, and if you look it over you'll spot some extremely blatant cheating. That's kind of the point. I wrote this code in a day, it's all pretty trivial, and it visualizes information which people were claiming an entire programming philosophy irretrievably obscures.

If you have incredibly powerful tools, you can do a lot with very little effort.

δος μοι πα στω, και ταν γαν κινάσω
"give me place to stand, and all Earth I will move"

Code generation's an incredibly powerful tool; John McCarthy's fundamental insight that code is data remains equally powerful whether you leverage it from within Lisp or across a Unix filesystem. Rubinius is also incredibly powerful; if modifying classes and modules in real time is metaprogramming, then modifying the systems which enable that flexibility must be metametaprogramming. Whatever the term, when we combine an excellent strategy like code generation with a turtles-all-the-way-down language implementation like Rubinius, we can create very powerful analysis tools very quickly.

King Yertle