Monday, May 5, 2008

Build Rails Apps Which Build Rails Apps For You

Programmatically Generate Enterprise Integration APIs

Many sites, including my current project, use Omniture for page tracking and analytics. In practical terms, this means somebody has to build JavaScript identifying each page in Omniture's terms, which can involve a mind-numbing amount of repetitive, bureaucratic code.

If you take a look, you'll see that many of these variables have identical values, and few of them have semantically significant names. Nobody who's worked in technology for any length of time will be surprised to learn that this combination resulted in tedious, tangled code on the Rails side, specifically in the helpers. Nor will they be surprised that the Rails helpers which generated this JavaScript from within various Rails apps all operated more or less independently, without a great deal of co-ordination, leading to an inevitable maintenance nightmare.

The temptation here is to complain, read Slashdot, and put Dilbert cartoons on your cubicle wall. But it doesn't have to be that way. There's a saying among actors: "there are no small parts, just small actors." Any time an actor complains he or she has a small part in any given production, that's the correct answer and the only answer. Maybe there's an equivalent for programmers: "there are no boring programs, just boring programmers."

One way to make a boring problem interesting is to solve it forever, and in every way, simultaneously. But if you want to solve something forever, the first step is to solve it once. Therefore the place to start is with a simple solution; the next step is to find a way to generalize that solution.

Our existing Omniture code did not have the virtue of simplicity. The first step was to find a way to make it simple. This was easy, because I've read the Gang of Four and I know my design patterns.

As an aside, I think the Rails community's scorn for the enterprise is pretty fair, but what seems much less fair is its scorn for design patterns. A partisan attitude to programming languages often involves a knee-jerk tribalism. Design patterns are bad. Why? Because Java programmers like them. This passes for logic in the more ghetto sectors of the Rails metropolis.

The simplest possible solution to this problem is the Data Object, which is simply an object representing data. (It's funny how many books you need to read before you start to program in the simplest way possible.) Instead of generating all this stuff inside the template, first you put it in an object:
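Here's a minimal sketch of what such a data object might look like. The class name, the variable names and the values are illustrative placeholders, not the project's real Omniture mapping.

```ruby
# Hypothetical data object for one page type. It holds the values the
# tracking JavaScript needs and nothing else.
class ArticlePageTracking
  attr_reader :page_name, :channel, :prop1, :prop2

  def initialize(options = {})
    @page_name = "article: #{options[:title]}"
    @channel   = "articles"
    @prop1     = @channel          # many variables simply repeat another value
    @prop2     = options[:author]
  end
end

tracking = ArticlePageTracking.new(title: "Hello", author: "giles")
puts tracking.page_name
```

The point is only that the repetition lives in one small class per page type instead of being smeared across helpers.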

And then you have a much smaller block of code to deal with in the template:
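One plausible shape for that template block, rendered here with the standard library's ERB so it runs outside Rails. The variable map, the `Tracking` struct and the `render_tracking` method are assumptions for illustration; the sketch also assumes the template loops over every candidate variable and uses respond_to? and send to emit only the ones the page's object defines.

```ruby
require "erb"

# Stand-in data object for one page type. It deliberately has no prop2 reader.
Tracking = Struct.new(:page_name, :channel, :prop1)

# Hypothetical map from JavaScript variable name to reader method name,
# covering every variable any page type might report.
OMNITURE_VARIABLES = {
  "pageName" => :page_name,
  "channel"  => :channel,
  "prop1"    => :prop1,
  "prop2"    => :prop2
}

TRACKING_TEMPLATE = ERB.new(<<~'TEMPLATE', trim_mode: "-")
  <script type="text/javascript">
  <%- OMNITURE_VARIABLES.each do |js_name, reader| -%>
  <%- next unless tracking.respond_to?(reader) -%>
  s.<%= js_name %> = "<%= tracking.send(reader) %>";
  <%- end -%>
  </script>
TEMPLATE

# Renders the script block for whichever data object the page supplies.
def render_tracking(tracking)
  TRACKING_TEMPLATE.result_with_hash(tracking: tracking)
end

html = render_tracking(Tracking.new("article: Hello", "articles", "articles"))
puts html
```

A real Rails template would also want to escape the values (for example with escape_javascript) before dropping them into JavaScript strings.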

Your preparation in the controller is tiny as well:
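As a rough sketch, the controller side can be a single assignment. The class below is a plain-Ruby stand-in for a Rails controller so it runs on its own; `ArticleTracking` and the `show` signature are hypothetical.

```ruby
# Stand-in for the data object for article pages.
ArticleTracking = Struct.new(:page_name, :channel)

# Plain-Ruby stand-in for a Rails controller. In a real app this would
# inherit from ApplicationController and the view would read @tracking.
class ArticlesController
  attr_reader :tracking

  def show(article)
    # The only Omniture-related line the action needs.
    @tracking = ArticleTracking.new("article: #{article[:title]}", "articles")
  end
end

controller = ArticlesController.new
controller.show({ title: "Hello" })
puts controller.tracking.page_name
```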

(The send and respond_to? stuff is in there because different pages for Omniture need to report different variables.)

This was a massive improvement over the old system - the code is easier to use and easier to maintain. (The only confusing bit which remains, the stuff in the template, could easily turn into a helper with some quick refactoring.) So, the first half of the mission - build something simple which works - is accomplished. Now for the second part: generalizing to every application this company has.

That's where VisiJax comes in.

The next step was to build a simple spreadsheet editor in Rails with Scriptaculous. This took a couple days, and I open-sourced it, mainly so anybody curious could follow along with the code. (Side note, I was pressed for time; otherwise, this would have been a great excuse to explore jQuery, which Yehuda Katz completely sold me on with his presentation at the Philadelphia Emerging Tech for the Enterprise conference.)

Anyway, I built a spreadsheet editor to match the pre-existing workflow. In this situation, the mapping of variables to values comes to the programmer from the business analytics team, and it comes in the form of an MS Excel spreadsheet. The concept of objects mapping to page types is actually implicit in the spreadsheet format:
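To make that mapping concrete, here's a sketch under a loud assumption: that the header row names the page types and each later row gives one variable's value for every page type. The layout, names and values are invented for illustration; the real spreadsheet may be arranged differently.

```ruby
# Hypothetical sheet: first row is the header, first column is the
# Omniture variable name, remaining columns are page types.
SHEET = [
  ["variable", "home",     "article"],
  ["pageName", "home",     "article: $title$"],
  ["channel",  "homepage", "articles"],
  ["prop1",    "homepage", "articles"]
]

# Reads the sheet column by column, producing one hash of
# variable => value per page type.
def page_types(sheet)
  header, *rows = sheet
  header.each_with_index.drop(1).to_h do |page_type, column|
    [page_type, rows.to_h { |row| [row[0], row[column]] }]
  end
end

page_types(SHEET).each do |page_type, variables|
  puts "#{page_type}: #{variables.inspect}"
end
```

Under that assumption, each column is already one data object waiting to be written out.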

The easiest way to build an elegant interface is to build an interface that doesn't appear to be an interface at all, which means the easiest interface to turn a spreadsheet into code is an interface where you appear to just turn a spreadsheet into another spreadsheet.

At this point we have a consistent pattern of code which our data should result in, and a simple interface to capture it. Hopefully it's obvious where this is going: the final step was an ERB template and code to use it.
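A stripped-down sketch of that idea: an ERB template whose output is Ruby source for a data object. The template text, `generate_class` and the sample values are hypothetical, and this simplified version only handles literal values, so the generated `options` argument goes unused.

```ruby
require "erb"

# Hypothetical template for one generated data-object class.
CLASS_TEMPLATE = ERB.new(<<~'TEMPLATE', trim_mode: "-")
  class <%= class_name %>
    attr_reader <%= variables.keys.map { |name| ":#{name}" }.join(", ") %>

    def initialize(options = {})
  <%- variables.each do |name, value| -%>
      @<%= name %> = <%= value.inspect %>
  <%- end -%>
    end
  end
TEMPLATE

# Returns Ruby source for a class holding the given variable => value pairs.
def generate_class(class_name, variables)
  CLASS_TEMPLATE.result_with_hash(class_name: class_name, variables: variables)
end

puts generate_class("HomeTracking", { "page_name" => "home", "channel" => "homepage" })
```

In practice the generator would write each class to its own file rather than printing it.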

Running this code against the spreadsheet in your Rails app's DB will produce objects that look like this:

(One detail I didn't go into: the system knows to capture variables identified $like_this$ in the spreadsheet and turn them into options hash retrieval in the initialize methods.)
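One way that substitution could work, sketched with a regular expression. The method name and the choice to emit string concatenation are my assumptions here, not necessarily how the generator does it.

```ruby
# Turns a spreadsheet cell into a Ruby expression (as source text).
# Literal text becomes a quoted string; each $name$ placeholder becomes
# a lookup in the options hash passed to initialize.
def ruby_expression_for(cell)
  parts = cell.split(/(\$\w+\$)/).reject(&:empty?).map do |part|
    if (match = part.match(/\A\$(\w+)\$\z/))
      "options[:#{match[1]}].to_s"
    else
      part.inspect
    end
  end
  parts.empty? ? '""' : parts.join(" + ")
end

puts ruby_expression_for("article: $title$")
puts ruby_expression_for("articles")
```

The resulting text can be dropped into a generated initialize method wherever a plain literal would otherwise go.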

So there you have it - a complete workflow and a robust solution all in one. You get a spreadsheet from the business analytics team, you copy it into your code generator Rails app's spreadsheet, running on your local machine, and you plug the files that Rails app generates into the server-side Rails app you're developing. Then you edit a few controllers, maybe write a few helper methods, and you're good to go.

You might wonder why, if I'm storing all this information in a database anyway, I didn't just make one Omniture control app to sit on a server and operate as a Web service, rather than building a code generator to run on developers' local machines. The answer's in the economics. Centralizing the information in a Web service makes it expensive to change; many things would be dependent on such a system and the overall effect would reduce flexibility. Dropping it on a developer's local machine means the app becomes a throwaway app, and if it needs to change later, it can. The reasons were economic from another perspective as well: working with the separate corporate team in charge of servers adds some overhead, while creating a DB in MySQL on your local box is a two-minute task. You always get more leverage when you make essential operations cheap.

In fact, this kind of organizational overhead is a motivator in the design of the user interface as well, and in the choice of code generation as a strategy. Previously, for a programmer to implement Omniture integration on this company's various web apps, the developer had to understand Omniture integration. However, although understanding is generally a valuable thing, in this context it's a serious waste of time and money. Omniture integration is highly specialized knowledge. It doesn't extrapolate or generalize to anything else Rails developers do; due to its proprietary nature, it can change unpredictably; and it may or may not be useful in future for the types of jobs Rails developers want to have. For many developers, there's zero value in understanding this information, and there's actually a similar downside to the corporation as a whole for the Rails developers to attempt to understand this information, since any developer who does so will come to their own slightly distinct conclusions about it, which means they'll clutter the code base with countless variations on the same basic semantics.

More simply, the Rails devs just don't need to know about Omniture. The only thing they need to know is what variables should have which values in the JavaScript for Omniture which their Rails apps generate. The business analytics team prepares spreadsheets defining their Omniture implementations, but then needs to have meetings to explain these spreadsheets to people. Given that the whole process of attempting to understand the spreadsheet is purely wasted effort, both from the perspective of the individual developer and from the perspective of the corporation as a whole, a process which simply takes a spreadsheet as input and produces viable code as output delivers the required result while introducing regularity to the process and simultaneously removing monotony.

Code generation gives you terrific advantages in productivity. Meetings waste everyone's time, so streamlining corporate processes makes life smoother for everyone. Mind-numbing repetition is for computers to handle, and when a corporation gives you mind-numbingly repetitive programming tasks, the correct response is not just to do those tasks, but also to write a program to do those tasks for you, so that nobody will ever have to do those tasks by hand ever again, even if those tasks are in fact programming tasks. This is what metaprogramming really means.

Why write programs when you can write programmers?