Tuesday, April 3, 2007

Tiny APIs

I've blogged before about building mini-APIs into Rails. I'm not the only one -- Jamis Buck wrote a post praising Marcel Molina, Jr. for his talent at doing the exact same thing. I think this is a useful technique which Ruby programmers need to use more often.

I think the reason you don't see people using mini-APIs as often as they should is that people don't really understand why they're a good idea or where to use them.

Mini-APIs come midway between DSLs and refactoring. Refactoring is often used as a synonym for debugging, but refactoring is actually the opposite of debugging. Instead of changing the code to get it to work properly, you're changing the code without changing the functionality at all. Any time you find something repetitive or inelegant, you tidy it up a bit. Just a tiny, tiny bit.
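
Here's a rough sketch of that kind of tiny tidy-up in plain Ruby. Everything in it is invented for illustration (the method names, the 1.08 multiplier); the only point is that the "after" version returns exactly what the "before" version did.

```ruby
# Hypothetical example: method names and the 1.08 multiplier are made up.

# Before: the same tax arithmetic is written out twice.
def invoice_lines_before(book, shipping)
  [
    "Book: #{(book * 1.08).round(2)}",
    "Shipping: #{(shipping * 1.08).round(2)}"
  ]
end

# After: one tiny change. The repeated arithmetic gets a name,
# and the strings that come back are the same as before.
def with_tax(amount)
  (amount * 1.08).round(2)
end

def invoice_lines_after(book, shipping)
  [
    "Book: #{with_tax(book)}",
    "Shipping: #{with_tax(shipping)}"
  ]
end
```

Nothing got fixed and nothing got added. The code just says what it means once instead of twice.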

The end result is a cleaner design, but refactoring isn't about redesign. It's about tidying up the code and getting a cleaner design as a side effect. It's actually TDD in reverse. With both refactoring and TDD, your goal is to produce a clean design, but the way you get there is not by designing anything, but rather doing lots of small things which, in the long run, if done consistently, will result in a good design without any explicit designing ever actually taking place. It's a very Zen idea.

DSLs, of course, are all about creating new mini-languages which you or your client programmers can use to write code not in Ruby or in Rails but in the mini-language. Your goal is to represent the problem space so succinctly that a business user can code in terms of business rules without ever needing to learn a fully-fledged language. Rails is a great example of this, and yet all the DSL features of Rails emerged organically from DHH's desire to eliminate repetition -- which is to say, the DSL aspect of Rails is a result not of deliberate design but of consistent and diligent refactoring.

(By the way, I'm sorry I don't have a link to back this up, but I'm absolutely certain of it. I've read a gazillion blogs and listened to every podcast DHH was ever on, and the proof of this statement is definitely out there somewhere. I just don't have the time to track it down at the moment.)

Anyway, if DSLs are really just the product of very extensive refactoring, and mini-APIs sit midway between refactoring and DSLs, what I'm saying is that mini-APIs emerge from streamlining repetitive code. And that's it. That's exactly what happens. If you want to get to a DSL, the way to do it is not to cook up a DSL right away. You start with messy code and you chip away at it until it takes the shape of a DSL -- like a sculptor shaping marble. You'll know you're halfway there when you've got a mini-API.
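
To make that concrete, here's a minimal sketch of repetitive code turning into a mini-API. It's plain Ruby, not Rails, and `Requires`, `requires`, `SignupBefore`, and `Signup` are hypothetical names I made up for the example. Treat it as one way this could go, not the way Rails does it.

```ruby
# Hypothetical example: none of these names come from Rails.

# Before: the same "blank means missing" check, written out per field.
class SignupBefore
  attr_reader :errors

  def initialize(params)
    @params = params
    @errors = []
  end

  def valid?
    @errors.clear
    @errors << "name is required"  if @params[:name].to_s.strip.empty?
    @errors << "email is required" if @params[:email].to_s.strip.empty?
    @errors << "city is required"  if @params[:city].to_s.strip.empty?
    @errors.empty?
  end
end

# After: the repetition is pulled into one class-level method.
# This module is the whole mini-API.
module Requires
  # Record which fields the class insists on.
  def requires(*fields)
    required_fields.concat(fields)
  end

  def required_fields
    @required_fields ||= []
  end
end

class Signup
  extend Requires
  requires :name, :email, :city

  attr_reader :errors

  def initialize(params)
    @params = params
    @errors = []
  end

  # Same check as SignupBefore#valid?, driven by the declared field list.
  def valid?
    @errors = self.class.required_fields
                  .select { |field| @params[field].to_s.strip.empty? }
                  .map { |field| "#{field} is required" }
    @errors.empty?
  end
end
```

With that in place, a second form class only needs `extend Requires` and a one-line `requires` declaration. That one line is about where a mini-API starts to look like the beginning of a DSL.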

I think this is why you don't see people using mini-APIs enough. You often see programmers who want to write a big fancy DSL, because that's the hip new thing, and you often see programmers who'll write the same code twice because they're in a hurry. What's rare is the programmer who'll go back and rewrite repetitive code to make it slim and elegant, and what's very, very rare is the programmer who's going to rewrite the same piece of code until it's as elegant as it can possibly be.

Unfortunately this is exactly the type of programming which results in powerful frameworks like Rails! It is very literally exactly the programming method which gave the world Rails in the first place.

DHH has tons of fans. If you're a DHH fan, think about what I'm about to say, and act on it. Nearly every one of DHH's fans is writing code like the code DHH wrote. Very, very few of DHH's fans are writing code the way DHH writes it. But the difference between doing something just like something somebody else did and doing something the way somebody else did it is the difference between being an imitator and being a student.

Don't imitate. Study.


  1. Good information. I'm very interested in DSLs, and I like the idea that you don't have to design them, and just use a process. I've heard of people creating DSLs for customers, but I like creating them for myself, too. I'd much rather focus on the process of what I'm doing, rather than deal with how the computer is going to execute my process. Obviously I have to dig into that, but I can create that layer that allows me to focus on the process.

    I've tried my own hand at creating a DSL (small, obviously). A little design was involved, but what I was really trying to do was simplify an existing API. I was adding a layer of abstraction. I didn't do a thing to the underlying API. The design comes in when you're determining how you want to represent the functionality. I was creating a DSL for screen scraping. I wanted to think of it as a "web query": submit a query, get data back. So I wanted it to be a bit SQL-like. I was writing it in Squeak, and it looked something like this:

    scraper := (WebQuery fromAddress: '(URL)') where: '(parm1)' is: '(value1)';
    where: '(parm2)' is: '(value2)'; yourself.
    page := scraper retrievePage.

    Unless you create a stand-alone DSL that generates its own code, you are programming in some other language. You are using its syntax rules. The difference is how the process is represented. DSLs are all about representation. A while back I found an interview with Alan Kay where he said something like, "The best way to think of a programming language is as a user interface." I totally agree.

    The above example probably wouldn't be enough for a customer. They'd probably want something like this:

    dataSys := DataSystem new.
    tables := dataSys openFinancials; forCustomers: #('Johnny Cash' 'Elvis Presley' 'Mickey Mouse'); asTables.
    tables display.

    This could use what I created, just at a lower level.

    Re: programmers going back to refactor code

    I personally like doing this. The thing is, at least in the business world, this tends to get discouraged because of budget issues. When places use the Waterfall Model (the most predominant model, as best I can tell), programmers really only get one shot to get the code, as a whole, right. After that the attitude from management is, "If it runs, don't mess with it." The one time programmers are allowed a chance to refactor (as you define it) is when they're debugging code, or adding a feature, and even then it's often done surreptitiously--"While I'm at it, I'll just fix this up here." Most places I've seen don't even want to entertain the idea of refactoring in principle. They don't see the point. As far as they're concerned what matters is whether the software runs without crashing and the customer is pleased with it.

    In my whole career, I've gotten one or two chances to really refactor code. In one case I had to do "the hard sell" with my boss. There was some truth behind it. The code as it was was very difficult to maintain. Just about anytime I changed it, it caused the program to crash. I made the argument that it had a real impact on my productivity. I was able to make the code a thing of beauty (for the most part), and make the program a little more robust, but it took me a couple months to pull it off. I was refactoring "cut and paste" code. There was a lot of it. It badly needed to be cleaned up.

    It may have been possible to refactor the thing in small pieces. That might've been a wiser approach. I remember the project got into a "mission creep", because I kept finding code that needed to be fixed up.

    I like refactoring because in the long run it makes programming a more pleasant experience.

  2. Who refers to "debugging" as "refactoring"???

  3. Aren't regular expressions a mini-DSL for matching strings? Isn't SQL a mini-DSL for set operations backed by a data store?

    I'm all for embedding mini-DSLs in your current language. Most people are already doing it and may not realize that yet.

  4. "But the difference between doing something just like something somebody else did and doing something the way somebody else did it is the difference between being an imitator and being a student."

    This is really just a bit of silly wordplay designed to give the impression that there's a real distinction between imitator and student. The line is drawn at many points between the two and there are no real criteria with which to judge which is more correct, but what all these schemes have in common is that the speaker, having set up a scheme, is surprised to discover himself and his work on the 'inspiration' side. This all seems just a bit self-serving.

    What are the motivations for coming up with these things? Perhaps to justify hefty paychecks, or to help console ourselves, so we can say that even though we 'borrow' things from others, we're still original! Not like those other thieving bastards. But does it matter? Every good thing that was ever built is, in some way, a collaboration, and we'd be much better collaborators if we stopped obsessing over who's going to get the credit.

  5. @Jeff,

    Regular expressions are a DSL, as are SQL, Makefiles, IDLs, CSS, etc. Most people are using them, but very few people actually write a DSL to make solving their own domain problem easier. At some point in time, there was a strong backlash against language plurality, and a strong preference for the One Language To Rule Them All. Some even went so far as to abstract SQL away using that One Language.

    Separately, there's a distinction between DSLs and EDSLs. Embedded DSLs are contained inside a host language; they simplify a particular problem domain but also retain all the capabilities of the host language. HTML templating (JSP, PHP, ERB, etc.) is a good example of that. Or Rake vs. Make.

  6. @anon1 - tons of people. I've worked with lots and lots of people in lots and lots of places. If you've never heard somebody misuse a technical term, especially an abstract one like "refactoring," you're very fortunate.

    @anon2 - how is that supposed to be wordplay? I don't get it. That's one of the clumsiest sentences I've written.

    @mark - I think that's what DSLs are really supposed to be, interfaces to underlying APIs. The budgeting vs refactoring thing, that came up recently on Pat Maddox's blog. I think companies which devoted a certain amount of time every day to refactoring existing code would save money in the long run, but I'm not sure exactly how. I just have a strong intuition that there's a business advantage disguised as a luxury there. (Spotting those things is kind of my new hobby.)

    @jeff & assaf - I think regexes and SQL are actually more embedded languages than domain-specific ones. With those you're going down from the programming language into a smaller world, with a DSL you build the programming language up to operate at a higher level for client programmers. I think.

