Thursday, May 24, 2007

SQL Unnecessary In Haskell's HAppS

Evan Weaver has this idea: SQL is a kludge. It's a weird idea, that SQL isn't necessary in Web apps and doesn't belong in Web apps, but Dave Thomas and Avi Bryant both hinted at it in their RailsConf keynotes, David Heinemeier Hansson basically used it as the base assumption for ActiveRecord, and, according to my roommate Justin, Paul Graham used it as a base assumption as well in the original Lisp version of Viaweb, his startup which became Yahoo! Store.

So, I think this idea's a good idea. So I've been telling it to people. Many people who hear this idea look at me like I'm crazy, but the few people I know of who share the idea are all famous and brilliant, except for Evan Weaver, who is merely brilliant.

Just today I found out another place this idea gets taken seriously: HAppS - the Haskell Web app framework (and server), which will hereafter in this blog be referred to as Happs, because come on, you'd have to be deliberately intent on wrecking a good idea with bad marketing if you were seriously considering using the official capitalization. Anyway, Happs does away with the database completely. Not only that, it does away with the Web server as well. The Happs philosophy says a server should be built in.

Happs Web apps compile to binaries, which store serialized state in memory and on the filesystem. State is written to the filesystem first and stored in memory second. This, of course, enables failover. You'd think all that I/O might mess with performance, but the Happs response is interesting:

"I am not saying that using HAppS, you could serve all of eBay on a single box. I am saying that your application is likely to be well within the constraints required"

The reason they're not saying you could serve eBay from a single box using Happs is that some of the examples on the page indicate that maybe you almost could.
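The disk-first, memory-second ordering described above is essentially write-ahead logging. Here is a minimal Python sketch of that idea. It is only an illustration, not Happs's actual implementation (Happs is Haskell, and its internals are not shown here); the class name `LoggedStore`, its methods, and the JSON-lines log format are all hypothetical.

```python
import json
import os
import tempfile


class LoggedStore:
    """Toy key-value store: every update is logged to disk before it touches memory.

    Hypothetical illustration only; not modeled on Happs's real code.
    """

    def __init__(self, log_path):
        self.log_path = log_path
        self.state = {}
        self._replay()

    def _replay(self):
        # Rebuild the in-memory state by replaying the log from the beginning.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "r", encoding="utf-8") as log:
            for line in log:
                if not line.strip():
                    continue
                entry = json.loads(line)
                self.state[entry["key"]] = entry["value"]

    def set(self, key, value):
        # 1. Append the update to the log and force it onto the disk.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        # 2. Only then apply it to the in-memory state.
        self.state[key] = value

    def get(self, key, default=None):
        # Reads are served from memory and never touch the disk.
        return self.state.get(key, default)


def demo():
    path = os.path.join(tempfile.mkdtemp(), "state.log")
    store = LoggedStore(path)
    store.set("visits", 1)
    store.set("visits", 2)
    # Simulate a restart: a fresh instance recovers its state from the log.
    recovered = LoggedStore(path)
    return recovered.get("visits")
```

Because the log entry is on disk before memory changes, a crash between the two steps loses nothing: the next start replays the log. The cost is one synced append per write, while reads stay in memory.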

This is all new, I haven't verified it yet, and I've never met a developer who didn't overestimate their code in some respect - it's like meeting a new mother who doesn't think her baby is beautiful - but it looks very, very interesting. If you're into the idea that SQL is a kludge, check out Happs.


  1. I don't think the idea that SQL is a kludge is a new idea. If I have to do metaprogramming (because, in a very real sense, that's what it is) by manipulating strings, that is definitely a kludge.

    I have cursed SQL's insanity for years, wondering why they didn't just write a lisp-y thing instead.

  2. SQL is a kludge, gotta agree with you there.

    But also, because I am so bad at writing complex queries, I somewhat welcome its social demise.

    Although we may be in the minority on this one so far.

  3. I can't be bothered finding the relevant essay, but Paul Graham does say that he didn't use a database at all for Viaweb; he stored everything in files.

    Personally I've always seen SQL as an attempt to express the relational calculus in COBOL.

  4. Personally, I came to the same idea half a year ago, but there is another point. Using SQL automatically allows you to scale: you roll out web-crunching slaves, and they all connect to the db over a 1 Gb network. It's a dumb way, but dumbness is a good thing here. And it works (up to a certain point).

  5. I like the idea of ditching the db, and have been exploring that for a while now. I have found that simple arrays and hashes do a wonderful job of keeping track of things.

    Scaling sessions across web servers wouldn't be difficult, as you could easily do that with a load balancer w/ affinity.

    But how do you effectively share the backend *data* between servers? That's where SQL servers always come in handy. I'd love to hear more about how others would handle this.


  6. What database did you use?

    We didn't use one. We just stored everything in files. The Unix file system is pretty good at not losing your data, especially if you put the files on a Netapp.

    It is a common mistake to think of Web-based apps as interfaces to databases. Desktop apps aren't just interfaces to databases; why should Web-based apps be any different? The hard part is not where you store the data, but what the software does.

    While we were doing Viaweb, we took a good deal of heat from pseudo-technical people like VCs and industry analysts for not using a database-- and for using cheap Intel boxes running FreeBSD as servers. But when we were getting bought by Yahoo, we found that they also just stored everything in files-- and all their servers were also cheap Intel boxes running FreeBSD.

    (During the Bubble, Oracle used to run ads saying that Yahoo ran on Oracle software. I found this hard to believe, so I asked around. It turned out the Yahoo accounting department used Oracle.)

  7. I don't see the idea of "ditching a DB" and "ditching SQL" as intrinsically linked concepts. Look at Mnesia for Erlang, for instance.

    Right now people are getting around SQL by using elaborate libraries that turn programmer idiom in a given framework into SQL. In a sense, this achieves the desired goal of letting one use persistence without the kludge that is SQL. I only wish that the destination language was more sane.

    A long time ago a project needed the ability to issue arbitrary boolean queries to a simple SQLite db on disk, so I wrote an abstraction to support this that took the form of a tree that could be easily composed. This nicely avoided string hacking for SQL almost everywhere else in the system, and the rest lived in its own dirty module of string-hackery.

    What is sad about this story is that I had to write such an abstraction at all, or had to take it to the level of generating a raw SQL string. The ability to hand a tree decorated with symbols and data to a database system and get a result should be something that is more readily available in general.

  8. By the way - somebody posted a rude comment with an interesting link. I deleted the comment, but here's the link. It concerns OO from a DBA's perspective.

  9. Mnesia - that's a good point. That means ErlyWeb probably disregards classic (old-school?) DB strategies as well.

    More details: Mnesia is a distributed DataBase Management System, appropriate for telecommunications applications and other Erlang applications with need of continuous operation and soft real-time properties.

    Apparently you can put the data in memory, in files, or both.

  10. Giles, I was reminded of this; did you see it yet? Chet Murthy's talk (via Bill Clementson's blog) on collapsing some of the layers in enterprise apps through functional programming.

  11. I've been experimenting with this idea in Squeak, in terms of persisting data in memory, and it's quite easy to do. Ramon Leon demonstrated it in his "Build a Blog in 15 minutes" screencast. You just put a reference to your data structure in a class object, and you're done. The thing is, this works best if your data interaction is mostly read-only. Files work better for read- and write-intensive apps, so long as the files are shared. RDBMSs are still good for their indexed search abilities, and their ability to make data modification operations atomic. With files, yes, you can do pattern-matching, but that still means the program has to do a line-by-line search. Indexed databases don't have to do this. Memory would be faster, and I imagine that hashes do basically the same thing that RDBMSs do with their indexing function.

    SQL is a kludge the same way that using Perl, Python, Lisp, or some other dynamic language from a static language is a kludge. There are some skilled developers who do this. You're forming programmatic statements out of strings, passing them to an interpreter, which executes them, and then you get a result back. It's better in terms of code readability if you can get, or build up, a DSL native to the PL you're using that hides this from you, like Rails does. But at the lowest level, what's the alternative to SQL when it comes to using an RDBMS? I've worked with relational database APIs in the past, and I'd take SQL any day over them. SQL expresses what you want to do with a database better than an API does.

    I think the future is the data access DSL--bringing direct data access into the language itself, with SQL being used for obscure situations where the DSL doesn't suffice.

  12. Zope DB (ZODB) and Durus - two Python object databases - have been around for some time now, the former for quite some time indeed; the latter is a simpler re-implementation of the former.

    I like to say both are the database Python programmers already know, since we "store" data in the OODB using familiar Python constructs - mostly lists and mappings.

    By inheriting or building upon a few basic classes (PersistentObject, PersistentList, PersistentDict, BTree) you can build simple or complex data structures where all attributes are persistent (or even selectively so). No fighting with un-pythonic idioms or ORMs that get in the way.

    I found that the transition from thinking relationally to thinking pythonically (?) was the primary barrier to becoming productive, but that period did not last.

    These days I do not automatically reach for Postgres or Oracle or SQL Server to build solutions. Durus gets first crack. Not having to deal with ORM layers, nor SQL, is a liberating treat. Of course some applications and business situations do scream out for SQL / RDBs but many do not.
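
The composable query tree described in comment 7 (boolean queries against an SQLite db, with the string-hackery confined to one module) might look roughly like the following Python sketch. It is not the commenter's code, which isn't shown; the node classes `Eq`, `And`, `Or`, `Not`, the function `compile_query`, and the `people` table are all hypothetical names chosen for illustration.

```python
import sqlite3
from dataclasses import dataclass


# Hypothetical node types for a boolean query tree.
@dataclass(frozen=True)
class Eq:
    column: str
    value: object


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Or:
    left: object
    right: object


@dataclass(frozen=True)
class Not:
    inner: object


def compile_query(node):
    """Turn a query tree into (sql_fragment, params). All string building lives here."""
    if isinstance(node, Eq):
        # Column names cannot be bound as parameters, so only allow identifiers.
        if not node.column.isidentifier():
            raise ValueError(f"bad column name: {node.column!r}")
        return f"{node.column} = ?", [node.value]
    if isinstance(node, (And, Or)):
        op = "AND" if isinstance(node, And) else "OR"
        left_sql, left_params = compile_query(node.left)
        right_sql, right_params = compile_query(node.right)
        return f"({left_sql} {op} {right_sql})", left_params + right_params
    if isinstance(node, Not):
        inner_sql, inner_params = compile_query(node.inner)
        return f"(NOT {inner_sql})", inner_params
    raise TypeError(f"unknown node: {node!r}")


def find_names(conn, tree):
    # Values travel as bound parameters, never spliced into the SQL text.
    where, params = compile_query(tree)
    rows = conn.execute(f"SELECT name FROM people WHERE {where} ORDER BY name", params)
    return [row[0] for row in rows]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (name TEXT, city TEXT, role TEXT)")
    conn.executemany(
        "INSERT INTO people VALUES (?, ?, ?)",
        [("ann", "Oslo", "dev"), ("bob", "Oslo", "dba"), ("cy", "Rome", "dev")],
    )
    # Developers who are not in Rome.
    tree = And(Eq("role", "dev"), Not(Eq("city", "Rome")))
    return find_names(conn, tree)
```

Callers compose `And`, `Or`, `Not`, and `Eq` values like any other data, and only `compile_query` ever touches SQL text, which is the separation comment 7 describes. The final step is still a raw SQL string, which is exactly the part that comment wishes were unnecessary.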

