Tuesday, April 19, 2016

Rails, RSpec, Poems, And Synthesizers

I've been re-watching Gary Bernhardt's classic series of screencasts Destroy All Software, in part because I'm eagerly anticipating the new edition, Destroy All Software: Civil War. In this edition, David Heinemeier Hansson will face down Bruce Wayne, and everybody will have to pick a side. I'm really looking forward to it. I think, also, that Luke turns out to be his own cousin, or something, but a) I think that's just a rumor, and b) if you know, don't tell me, because spoilers.

Anyway, there's a screencast which covers conflicts between the existing "rules" of object-oriented programming, specifically the inherent conflict between Tell Don't Ask and the Single Responsibility Principle. I'm into this topic, because my book Rails As She Is Spoke is mostly about similar conflicts.

One interesting thing that comes up in this screencast, mostly in passing, is that Rails enthusiastically embraces and encourages Law of Demeter violations. In fact, if you build Rails apps, you've probably seen a line of code like this now and then:

@user.friends.logged_in.where(:last_login < 4.days.ago)

This code subordinates the Law of Demeter to the Rule Of Thumb That Rails Code Should Look Like Cool Sentences. Lots of other things in Rails reveal this same prioritization, if you look at them closely. In fact, when Mr. Hansson wrote a blog post about concerns, he explicitly stated it:

"It’s true that [avoiding additional query objects in complex model interactions] will lead to a proliferation of methods on some objects, but that has never bothered me. I care about how I interact with my code base through the source."

It's extremely tempting to laugh this off. "Wow, this guy prefers pretty sentences to considering the Law of Demeter, what a n00b." And I am definitely not going to endorse that blog post, or the idea of concerns, across the board. But I also think laughing off DHH's priorities here would be a mistake.
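
To make the tradeoff concrete, here's a minimal plain-Ruby sketch of the two styles side by side. It uses ordinary arrays and select rather than ActiveRecord and where, and Friend, User, and friends_last_seen_before are names I made up for illustration. Nothing here is how Rails implements anything.

```ruby
# Plain-Ruby stand-ins for the Rails objects; no ActiveRecord involved.
# Friend, User, and friends_last_seen_before are hypothetical names.
Friend = Struct.new(:name, :logged_in, :last_login)

class User
  attr_reader :friends

  def initialize(friends)
    @friends = friends
  end

  # Demeter-friendly wrapper: callers ask the user, not the user's friends.
  def friends_last_seen_before(cutoff)
    friends.select { |f| f.logged_in && f.last_login < cutoff }
  end
end

DAY = 24 * 60 * 60 # seconds in a day
now = Time.now
user = User.new([
  Friend.new("ann", true,  now - 1 * DAY),
  Friend.new("bob", true,  now - 9 * DAY),
  Friend.new("cat", false, now - 2 * DAY)
])
cutoff = now - 4 * DAY

# Rails-style: reach through the user into its friends and filter there.
chained = user.friends.select(&:logged_in).select { |f| f.last_login < cutoff }

# Demeter-style: one more method on User, one less dot at the call site.
wrapped = user.friends_last_seen_before(cutoff)

puts chained.map(&:name).inspect
puts wrapped.map(&:name).inspect
```

Both calls return the same friends. The Demeter-friendly version buys encapsulation at the cost of yet another method on User, which is exactly the proliferation the quote above says has never bothered its author.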

Consider RSpec, for the sake of comparison. RSpec prioritizes a sentence-y style of code, which tries hard to look like English, over just about any other consideration, as far as I can tell. And RSpec has an Uncanny Valley problem. This code has both an Uncanny Valley problem, and a Law of Demeter problem:

@user.should_receive(:foo).with(:bar).and_return(:baz).once

By contrast, it's very interesting that Rails only has Law of Demeter problems, when it does the same kind of thing. The Rails valley is not uncanny at all. When it tries to make Ruby look like English, it stops a little earlier than RSpec does, acknowledging the fakeness and the Ruby-ness of the "English." The result is code which is English-like enough to be incredibly convenient and easy to read, but which doesn't try so hard to be English that you can't reason about its API and are forced to memorize everything instead.

Rails encourages specific Demeter violations as a set of special, privileged pathways through unrelated objects and/or objects which exist only to serve as those pathways in the first place. And it works. I'm not saying Rails is perfect — if you've read my book, or indeed ever read anything I've written about Rails since about 2011, then you know I don't think that — but I don't think its cavalier attitude towards the Law of Demeter would even make it onto a top ten list of things I want to change about Rails.

Of course, the whole point of that screencast I mentioned, which points out that the "rules" of OOP conflict with each other from time to time, is that these rules are not rules at all, but merely guidelines. So it's no surprise that they involve tradeoffs. What is surprising is that I don't think there's any real name for what Rails chooses to prioritize over Demeter, except perhaps "readability."

Frankly, it's moments like this when I feel privileged to have studied the liberal arts in college, and where I feel sorry for programmers who studied computer science instead, because there's no terminology for this in the world of computer science at all. Any vocabulary we could bring to bear on this topic would be from the worlds of literature, poetry, and/or language. I know there's a widespread prejudice against the liberal arts in many corners of the tech industry, where things like literature and poetry are viewed as imprecisely defined, arbitrary, or "made up," but every one of those criticisms applies to the Law of Demeter. It's not really a law. It's just some shit that somebody made up. Give credit to the poets for this much: nobody ever pretended that the formal constraints for haikus or sonnets are anything but arbitrary.

Let's look again at our two lines of example code:

@user.should_receive(:foo).with(:bar).and_return(:baz).once # no
@user.friends.logged_in.where(:last_login < 4.days.ago) # ok

If you were to write one of these lines of code, it would feel like you were writing in English. The other line could function as an English sentence if you changed the punctuation. But what's interesting is that these two statements don't apply to the same line.

This one feels harder to write, yet it functions almost perfectly as English:

@user.should_receive(:foo).with(:bar).and_return(:baz).once
# "user should receive foo, with bar, and return baz, once"

Writing this one feels as natural as writing in English, but falls apart when treated as English:

@user.friends.logged_in.where(:last_login < 4.days.ago)
# "user friends logged in where last login less than four days ago"

These are extremely subjective judgements, and you might not agree with them. Maybe the RSpec code isn't such a good example. What I find difficult about RSpec is remembering fiddly differences like should_receive vs. should have_selector. I'm never sure when RSpec's should wants a space and when it wants an underscore. Why is it should have_selector, and not should_have_selector? Why is it should_receive, and not should receive? RSpec has two ways to connect should to a verb, and there doesn't seem to be any consistent reason for choosing one or the other.
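
As far as I understand it, the split does have a mechanical origin, even if knowing that is no help when you're trying to remember which spelling goes where. In one case should is a method that takes a matcher object, and the verb is a separate method which builds that matcher. In the other, the verb is baked into a single longer method name. Here's a toy model of both in plain Ruby. It is not RSpec's real implementation, and ToyMatcher, ToyExpectations, and Page are names invented for this sketch.

```ruby
# Toy model of the two spellings; this is NOT RSpec's real implementation.
# ToyMatcher, ToyExpectations, and Page are hypothetical names.

# Style 1, "should have_selector": `should` is one method, and the verb is a
# separate method that builds a matcher object to hand to it.
ToyMatcher = Struct.new(:description, :check) do
  def matches?(actual)
    check.call(actual)
  end
end

module ToyExpectations
  def should(matcher)
    raise "expected #{inspect} to #{matcher.description}" unless matcher.matches?(self)
    true
  end

  # Style 2, "should_receive": the verb is part of the method name, so there
  # is no matcher object and nothing to pass to a bare `should`.
  def should_receive(message)
    (@expected_messages ||= []) << message
    self
  end

  def expected_messages
    @expected_messages || []
  end
end

# Builds the matcher that `should` consumes.
def have_selector(css)
  ToyMatcher.new("have selector #{css}", ->(page) { page.selectors.include?(css) })
end

class Page
  include ToyExpectations
  attr_reader :selectors

  def initialize(selectors)
    @selectors = selectors
  end
end

page = Page.new(["h1", "p.intro"])
page.should have_selector("h1") # space: `should` called with a matcher
page.should_receive(:render)    # underscore: a single, longer method name
puts page.expected_messages.inspect
```

Seen this way, the space and the underscore are two different mechanisms wearing the same English word. That explains where the inconsistency comes from, but the call site gives you no hint about which mechanism you're looking at.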

In actual English grammar, there are consistent connections between words, whereas with RSpec, you kind of just have to remember which of several possible linkage principles the API chose to use at any given moment. To be fair, English is a notoriously idiosyncratic language full of inconsistencies and corner cases, so writing RSpec might actually feel like writing English if English isn't your first language. But English is my first language, so for me, writing RSpec brings forth a disoriented sensation that writing English does not.

(Tangent: because I'm a first-generation American, and England is "the old country" for me, English is not only my first language, but my second language as well.)

Anyway, the question of why Rails feels more natural to me than RSpec — and I really think it's not just me, but quite a few people — remains unanswered.

There is another way to approach this. This is an analog synthesizer:

These machines have a type of thing called envelopes.

Briefly, an envelope is a way to control an aspect of the synthesizer. This synth has one envelope for controlling its filter, and another for controlling its amp (or volume). It doesn't matter right now what filters and amps are, just that there are two dedicated banks of sliders for controlling them. Likewise, it doesn't matter how envelopes work, but you should understand that there are four controls: Attack, Decay, Sustain, and Release.

Now look at this synthesizer:

The envelope controls are much more compact:

This synthesizer again has one envelope for its filter, and another for its amp. But this synthesizer wants you to use the same single knob not only for each envelope, but even for each of the four parameters on each envelope. Where the other machine had eight sliders, this machine has one knob. You press the Attack button in the Filter row to make the knob control the filter attack. You press the Release button in the Volume row (as pictured) to make the knob control the amp release. (And so on.)
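
If it helps to see the two front panels as code, here's a rough sketch in Ruby. DedicatedPanel and SharedKnobPanel are names I invented, the numeric values are arbitrary, and this models only the controls, not any sound.

```ruby
# Hypothetical models of the two front panels described above.

# First synth: one dedicated slider per parameter, eight in all.
class DedicatedPanel
  def initialize
    @values = Hash.new(0)
  end

  # Every parameter has its own control, so each call names it directly.
  def move_slider(target, stage, value)
    @values[[target, stage]] = value
  end

  def value(target, stage)
    @values[[target, stage]]
  end
end

# Second synth: buttons pick a parameter, and one knob sets whichever
# parameter is currently selected.
class SharedKnobPanel
  def initialize
    @values = Hash.new(0)
    @selected = [:filter, :attack]
  end

  def press(target, stage)
    @selected = [target, stage]
  end

  def turn_knob(value)
    @values[@selected] = value
  end

  def value(target, stage)
    @values[[target, stage]]
  end
end

dedicated = DedicatedPanel.new
dedicated.move_slider(:amp, :release, 7)

shared = SharedKnobPanel.new
shared.press(:amp, :release) # the Release button in the Volume row
shared.turn_knob(7)

puts dedicated.value(:amp, :release)
puts shared.value(:amp, :release)
```

Both panels end up in the same place, but the second one gets there with one knob plus a piece of hidden state: which button you pressed last.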

Do hardware engineers have a word for this? If they do, my bad, because I don't know what it would be. User experience designers have a related word — affordances, which is what all these controls are — but I don't know if they have a word for when you dedicate affordances on a per-function basis, vs. when you decide to double up (or octuple up). It is, again, a tradeoff, and as far as I can tell, it's a tradeoff without a name.

But it's the same basic tradeoff that Rails and RSpec make when they pretend to be the English language, and somehow, Rails gets this tradeoff right, while RSpec gets it wrong. When I need to recall a Rails API which mimics English, it's easy; when I need to recall an RSpec API which mimics English, there's a greater risk of stumbling. With should_receive vs. should have_selector, the relationship between the API's affordances and its actions is out of balance. RSpec here has the opposite problem from the synthesizer with one knob for every envelope parameter. Here, RSpec's got an extra affordance — using an underscore vs. using a space — which has no semantic significance, but still takes up developer attention. It's a control that does nothing, but which you have to set correctly in order for the other controls to not suddenly break. Rails, by contrast, has a nice balance between affordances and actions in its APIs.

Sunday, April 10, 2016

The Fallacies Of Distributed Coding

If you only ever write code which runs on one machine, and only ever use apps which have no networked features, then computers are deterministic things. It used to be a given, for all programmers, that computers were fundamentally deterministic, and thanks to the internet, that just isn't true any more. But it's not just the rise of the internet, with its implicit mandate that all software must become networked software, which has killed the idea that programming is inherently deterministic. Everybody's code became a distributed system in a second way, too.

If you write Ruby, your code is only secure if RubyGems.org is secure. If you write Node.js, your code is only secure if npmjs.com is secure. And for the vast majority of new projects today, your code is only secure if git and GitHub are secure.

Today "your" code is a web of libraries and frameworks. All of them change on their own schedules. They have different authors, different philosophies, different background assumptions. And all the fallacies of distributed computing prove equally false when you're building applications out of extremely modular components.

  1. The network is reliable. This is obviously a fallacy with actual networks of computers, but "social coding," as GitHub calls it, requires a social network, with people co-operating with each other and getting stuff done. This network mostly exists, but is prone to random outages.
  2. Latency is zero. The analogy here is with the latency between the time you submit a patch and the moment it gets accepted or rejected. If you've ever worked against a custom, in-house fork of a BDD library whose name.should(remain :unmentioned), because version 1.11 had a bug, which version 1.12 fixed, but version 1.12 simultaneously introduced a new bug, and your patches to fix that new bug were on hold until version 1.13, then you've seen this latency in action, and paid the price.
  3. Bandwidth is infinite.
  4. The network is secure. Say you're a law enforcement agency with a paradoxical but consistent history of criminality and espionage against your own citizens. Say you try to get a backdoor installed on a popular open source package through legal means. Say you fail. What's to stop you from obtaining leverage over a well-respected open source programmer by discovering their extramarital affairs? I've already given you simpler examples of the network being insecure, a few paragraphs above. I'm hoping this more speculative one is purely hypothetical, but you never know.
  5. Topology doesn't change.
  6. There is one administrator.
  7. Transport cost is zero. Receiving new code updates, and integrating them, requires developer time.
  8. The network is homogeneous.

Open source has scaled in ways which its advocates did not foresee. I was a minor open source fan in the late 1990s, when the term first took hold. I used Apache and CPAN. I even tried to publish some Perl code, but I was a newbie, unsure of my own code, and the barriers to entry were much higher at the time. Publishing open source in the late 1990s was a sign of an expert. Today, all you have to do is click a button.

The effect of this was to transform what it meant to write code. It used to be about structuring logic. Today it's about building an abstract distributed system of loosely affiliated libraries, frameworks, and/or modules in order to create a concrete distributed system out of computers sending messages to each other. The concrete distributed system is the easy part, and people get it wrong all the time. The abstract distributed system is an unforeseen consequence of the incredible proliferation of open source, combined with the fact that scaling is fundamentally transformative.