Friday, October 24, 2008

Verbatim: How To Find Patterns In Code With Towelie



Dude, 140 characters? You're insane. It's not possible. Besides, I have Twitter on a temporary hosts-file ban.

First off, your lowest-hanging fruit is Jakob Dunphy's Crufty:

http://github.com/jdunphy/crufty/tree/master

Second off, if you're feeling brave, check out Towelie's MVC branch:

http://github.com/gilesbowkett/towelie/tree/mvc

It's kind of on hold right now because I've been doing a bunch of Archaeopteryx stuff for RubyConf, but you should be able to do something like this:

<macbook of doom:giles> [10-23 13:27] ~/programming/web_apps/your_app
! irb -r ../towelie/lib/towelie.rb
>> @t = Towelie.new
>> @t.parse "app/models"
>> puts @t.duplicates


That'll get you a list of all exact duplicates.
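If you're wondering what "exact duplicates" boils down to, here's a minimal sketch of the idea. It is not Towelie's actual code: the Definition struct, the exact_duplicates method, and the sample data are all made up for illustration, and I'm assuming each method body is available as a nested array.

```ruby
# Hypothetical record: where a method lives, its name, and its body as a nested array.
Definition = Struct.new(:file, :name, :body)

# Group definitions by body and keep only the groups with more than one member.
def exact_duplicates(definitions)
  definitions.group_by(&:body).values.select { |group| group.size > 1 }
end

definitions = [
  Definition.new("app/models/user.rb",  :label, [:call, [:ivar, :@name], :to_s]),
  Definition.new("app/models/group.rb", :title, [:call, [:ivar, :@name], :to_s]),
  Definition.new("app/models/post.rb",  :slug,  [:call, [:ivar, :@title], :downcase])
]

exact_duplicates(definitions).each do |group|
  puts group.map { |definition| "#{definition.file}##{definition.name}" }.join(" == ")
end
```

Grouping on the body rather than the name is the whole trick: two methods with different names but identical bodies land in the same group.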



(The only reason @t's an ivar is that I have a lot of one-character aliases in my IRB, so to be safe I always make one-character demo vars in IRB into ivars.)

Parsing for patterns is harder because that's a relatively abstract request. Towelie has some diff functionality which allows you to specify the number of nodes by which a method definition differs. You can use that to find patterns in method definitions, but only in method definitions, and it could be a little involved.
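To make "differs by N nodes" concrete, here's a rough sketch of one way to count it. This is an assumption about the general technique, not a copy of what Towelie does internally: node_counts, node_difference, and near_duplicates are hypothetical names, and the method definitions are hand-written nested arrays standing in for parsed code.

```ruby
# Count occurrences of each node in a flattened nested array.
def node_counts(sexp)
  sexp.flatten.each_with_object(Hash.new(0)) { |node, counts| counts[node] += 1 }
end

# Number of nodes in `a` that have no matching node left over in `b`.
def node_difference(a, b)
  remaining = node_counts(b)
  a.flatten.count do |node|
    matched = remaining[node] > 0
    remaining[node] -= 1 if matched
    !matched
  end
end

# Pairs of definitions that differ by at least one node but no more than `threshold`.
def near_duplicates(definitions, threshold)
  definitions.combination(2).select do |a, b|
    difference = [node_difference(a, b), node_difference(b, a)].max
    difference.between?(1, threshold)
  end
end

full_name  = [:defn, :full_name,  [:args], [:call, [:ivar, :@first], :+, [:ivar, :@last]]]
short_name = [:defn, :short_name, [:args], [:call, [:ivar, :@first], :+, [:ivar, :@initial]]]

near_duplicates([full_name, short_name], 2).each { |a, b| puts "#{a[1]} ~ #{b[1]}" }
```

Note that this compares every pair, so it gets slow fast as the number of definitions or the threshold grows.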

The MVC branch has UI improvements, but at the cost of added complexity. Really the diff stuff should live in its own branch. I fucked that up. So you have to do this:

>> @m = Model.new
>> @v = View.new
>> @m.parse "app/models"
>> @m.diff(1).each {|different| puts @v.to_ruby(different)}


Unfortunately you'll notice that the duplicates stuff gives you filenames as well as the methods themselves, while the diff output is less readable: no filenames, and nothing is correlated to indicate what each result differs from. That's because I set up a cool ERB template for duplicates but diff doesn't have one yet. Actually now that I think about it, it might have a regular diff in master. And I might have an incomplete thing set up to collect that filename info, too. I don't remember now.

In theory, btw, it should be possible to enable some pretty cool ANSI colors BS via ERB, but my ultra-naive first attempt went kaboom and I've not had the time to take Wirble apart and remind myself how ANSI colors work. In fact technically I'm pretty sure that Wirble's ANSI strategy, plus Jamis Buck's Syntax gem, mean that one day I may actually have ANSI syntax coloring built into Towelie. Still way off in the happy land of vaporware, though.
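For what it's worth, terminal colors generally come down to ANSI escape sequences wrapped around the text, and ERB will happily emit those. Here's a minimal sketch under that assumption. The COLORS table, colorize, render, and the template are all hypothetical, not anything that exists in Towelie or Wirble.

```ruby
require "erb"

# ANSI foreground color codes; "\e[0m" resets the terminal to its default.
COLORS = { red: 31, green: 32, yellow: 33 }.freeze

# Wrap text in the escape sequence for the given color, then reset.
def colorize(text, color)
  "\e[#{COLORS.fetch(color)}m#{text}\e[0m"
end

# One line of report output: a yellow filename followed by a green method name.
TEMPLATE = ERB.new("<%= colorize(filename, :yellow) %>: <%= colorize(method_name, :green) %>")

# The template sees this method's local variables through `binding`.
def render(filename, method_name)
  TEMPLATE.result(binding)
end

puts render("app/models/user.rb", "full_name")
```

The escape characters have to survive whatever the template does to its output, so it's worth checking the raw string with inspect if the colors don't show up.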

Anyhoo, obviously, you can scale diff(1) up to diff(x), although it might set your machine on fire, and I think the specs only test it as far as diff(2) or something.

If you're going to fuck around with the diff, it's actually quite powerful, but that power is damn near impossible to see. I would very strongly advocate messing with the code a little bit, taking some time to hack up a decent UI in ERB, just sorting the elements in diff(x) into groups so you can see what you're diffing. It's absolutely possible to isolate the different elements in a diff, i.e., "xyz" differs from "xy" by "z," but it's not in there yet.
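Here's a sketch of what isolating and grouping could look like, assuming the things being diffed have already been flattened into plain arrays. isolate_difference and group_by_difference are hypothetical names; nothing like this is in Towelie yet.

```ruby
# Elements of `longer` that have no matching element left over in `shorter`.
def isolate_difference(longer, shorter)
  remaining = shorter.dup
  longer.reject do |element|
    index = remaining.index(element)
    remaining.delete_at(index) if index
    !index.nil?
  end
end

# Sort variants into groups keyed by how they differ from a common base.
def group_by_difference(base, variants)
  variants.group_by { |variant| isolate_difference(variant, base) }
end

p isolate_difference(%w[x y z], %w[x y])

groups = group_by_difference(%w[x y], [%w[x y z], %w[x z y], %w[x y w]])
groups.each do |difference, members|
  puts "differs by #{difference.inspect}: #{members.map(&:join).join(', ')}"
end
```

Feeding those groups into an ERB template would give you the "what differs from what" view the raw diff output is missing.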