Thursday, April 11, 2013

Rewind: Analyzing Git History With Bash

Rewind is a small library of Git analysis scripts written in Ruby and bash. Its goal is to quickly extract meaningful context from the enormous amount of historical data that any Git repository holds.

One use case: you want to compare the respective authorship patterns of two forks of a library from GitHub. Maybe your company has a local gem forked from a popular gem, and you want to figure out how many unique changes your fork has.
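As a rough illustration of the fork comparison, here is a minimal sketch using plain git commands. The function names are hypothetical and are not Rewind's actual scripts, and the sketch assumes both histories are available as refs in a single clone (for example, the original gem added as a remote). It counts commits that exist only on the fork side; it does not detect changes that were cherry-picked under a different commit hash.

```bash
#!/usr/bin/env bash
# Sketch only: these function names are hypothetical, not part of Rewind.

# Count commits reachable from the fork ref but not from the upstream ref.
count_unique_commits() {
  local upstream_ref="$1" fork_ref="$2"
  git rev-list --count "${upstream_ref}..${fork_ref}"
}

# Tally those same commits per author, most active author first.
unique_commits_by_author() {
  local upstream_ref="$1" fork_ref="$2"
  git log --format='%an' "${upstream_ref}..${fork_ref}" | sort | uniq -c | sort -rn
}
```

With the original gem fetched as a remote named `upstream` (an assumed name), you might run `count_unique_commits upstream/master master` and then `unique_commits_by_author upstream/master master` to see who wrote the fork-only commits.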

Another use case: you're looking at a large library with a lot of files, and you've been told this library has a lot of technical debt. One way to track down technical debt is to find the files that have both a) the largest number of lines of code and b) the largest number of commits. Conversely, if you see a very small file with a very large number of commits in its history, people have probably refactored that file a lot.
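The size-versus-commits comparison can be sketched in a few lines of shell. This is an illustration under assumptions, not Rewind's own implementation: `file_churn_report` is a hypothetical name, `wc -l` counts physical lines rather than true lines of code, and paths containing spaces will break the column layout.

```bash
#!/usr/bin/env bash
# Sketch only: file_churn_report is a hypothetical name, not part of Rewind.

# Print "<commits> <lines> <path>" for every tracked file,
# sorted so the most frequently committed files come first.
file_churn_report() {
  local path commits lines
  git ls-files | while IFS= read -r path; do
    # Number of commits on the current branch that touched this path.
    commits=$(git rev-list --count HEAD -- "$path")
    # Physical line count of the working-tree copy.
    lines=$(wc -l < "$path" | tr -d ' ')
    printf '%s %s %s\n' "$commits" "$lines" "$path"
  done | sort -k1,1nr
}
```

Run it from the root of a repository with `file_churn_report | head -n 20`. Files with high numbers in both columns are candidates for a closer look, while a high commit count next to a low line count matches the heavily refactored small file described above.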

Rewind gives you numbers; you have to use good judgement to get useful insights from those numbers.