Thursday, July 30, 2009

Iron Man Perl, redux.

Badges? We ain't got no badges. We don't need no badges! I don't have to show you any stinkin' badges!

Matt Trout writes in his blog about almost being done with the Iron Man Perl Judging software. He wrote his beautiful modern perl code to pass a suite of tests. But when he put it all together, the system didn't work. So he put out an open call to find out what's up.

This seemed like a good opportunity to do something useful for the community. Also, a chance to read some modern, moose-y perl, and do my first git checkout. Oh, and do some debugging.

I spent much of Saturday afternoon hitting "Y/Ret" in cpan to get my 5.10 debian install up to date with the moose core. Once my system was current, I started running his tests. Sure enough, most of them passed. One of them didn't clean up after itself, so it failed on subsequent runs. But that was minor.

I started by digging into the heart of the matter: calculate.t, the test that verifies the date calculation routines. Why did the tests pass here but not with the demo data later, via plagger_loader.t? I squirrelled through the rest of the code, squinting at all the new funny moose bits, and got an overall view of the plan.

By now it was late Saturday night, and I was having trouble sleeping. My mother-in-law was visiting, so we'd given her the bed and I was on the couch in the living room. By Santa Monica standards it was super hot (maybe 80F). Since I couldn't sleep, I pulled the laptop back out and jumped back to check one more hunch: "what is the difference between the data for calculate.t and plagger_loader.t?"

AHA! The age-old problem of real world data not being nearly as pretty as test data. All of the post data created in calculate.t was neatly sorted by increasing age. The csv data, plucked from real logs, was a hodgepodge of sorting. Maybe it was alphabetical, but it sure wasn't reverse chronological.

A quick change was called for: sort the data by reverse date. Where to slip this into the API? 1) Sort the posts when reading the CSV in PlaggerLoader? 2) Sort the post array in Calculate.pm, in either check_both or check? Which change matches the intentions of the API designer? I don't know, that's up to you, Senor Trout.
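To make the ordering problem concrete, here is a minimal sketch. It assumes a checker that only compares each post with the next one in the list, which is my reading of the symptom rather than a copy of the Iron-Munger code. The max_gap_days helper and the bare epoch timestamps are hypothetical stand-ins for the real post objects.

```perl
use strict;
use warnings;
use List::Util qw(max);

my $day = 24 * 60 * 60;

# Hypothetical stand-in for a gap check (not the Iron-Munger code).
# It assumes @epochs is newest first and looks only at neighbours.
sub max_gap_days {
    my @epochs = @_;
    return 0 if @epochs < 2;
    return max(map { ($epochs[$_ - 1] - $epochs[$_]) / $day } 1 .. $#epochs);
}

# Four posts, one every 5 days, as seconds since the epoch.
my @tidy  = map { $_ * 5 * $day } reverse 0 .. 3;   # days 15, 10, 5, 0
my @messy = @tidy[2, 0, 3, 1];                      # days 5, 15, 0, 10

printf "tidy:   %d\n", max_gap_days(@tidy);
printf "messy:  %d\n", max_gap_days(@messy);
printf "sorted: %d\n", max_gap_days(sort { $b <=> $a } @messy);
```

With the tidy list the largest gap comes out as 5 days. The shuffled list reports 15 days for exactly the same posts, and sorting newest first brings it back to 5. Under this assumption, either place the sort goes would fix the symptom; the choice is only about which layer should own the ordering guarantee.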

This experience gave me my first interaction with git. See the next post for my blog entry on cloning remote read-only git to github.

These changes are now up on github.
http://github.com/spazm/Iron-Munger/tree/master

In the meantime, here's the ghetto-diff version:

PlaggerLoader.pm
Synopsis: add sort { $b->at <=> $a->at }

method _expand_posts_from_file(IO::All::File $file) {
  return [
    sort { $b->at <=> $a->at }
    map $self->_expand_post($_),
      @{$self->_expand_postspecs_from_file($file)},
  ];
}

Calculate.pm
Synopsis: add my @sorted_posts = sort {$b->at <=> $a->at} @posts; and call check on @sorted_posts instead of @posts.

sub check_both ($check, @posts) {
  my @sorted_posts = sort {$b->at <=> $a->at} @posts;
  return min(
    $check->(1, 10, @sorted_posts), # 10 days between posts
    $check->(4, 32, @sorted_posts), # 4 posts within any given 32 days
  );
}

There is something very satisfying about spending a day or more debugging in a process that eventually ends with adding five lines to the primary code base. It's a feeling of having cut through the accidental complexity to the heart of the matter. Finesse vs Force. Good thing I don't get paid by the KLOC.

Hope this helps! Now when do we get our badges?

1 comment:

Phillip Smith said...

Hey Andrew,

Many thanks for taking that challenge on. I was wondering when the Ironman "score board" might show up!

Phillip.