Friday, October 8, 2010

GRT : Guttman Rosler Transform

The Guttman-Rosler Transform is a technique from Uri Guttman and Larry Rosler for improving sort speed in perl.

The Guttman-Rosler transform runs faster than the orcish or Schwartzian transforms by avoiding the use of custom search subroutines. This is accomplished via pre and post transformation of the list. This allows the native optimizations of the perl sort function to shine.

Sort::External has a special affinity for the GRT or Guttman-Rosler Transform, also known as the "packed-default" sort. The algorithm is examined at length in "A Fresh Look at Efficient Perl Sorting", a paper by Uri Guttman and Larry Rosler, located as of this writing at http://www.sysarch.com/perl/sort_paper.html.

For many applications, the GRT is the most efficient Perl sorting transform. This document explores how to use it to solve common coding problems in conjunction with either Perl's built-in sort() function or Sort::External.

-- Sort::External::Cookbook

Synopsis:
  1. prefix each element with a synthetic key (string or numeric).
  2. sort.
  3. remove the prefix key from each element.

Example:

    
    my %colors = (
        "Granny Smith"     => "green",
        "Golden Delicious" => "yellow",
        "Pink Lady"        => "pink",
    );

  #A) standard sort:
    my @sorted_by_color = sort { $colors{$a} cmp $colors{$b} } keys %colors;

  #B) GRT sort:
    # 1. Encode
    my @sortkeys;
    while ( my ( $variety, $color ) = each %colors ) {
        push @sortkeys, sprintf( "%-6s%-16s", $color, $variety );
    }
    # @sortkeys = (
    #     "green Granny Smith    ",
    #     "yellowGolden Delicious",
    #     "pink  Pink Lady       ",
    # );

    # 2. Sort
    my @sorted_by_color = sort @sortkeys;    # no sortsub!

    # 3. Decode
    @sorted_by_color = map { substr( $_, 6 ) } @sorted_by_color;

Links:

Research Paper from Uri Guttman and Larry Rosler:
http://www.sysarch.com/Perl/sort_paper.html
Sort Sort::Maker on cpan:
http://search.cpan.org/perldoc?Sort::Maker
Sort::External::Cookbook
Http://Search.cpan.org/perldoc?Sort::External::Cookbook

No comments: