Daily Archives: April 26, 2012

Crushing Code and Shaving Chars

Who has heard of Code Golf? I hadn’t up until about 3 years ago when I first stumbled across some on Stack Overflow. I faintly recall the horror as I gazed upon some of the most cryptic code ever – somewhat akin to the feeling I had the first time I was shown a regular expression.

According to Wikipedia, “Code golf is a type of recreational computer programming competition in which participants strive to achieve the shortest possible code that implements a certain algorithm.”

Now I would never claim to be able to hold my own against some of the veteran golfers, but I do find myself growing more and more interested in how I can shave off a few bytes here and there. Such was the case today when I set out to answer a question on Stack Overflow.

The OP was asking how he might take a single-dimension array of words, sort them alphabetically, and then print them out. Each new group of words, categorized by their first letter, should be preceded by that letter. So you would first see “A”, followed by all “a” words, then “B”, followed by all “b” words, and so on. I’ve personally wanted to do this in the past myself.

After a few minutes I had posted the following:

$currChar = '';
sort( $aTest );

foreach ( $aTest as $word ) {
  $firstChar = substr( $word, 0, 1 );
  if ( $firstChar != $currChar ) {
    $currChar = $firstChar;
    echo $currChar;
  }
  echo $word;
}

This is ugly, but fairly straightforward. You cycle through the array, printing out the first letter of the current word whenever it differs from the letter seen on the previous iteration. Simple enough. But then one of the members pointed out in a comment that instead of writing

substr( $word, 0, 1 );

I could just access the first letter as though the word itself were an array:

sort( $aTest );

foreach ( $aTest as $word ) {
  if ( $word[0] != $currChar ) {
    $currChar = $word[0];
    echo $currChar;
  }
  echo $word; 
}
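
To check that the two forms really agree, here is a minimal sketch. The helper name firstLetter and the sample words are made up for illustration and were not part of the original answer. It assumes non-empty, single-byte (ASCII) words, since both forms read one byte rather than one character.

```php
<?php
// Hypothetical helper for illustration: first character via string offset.
function firstLetter(string $word): string
{
    // $word[0] reads the first byte, the same byte that
    // substr($word, 0, 1) returns for a non-empty string.
    return $word[0];
}

foreach (['apple', 'banana', 'kiwi'] as $word) {
    // Prints the word followed by both results, side by side.
    echo $word, ': ', firstLetter($word), ' / ', substr($word, 0, 1), "\n";
}
```

The offset form saves the function call and its arguments, which is where the shaved characters come from.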

Now it had been a while since I did that, so I was pleased to be reminded of it. Then that little spark of brevity began to grow, and I wanted to see where else I could shave off some chars. My eyes were then drawn to the assignment and echo portion:

$currChar = $word[0];
echo $currChar;

That could be shortened so that the assignment and echo take place all at once:

sort( $aTest );

foreach ( $aTest as $word ) {
  if ( $word[0] != $currChar ) {
    echo ( $currChar = $word[0] );
  }
  echo $word; 
}
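
This works because an assignment in PHP is itself an expression that evaluates to the value being assigned. A small sketch of that idea follows; the helper name rememberAndReturn and the sample values are my own, purely for illustration.

```php
<?php
// Hypothetical helper: stores $value in $slot (passed by reference) and
// returns the result of the assignment expression itself.
function rememberAndReturn(string &$slot, string $value): string
{
    return ($slot = $value);
}

$currChar = '';
echo rememberAndReturn($currChar, 'a'), "\n"; // echoes the value that was just assigned
echo $currChar, "\n";                          // $currChar now holds that same value
```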

So now it’s down to a single if statement, and a trailing echo. I wondered if I could just load this up into a ternary, so I set my hand to do that:

sort( $aTest );

foreach ( $aTest as $word ) {
  echo ( $word[0] != $currChar ) 
    ? ( $currChar = $word[0] ) . $word 
    : $word; 
}

At this point, I was getting a little OCD about my second and third expression. They seemed unbalanced. The second one was setting the new current category letter, as well as echoing out the word. The third one was just echoing out the word. This is when my mind returned to something Brendan Eich called an “abusage” – using && in somewhat of an if-statement fashion.

$cond && ($foo = 'bar');

The second part is only evaluated when the first part is determined to be true. If the first part is false, the operation short-circuits and the second part is never touched. I realized I could use this to test the equality of the current category letter and the first char of the current word. If they were not equal, I could use the right hand side of the operator to record the new letter:

sort( $aTest );

foreach ( $aTest as $word ) {
  echo ( $word[0] != $currChar ) && ( $currChar = $word[0] ) 
    ? $currChar . $word 
    : $word; 
}
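
A quick way to convince yourself of the short-circuit behaviour is the sketch below. The function name letterChanged and the sample inputs are my own, for illustration only. It also shows one caveat I would expect with this trick: && tests the result of the assignment, so a letter that PHP treats as false (the string "0") makes the whole condition false even though the letter has changed.

```php
<?php
// Hypothetical helper mirroring the golfed condition
// ($word[0] != $currChar) && ($currChar = $word[0]).
// $current is passed by reference so the assignment is visible to the caller.
function letterChanged(string $word, string &$current): bool
{
    return ($word[0] != $current) && ($current = $word[0]);
}

$current = 'a';
var_dump(letterChanged('avocado', $current)); // same letter: the right side never runs
var_dump(letterChanged('banana', $current));  // new letter: $current is updated to 'b'
var_dump(letterChanged('0day', $current));    // "0" is assigned but is falsy, so the result is false
```

For a list of ordinary words this edge case never comes up, but it is worth knowing before reusing the pattern on arbitrary strings.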

Awesome. Looking very slim. Now that I have only one statement within my foreach loop, I can strip away the curly braces. I can also save even more space by moving away from long variable names. Instead of “word”, I can simply use “w”. Instead of “currChar”, I can use “l” (for “letter”). I can also collapse my ternary operator onto one line:

sort( $aTest );

foreach ( $aTest as $w ) echo ( $w[0] != $l ) && ( $l = $w[0] ) ? $l . $w : $w ;

Nearly there. Lastly, we can strip out all of the white space possible. Unfortunately, the foreach syntax still needs white space before the keyword “as” (otherwise it would be read as part of the variable name), so I have kept the spaces around it.

sort($aTest);foreach($aTest as $w)echo($w[0]!=$l)&&($l=$w[0])?$l.$w:$w;
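
To check that nothing broke along the way, the sketch below runs a readable version and the one-liner on the same input and compares the results. The function names and sample words are my own, and I initialise $l (which the one-liner leaves undefined) so the comparison runs without an undefined-variable warning.

```php
<?php
// Readable version wrapped in a function (name and signature are illustrative).
function groupedList(array $words): string
{
    sort($words);
    $out = '';
    $currChar = '';
    foreach ($words as $word) {
        if ($word[0] != $currChar) {
            $currChar = $word[0];
            $out .= $currChar;
        }
        $out .= $word;
    }
    return $out;
}

// Golfed version; its echo output is captured with output buffering
// so that it can be returned and compared as a string.
function golfedList(array $aTest): string
{
    $l = ''; // assumption: added here, not present in the original one-liner
    ob_start();
    sort($aTest);foreach($aTest as $w)echo($w[0]!=$l)&&($l=$w[0])?$l.$w:$w;
    return ob_get_clean();
}

$sample = ['pear', 'apple', 'banana', 'avocado'];
var_dump(groupedList($sample) === golfedList($sample));
```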

In the end we’ve gone from about 195 characters and 11 lines down to 1 line and roughly 70 characters! Granted, that’s nothing compared to what others have achieved, but it’s awesome to see this solution iteratively collapse before my very eyes.

With some clever substring alternatives, ternary operators, and hacking the way logical operators work, we’re able to reduce a dandy solution into a horrific and cryptic trace of insanity. I don’t suggest you actually produce code like this when you ship (outside of the obvious minification), as it’s far better to write readable code than it is to write short code.

In the end, I was able to switch away from the original array declaration and use the deprecated split function to save even more bytes:

$a=split(",","apple,pear,banana,kiwi,pineapple,strawberry");
sort($a);foreach($a as $w)echo($w[0]!=$l)&&($l=$w[0])?$l.$w:$w;
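
One note for anyone trying this today: split() was deprecated in PHP 5.3 and removed in PHP 7, so on current versions explode() is the usual stand-in, at the cost of two extra characters. A hedged sketch of that variant follows; the wrapper function, the output buffering and the $l initialisation are my additions for illustration.

```php
<?php
// Illustrative wrapper (not from the original answer) around the explode() variant.
function golfedFromString(string $csv): string
{
    $l = ''; // assumption: initialised to avoid an undefined-variable warning
    ob_start();
    $a=explode(",",$csv);
    sort($a);foreach($a as $w)echo($w[0]!=$l)&&($l=$w[0])?$l.$w:$w;
    return ob_get_clean();
}

echo golfedFromString("apple,pear,banana,kiwi,pineapple,strawberry"), "\n";
```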

So how about you? Do you engage in this type of behavior? Have any tricks of the trade to share? Place them in the comments below! Or perhaps you can shorten my code even further – I’d love to see what you come up with!