Cleaning up Old Code

Published by marco on

Once you’ve been coding for a while, you’ll probably have quite a pile of code that you’ve written and are regularly using. It’s possible that you’ve got some older code in use that just works and on which you rely every day. At some point, though, you realize that you have to get back in there and fix a few things. That happened recently with the upgrade of the earthli WebCore and attendant applications from PHP4 to PHP5 (which is ongoing). The earthli codebase was born in 1999 and was originally designed to run on PHP3. It has been quite aggresively upgraded and rewritten since then and is thus in pretty decent condition, from the design and stability side of things.

The code formatting, however, was old-style and broke a few cardinal rules I’d picked up since 1999. There were two problems in particular that I wanted to address:

  1. Single line statements within conditionals or loops generally lacked braces, which is:
    1. potentially dangerous
    2. difficult to read
    3. confusing for the Zend debugger
  2. Many final return statements in functions or methods were nested within a useless else statement, e.g.:
    if ($something_is_true)
      return true;
    else
      return false;

Instead of just living with it, however, I did some global search/replace with regular expressions kung fu to get the code back up to snuff. I used the PCRE support in Zend Studio build 20090119, which I assume just uses the standard Eclipse search/replace support. All operations were applied solution-wide with relatively little trouble.

First, I searched for if-statements whose contents did not start with an opening curly bracket:

Search: ([ ]+)if \(([^\n]+)\)\n[ ]+([^ {][^\n]+)
Replace: \1if (\2)\n\1{\n\1  \3\n\1}

From there, it’s relatively easy to find/replace all single-line else-statements, by searching for the following:

Search: ([ ]+)else\n[ ]+([^ {][^\n]+)
Replace: \1else\n\1{\n\1  \2\n\1}

Then, I did the more esoteric elseif:

Search: ([ ]+)elseif \(([^\n]+)\)\n[ ]+([^ {][^\n]+)
Replace: \1elseif (\2)\n\1{\n\1  \3\n\1}

And finally, I replaced all loop constructs

Search: ([ ]+)foreach \(([^\n]+)\)\n[ ]+([^ {][^\n]+)
Replace: \1foreach (\2)\n\1{\n\1  \3\n\1}
Search: ([ ]+)for \(([^\n]+)\)\n[ ]+([^ {][^\n]+)
Replace: \1for (\2)\n\1{\n\1  \3\n\1}

Once I’d normalized all of the else-statements, I could clean up else-statements that included only a return-statement.

Search: ([ ]+)else\n[ ]+\{\n[ ]+return ([^\n]+)\n[ ]+\}\n
Replace: \n\n\1return \2\n

There are a lot more interesting things you can to globally alter your code if you’re willing to put some time into building your regular expressions. Legibility is better, debugging works better and there are far fewer warning reported by the compiler.