Skip to content

Code Readability Part 2, Code Structure

by Shawn Stratton on January 19th, 2009

After read­ing some of the com­ments from
Part 1 I’ve really been moti­vated to con­tinue this series (I’m sorry
for being late to approve com­ments.) While writ­ing com­ments can
really speak to the frame of mind, and the intent, at the time of
writ­ing it can only go so far to ensure that the reader and/or
main­tainer of your code can clearly read and under­stand the
cre­ativ­ity that you spent so long to write. I’ve had the plea­sure of
main­tain­ing a legacy appli­ca­tion devel­oped by peo­ple who were past
dead­line the sec­ond they had their assign­ment handed to them in the
past and it can get really inter­est­ing rather quickly when you see
how sloppy you can get when you are in such a hurry. Here are some
guide­lines I’ve given myself to ensure that the struc­ture is cor­rect
at the end of the day.

  • Break large files apart into
    smaller, more digestible pieces. This has become rather more
    impor­tant these days, I have sifted through a 20,000 line source
    file in the past, and there are some instances where grep just can’t
    help. Object Ori­ented code makes this sig­nif­i­cantly eas­ier as you
    can eas­ily write a small object that you can inter­act with and
    sep­a­rate it from your busi­ness logic. While I’m sure I could eas­ily
    come up with a sim­ple exam­ple of what I’m talk­ing about I think it
    would take up too much space here. Suf­fice it to say that gen­er­ally
    you should try to cut file length at about 1,000 lines, it makes
    search­ing a file a lit­tle more man­age­able and con­sum­ing said file
    more effi­cient for future devel­op­ers as our brains like things
    con­ve­niently cut down to size.

  • Using nam­ing con­ven­tions is an
    inte­gral part of mak­ing code eas­ier to read and write, for exam­ple,
    I ran into an error the other day where I wasn’t think­ing and called
    the class Trans­fer­ob­ject, I was able to quickly catch that error
    upon review­ing the source code as we use a Camel­Caps con­ven­tion for
    nam­ing our classes. There is rea­son for this mad­ness, we use
    autoload­ers (not nec­es­sar­ily the best idea for scal­a­bil­ity but it
    works for our pur­poses) as far as the Unix file sys­tem is con­cerned
    cap­i­tal­ize and low­er­case let­ters dif­fer sig­nif­i­cantly so call­ing
    Trans­fer­Ob­ject dif­fers from trans­fer­Ob­ject, trans­fer­ob­ject, and
    Trans­fer­ob­ject. A stan­dard will make sure that you can look at
    vari­able, class, inter­face, func­tion, etc names and see where an
    error might be.

  • While we are on Vari­able, Class,
    Func­tion, etc names, make sure your name, at least loosely,
    describes what it con­tains. I’ve seen count­less ref­er­ences in a
    past cowork­ers code that called vari­ables as $array, $arrayer,
    $arrayor, $a, $b, $c etc. The rea­son why you want to describe what
    you’re con­tain­ing should be obvi­ous but let me expand just a bit.
    Let’s say that I’m inter­act­ing with a func­tion that was pre­de­fined
    and I’m try­ing to under­stand what input it takes

    <?php
       function someFunction($a,$b) { … }
    ?>

    is not going to yield as
    much infor­ma­tion to me as perhaps

    <?php
      function someFunction($startDate, $endDate) { … }
    ?>

    or

    <?php
      function someFunction(Date $startDate, Date $endDate) { … }
    ?>

    would
    be. In the first func­tion all I can see is that this func­tion takes
    two argu­ments, much read­ing and digest­ing will occur now (pro­vided
    the writer didn’t strongly com­ment his func­tion which even then I
    have to go read the DocBlocks.) The sec­ond exam­ple is a bit more
    intu­itive, now I know I’m work­ing with a date range that starts with
    the first argu­ment and ends with the sec­ond; how­ever, I don’t know
    that I’m deal­ing with an Object (in this case PEAR Date) which is
    what exam­ple 3 con­veys, it type hints the argu­ments and as such also
    gives us a bit of error handling.

  • Struc­ture your direc­to­ries, I’m
    sure this one will bring ques­tions of “How do direc­to­ries inter­act
    with code read­abil­ity” well that’s an easy one to answer, if you
    sep­a­rate out your View, Mod­els, Con­trollers, Base Libraries,
    Tem­plates, etc it will be a lot eas­ier to find the spe­cific thing
    you’re look­ing for. MVC, MVP, and N-Tier does make this a lot eas­ier
    as it gives us bet­ter spec­i­fi­ca­tions of how to lay out the
    direc­to­ries and how to effec­tively use them.

  • Struc­ture your source, for each
    level of exe­cu­tion you should indent a uni­form amount of char­ac­ters
    (per­son­ally I go with a tab for every level.) It can be very painful
    to try to read

    <?php
    if (somecondition) {
    if (someothercondition)
    if (lastcondition) {
    ...
    } else {
    ...
    }
    } elseif (someothercondition2) {
    } else {
    ...
    }
    } else {
    }
    ?>

    rather than

    <?php
    if (somecondition) {
        if (someothercondition) {
            if (lastcondition) {
                ...
            } else {
                ...
            }
        } elseif (someothercondition2) {
            ...
        } else {
            ...
        }
    } else {
    ...
    }
    ?>

    inci­den­tally there is a error in the first exe­cu­tion block, see if you can find it.

  • While on the sub­ject on
    con­di­tional state­ments, I’ve seen Error Check­ing and Input
    Val­i­da­tion as a pet peeve for a long time. There are two major
    things I’ve noticed, most devel­op­ers I’ve worked with either
    com­pletely skip Input Val­i­da­tion and Error Check­ing or write these
    ugly and long conditonals:

    <?php
    /**
     * Recursive version of in_array
     *
     * @param string $needle
     * @param array $haystack
     * @param bool $strict
     * @return bool
     */
    function in_rarray($needle, $haystack, $strict = false) {
        //check to ensure that passed haystack is an array for recursion
        if (is_array ( $haystack )) {
            $array = new RecursiveIteratorIterator ( new RecursiveArrayIterator ( $haystack ) );
            foreach ( $array as $element ) {
                //if strict mode enabled use the strict comparison operate
                if ($strict == true) {
                    if ($element === $needle) {
                        return true;
                    }
                } else {
                    if ($element == $needle) {
                        return true;
                    }
                }
            }
            return false;
        } else {
            throw new exception ( 'Haystack is not an array' );
        }
    }
    ?>

    Where it could be writ­ten as:

    <?php
    /**
     * Recursive version of in_array
     *
     * @param string $needle
     * @param array $haystack
     * @param bool $strict
     * @return bool
     */
    function in_rarray($needle, $haystack, $strict = false) {
        //check to ensure that passed haystack is an array for recursion
        if (!is_array($haystack)) { throw new exception('Haystack is not an array'); }
        $array = new RecursiveIteratorIterator ( new RecursiveArrayIterator ( $haystack ) );
        foreach ( $array as $element ) {
            //if strict mode enabled use the strict comparison operate
            if ($strict == true) {
                if ($element === $needle) {
                    return true;
                }
            } else {
                if ($element == $needle) {
                    return true;
                }
            }
        }
        return false;
    }
    ?>

    The
    les­son to be learned here, write your flow of exe­cu­tion in such a
    way that it stops right away dur­ing a check should it fail rather
    than bas­ing exe­cu­tion on a con­di­tion and adding an unnec­es­sary level
    of complexity.

That con­cludes part 2 of my
mini-series, in part 3 I will be dis­cussing how to build decent user
require­ments and API Doc­u­men­ta­tion. I hope every­one is hav­ing as
much fun read­ing this as I am writ­ing it. I hope to do bet­ter with
com­ments in the near future and will be look­ing for a way to han­dle
spam block­ing with­out hav­ing to val­i­date every­ones comments.

From → PHP

2 Comments
  1. Regard­ing the “return early” mantra, which I’ve seen a lot in PEAR feed­back lately, I’m sort of torn…

    Granted, I’m a bit older-school than many, so the “only have ONE return point from a func­tion” is some­thing I have to con­sciously con­sider. On the other hand, I can also see how “return­ing early” based on con­di­tions not being met serves some code effi­ciency and poten­tially readability.

    To me, though, the read­abil­ity sav­ings depend on how com­plex the method itself is. In the case of a long method, (one that I’d prob­a­bly say is “too long” any­way), I’d more likely choose to keep the nested con­di­tion­als, sim­ply because I can see the direct cor­re­la­tion between the logic that needs that con­di­tion to be met, rather than check­ing the con­di­tion *all-by-itself* ear­lier in the method and return­ing with­out fur­ther expla­na­tion. By keep­ing the return-due-to-unmet-condition paired with the logic-for-the-met-condition, I keep the “why” for both pieces more vis­i­bly cou­pled. Con­sider an exam­ple method where you could put five return-due-to-unmet-condition oper­a­tions right at the begin­ning, where its logic-for-the-met-condition peers are scat­tered through­out the remain­ing 200 lines of the method. (now my head hurts from imag­in­ing a 200-line method)

    Now, in a prop­erly short method, read­abil­ity of this cou­pled condition-met/condition-notmet might not suf­fer much from decou­pling it into a stand­alone return-due-to-unmet-condition. In this instance, I’d say per­sonal pref­er­ence between the two schools of thought can be seen as effec­tively equal.

  2. In regards to your com­ment Chuck,
    the func­tion I used here was set to serve as an exam­ple and not really a work­ing one, I used a func­tion I had writ­ten ear­lier and added error check­ing to it. The thing about it is though, beyond tak­ing out mul­ti­ple lev­els of nest­ing and push­ing error check­ing to the front you can eas­ily see what con­di­tions a method has to meet in order to exe­cute eas­ily which is why I believe in it. I would rec­om­mend using it in a short method for no other rea­son than to get into the prac­tice of it, because more often than not things are not short and not clean.

Comments are closed.