Wednesday 23 May 2012

Phillips Scale of Code Quality



I created this scale of code quality as a bit of a joke many years ago. It is a generalization so not always applicable to the real world.  But you may find it useful for something.


Note that it is probably not a good idea to start comparing the code of different projects or different programmers using this scale.  Nevertheless I think it is useful for getting an idea of how the software you work with compares with others.  I have been working with C (as well as C++ and C#) for almost 30 years and the code I have worked with has varied from about a 1.5 to a 3.5.

At what level is your code on the scale?

0. Really Bad

I have only ever seen this sort of code written in BASIC and Fortran.  It is characterised by lots of goto statements (ie, spaghetti code). It can be impossible to understand even a small routine of 20 lines without many hours (or days) of study.


This sort of code can be correct (ie behaves according to specifications), so some people may not rate it too badly.  However I rate it really bad as first and foremost the code must be understandable, if anyone is to have a chance of modifying it.

1. Bad

The code shows that whoever wrote it does not understand the language and/or simple low-level design.  There are no comments.  Very basic things like correct indentation may not be followed.


At the high-level there has been little or no thought put into decomposing the problem.  Modules (if there are any) are tightly coupled.


This sort of software is characterised by many problems with reliability and maintainability.  For example, there will be run-time errors and crashes, "unreproduceable" bugs etc.  It is usually impossible to enhance the code without rewriting large sections.

2. Poor


Generally C code is better than this.  This is basically the standard of most university students after an introductory programming course.


At the low-level the code may be verbose, inefficient but understandable.  There are probably problems with reliability, maintainability, portability etc.  Cutting and pasting has resulted in duplicated code (which is bad for maintainability and other reasons).  There may be redundant and unreachable code which makes the code difficult to understand.


Some effort has been put into the design and separation into modules.  However, the choice of module boundaries is poor and the communication between modules is not well understood.  For old code the initial design will have been compromised over time.  Decoupling has been compromised with hidden interactions (such as use of global variables).


There are still some problems with reliability such as occasional crashes and inexplicable behaviour.


The software is difficult to maintain.  For example, minor changes may introduce unexpected bugs.  It is also very hard to predict how long even minor enhancements or bug fixes will take to implement since seemingly simple changes may require major redesign.

3. Average


Initially the overall design was good but the documentation on the design is lacking - either it is nonexistent, out of date, or hard to understand.


The code is probably fairly reliable, understandable, portable, reuseable etc, and correct in most cases.  At a high level the design is hard to understand and the system is difficult to maintain.


Generally there are some coding standards so that the code is consistent and understandable.  There would be comments describing each source file, global function etc (as prescribed by the coding standard).  At the low-level some variable declarations are commented and there may be comments throughout the code though they will often just reiterate what the code is doing rather than provide useful information.


Small code changes can be made easily as the low-level design is straightforward.  Bugs introduced by code modifications are usually caught by testing (as the code is readily testable) and assertions.


It is hard to make significant changes as the overall behaviour is difficult to comprehend.  This leads to code changes that are not done properly for expedience (ie fear of breaking something unexpectedly).  This leads to further brittleness of the system.  After many major enhancements the system will become as hard to maintain as the abovementioned "Poor" system.


4. Good

The overall design has decomposed the system into well-defined modules.  The interfaces between modules are simple and well understood.


The best coding practices are followed.  There are useful comments - for example, all modules and their interfaces are described, the purpose of all functions are commented, including parameters and return value, and difficult to understand code or algorithms are fully described.


The code uses many assertions which makes code changes less likely to introduce bugs.  Moreover, assertions help to document exactly what the code is doing by explicitly declaring assumptions.


This sort of project has little difficulty coping with small changes or even anticipated major enhancements as long as the standards are maintained.  There will probably be problems with significant unanticipated changes, which may contort the design causing subsequent problems.

5. Excellent

Finally, what I consider the ultimate level is the "Good" level but with the addition of automated regression testing of all modules (which also obviates the need for assertions).  This form of testing is nowadays colloquially known as "unit testing".


Unit tests should be performed whenever any code is changed to ensure any existing behaviour has not been broken.  Units tests should have complete code coverage (a coverage analysis tool can ensure this as well as detecting unreachable code), exercise all boundary conditions, and all code paths (if possible) including all error conditions.


Unit tests allow the code to be completely refactored without breaking existing behavior.  Hence major changes to the system (though taking some time) can be handled without compromising the existing behavior.


Wednesday 16 May 2012

Defensive Programming

I talked a bit about defensive programming in my February Post.  I also intended to add it as one of my fallacies in my article at Code Project, so I did a bit of research on what others have written about it.  I was very surprised at some inaccuracies I found, especially in the Wikipedia article.

Note that this post is not about how to use defensive programming.  That has been covered in depth (for example in Code Complete  by Steve McConnell).  I am just attempting to more accurately describe what it is, why it was invented, and problems with it.

What It Is?

Defensive programming or defensive coding is a style of writing computer software that attempts to be more resilient in the event of unexpected behavior.  This unexpected behavior is generally considered to be a result of existing bugs in the software but could be due to other problems such as corrupted data, hardware failures, or even bugs introduced by later software changes.  Generally, the code tries do do the most sensible thing with little or no performance penalty and without adding new error-conditions.

History

The first time I ever encountered the term "defensive programming" was in K&R (The C Programming Language, 1st Edition, by Kernighan and Ritchie).  After extensive searching I can find no earlier references to the term.  The term probably derives from the term "defensive driving" which came into common use in the early 1970's a few years before K&R was written.

It is mentioned twice in the index of K&R.  On page 53 it clearly refers to making code resilient to bugs, but on page 56 it talks about writing code in a way that reduces the likelihood of future code changes introducing bugs.  In any case many books since have used the term "defensive programming" to mean making the code resilient in the presence of bugs, for example The Pragmatic Programmer by Andrew Hunt and Dave Thomas (which talks about "coding defensively" in the chapter entitled "Pragmatic Paranoia"), and others.  Even before that many software professionals, myself included, have used the term in this way since at least the mid-1980's.

Disagreement About Definition

Despite the term being fairly clearly understood for more than 20 years the exact definition of the term has recently become blurred after several (generally non-peer-reviewed) articles and blogs have appeared on various web sites.  For example, the current Wikipedia article, and several sites that quote it, makes "defensive programming" sound like an approach to error-handling.   Error-handling can be related to defensive programming but they are definitely not the same thing; and one is not a subset of the other (see below).

Another, well-regarded and often referenced article, entitled simply Defensive Programming has a very high ranking on Code Project.  This is an excellent and worthy article in its own right, but it is not just about defensive programming.  By its own admission it is about "... techniques useful in catching programming errors ...".  As we will see below defensive programming has the opposite effect - it tends to hide errors not catch them.  This article discusses many things and should be more accurately called something like "Good Coding Practices".

Error Handling vs Defensive Programming

The distinction between error-handling and defensive programming is not very clear in the minds of many programmers. I will explain the difference.

Error-handling detects and handles situations where something goes wrong that you know is possible, however unlikely.  In contrast, defensive programming attempt to cater for problems that are assumed to be "impossible".  There are two problems with this distinction that can cause confusion.

The first problem is that it can depend on circumstances whether something is impossible or not.  For example, if a function is private to a module or program you may be able to ensure that it is always passed valid arguments; but if it is part of a public library you cannot be certain that it will never be passed bad data.  In the first case you can program defensively to ensure that the function does something sensible even though you know it is "impossible" that this will happen.  In the latter case you might add error-handling in case bad data is passed to the function.

So whether you choose to program defensively or add explicit error-handling depends on the scope of the software that you control.  I discuss this further below under Scope.

The second problem is that there can be borderline cases where it is debatable whether something should be considered impossible.  Consider this spectrum of scenarios for a hypothetical program that can be given invalid data:

  1. The program accepts data directly from the user and the user may enter invalid data.
  2. The programs accepts data from a text file that has been typed in by a person.
  3. The program accepts data from an XML file (machine or manually generated).
  4. The program reads a binary data file which was created by another program.
  5. The program reads a binary file that was written by itself.
  6. The program reads a binary file that  includes a CRC to check it has not been corrupted.
  7. The program reads a temporary binary file that it only created moments before.
  8. The program reads a memory mapped file that it created.
  9. The program reads from a local variable (ie, in memory) that it just wrote to.

At what point does invalid data become "impossible"?  Personally, I would say that it is "impossible" that a data file has become corrupted and still generates the same CRC (see scenario 6).  However, if security of the data is important you have to consider that the file was deliberately tampered with (in which case a cryptographic checksum such as SHA1 should be used).

However, I know that a lot of software assumes that binary data files are always valid (scenario 4 or 5).  Much software will behave erratically if binary data files have become corrupted.

Of course, I think anyone would agree that you have to assume that the value of a local variable you just wrote to (see scenario 9) cannot change.  However, even in that case a hardware error, deliberate tampering, or some other problem could change memory unexpectedly.

So it is not always clear when you need to have explicit error-handling code and when you should simply program defensively.

Example

The archetypal example of defensive programming occurs in just about every C program ever written, where the terminating condition is written as a test for inequality ( < ) rather than a test for non-equality ( != ).  For example, a typical loop is written like this:

  size_t len = strlen(str);
  for (i = 0; i < len; ++i)
      result += evaluate(str[i]);

rather than this:

  size_t len = strlen(str);
  for (i = 0; i != len; ++i)
      result += evaluate(str[i]);

Clearly both of these should do exactly the same thing since the variable 'i' is only ever incremented and can never skip having the same value as 'len'.  So then why are loop termination conditions always written in the first manner?

First, the consequences of the "impossible" condition are bad, probably resulting in all sorts of undesirable consequences in production software, such as an infinite loop or a memory access violation.  The "impossible" condition may occur for any number of reasons such as:
  • bad hardware or a stray gamma ray photon means that one of the bits of 'i' is flipped randomly
  • another errant process (in a system without hardware memory protection) or thread changes memory that does not belong to it
  • bad supervisor level code (ie, the operating system or a device driver) changes memory
  • the 'evaluate' function has a rogue pointer that changes the value of 'i'
  • the 'evaluate' function corrupts the stack frame pointer and the location of 'i' is now at some random place on the stack
  • later code changes introduce bugs, for example:
      for (i = 0; i != len; ++i)
      {
          while (!isprint(str[i]))           // bad code change means that 'i' may never be equal to 'len'
              ++i;
          result += evaluate(str[i]);
      }

Of course, the last few, caused by bugs in the software, are the most common, which is why defensive programming is usually associated with protecting against bugs.

Culture of C

There are also two other aspects of the C language that affect how and when defensive programming is used - namely the emphasis on efficiency and the approach to error handling.

Looking at efficiency first -- it is one of the fundamental premises of C that it assumes the programmer knows what they are doing.  The language does not protect from possible mistakes, as other languages try to do.  For example, it is easy to write past the end of an array in C - but if all array access had bounds checking applied (by the compiler) then it would run more slowly even for perfectly safe code.

Due to this emphasis on efficiency, defensive programming is only used when it has little or no performance penalty.  This is typified in the above example since a "less than" operation is normally just as fast as a "not equal" one.

The other aspect is the approach to error-handling in C.  Errors in C are generally handled by using error return values.  It is not unusual for C code to be dominated by error-handling, so error-conditions are ignored if they are considered unlikely to occur - eg, nobody ever checks the error return value from printf().  (In fact, error return values are often ignored when they should not be, but that is for another discussion.)

So, if "unlikely" errors are not generally handled it makes no sense for "impossible" conditions to be handled as errors since this would add to the existing error handling burden.  (This is covered in more detail in item 10 of my Code Project article at http://www.codeproject.com/Articles/357065/Ten-Fallacies-of-Good-C-Code.)  Of course, in languages with exception handling, many such "impossible" conditions can be easily handled by throwing a "software exception".

Scope

A lot of the confusion about defensive programming comes about because the scope of control is not always clearly defined.  For example, if you have a function that takes a string (const char *) parameter you may want to assume that you are never passed a NULL pointer if it never makes sense to do so.  If it is a private function you may be able to always ensure it; but if it's use is outside the scope of your control then you can't assume that unless you clearly document that a NULL pointer may not be used.

In any case even if you consider the condition to be impossible it is wise to allow for the possibility using defensive programming.  Many functions do this by simply returning if unexpectedly passed a NULL pointer.  (Again, note that this is different to error-handling since no error value is generated.)

So any discussion of defensive programming must clearly define the scope of the code being considered.  This is one problem with the Wikipedia article on defensive programming.

Symptoms

When using buggy software the symptoms of defensive programming are seen often (but may be dismissed as operator error).  I think everyone has at some time seen software that did something a little strange, like flash a window, ignore a command, or even display a message about an "unknown error".  Usually this is caused by a bug which caused a problem from which the software attempted to recover.

This recovery can sometimes be successful but usually results in the program limping along.  In the worst case it can silently cause massive problems like data loss or corruption.  (After seeing something like this, I generally save my data and restart the software to ensure it is not in some weird state.)

Problems with Defensive Programming

By now it must be pretty clear that defensive programming has a major problem.  It hides the presence of bugs.

Some people may think it is good to hide bugs.  Certainly, for released software in use, you don't want to force the user to deal with a problem that they do not understand.  On the other hand blindly continuing when something be broken can be dangerous.  Also, some attempt should be made to notify someone of the problem - at least write an error message to a log file.

What is worse, though, is that defensive coding has been known to hide bugs during development and testing.  Nobody can argue that this is a good thing.  The alternative is to use what has been called "offensive programming" and sometimes "fail fast".  This means to make sure someone knows about problems rather than hiding them.

I do use defensive programming so that unexpected or impossible situations are handled in release builds; but add assertions that check for the impossible situations so that bugs do not sneak through.  I also do most testing use the debug build (so that assertions are used), except for final acceptance testing.  For some critical things I also explicitly add error-handling code, since assertions are removed in release builds.

Standard C Library

Here are two more examples of how defensive programming is used, taken from the standard C library.
A nasty problem that occurs in far too many C programs is caused by buffer overruns.  This mostly happens when copying or building a string and the size of the the output buffer is exceeded.  In the name of defensive programming it is recommended to use string functions that take a buffer length (strncpy(), strncat(), snprintf(), etc).  This avoids the buffer overrun, but hides the (possible) problem that the string was truncated.

Reports often require data nicely formatted into columns.  This is usually achieved in C using the minimum field width of printf() format-specifiers.  For example,  to print numbers in a column five characters wide you would use the "%5d" specifier.  If the integer is too big for the field then C just prints the extra characters anyway even though this will ruin your columns.  (Contrast this with other languages like Fortran where field overflow results in silent truncation of numbers, which has caused some very nasty problems.)  This is an example of defensive programming since when presented with an unexpected situation the code tries to do something sensible.

Exercise

Finally, here is something for you to think about.  The standard C library includes a function that takes a string of digits and returns an integer called atoi.
If you are not familiar with atoi(), it does not return any error code but stops when it encounters the first unexpected character.  As an example atoi("two") just returns zero.

Is the behavior of atoi() an example of defensive programming? Why?

How could it be improved?