06.19.09

Robustness

Posted in General development at 12:39 am by Kyoryu

 

Robustness is one of those things that we can chase forever.  Many developers think that “robustness” means never crashing.  A more experienced developer will realize that there are many, many things worse than crashing.  Continuing to run while in an invalid state is one of them: it opens up the possibility of corrupted data, which is a far, far worse problem than a simple crash.
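
To make that concrete, here is a minimal Python sketch of preferring a crash over writing bad data.  The `Ledger` class, its invariant, and `CorruptStateError` are all made up for illustration; the only point is that the state gets checked before anything is persisted.

```python
class CorruptStateError(RuntimeError):
    """Raised when in-memory state no longer satisfies its invariant."""


class Ledger:
    """Hypothetical example: a running balance that must equal the sum of its entries."""

    def __init__(self):
        self.entries = []
        self.balance = 0

    def add(self, amount):
        self.entries.append(amount)
        self.balance += amount

    def check_invariant(self):
        # If the two disagree, something has already gone wrong.
        # Stopping here is better than continuing with bad numbers.
        if sum(self.entries) != self.balance:
            raise CorruptStateError(
                f"balance {self.balance} != sum of entries {sum(self.entries)}"
            )

    def save(self, write):
        # Check before persisting, so an invalid state never reaches storage.
        self.check_invariant()
        write(f"{self.balance}\n")
```

If `balance` ever drifts away from the entries, `save` raises instead of calling `write`, so the last good copy of the data is left alone.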

Even past that, we have to look at the error conditions that can occur, what compensating actions we can take, and what the impact on the user is.

There seem to be a few general levels of robustness in applications.

  1. In cases where no system failure occurs, and all input data is correct, the system should work.  This is the basic level of correctness.  Now, the catch here is knowing what the system should do for any set of valid input…
  2. User input should be appropriately validated and sanitized to prevent failure.  Sometimes you can’t recover gracefully from bad input, and the only thing you can do is throw an exception or return an error code.  That’s fine.
  3. The program should continue to work in case of reasonable system failures – a file being open unexpectedly, a remote system not being available.
  4. The program recovers in the case of extreme failures – out of memory, full hard drive, hard drive unexpectedly removed.  In many cases, catching these failures may not be worth the effort.  It is unlikely that you can do any reasonable recovery, so do minimal recovery to try not to corrupt any data, and then get out.  If you don’t know that you can even do minimal recovery, just fail and hope for the best.
  5. In the case of users undermining the system by deleting files you require, I don’t know that it’s even worth bothering.  If something you require is gone, you’re broken.  Don’t even try to run; exit as quickly as you possibly can to prevent data loss in the future.  This scenario is no different from somebody deliberately deleting files from the Windows directory.
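
As a rough illustration of levels 2, 3, and 5, here is a small Python sketch.  Every name in it (`parse_quantity`, `read_with_retry`, `require_files`, the 1 to 1000 range, the retry count) is an assumption invented for the example, not something from a real application.

```python
import os
import time


class InvalidInput(ValueError):
    """Raised when user input fails validation (level 2)."""


def parse_quantity(text):
    # Level 2: validate user input; if it is bad, report it with an exception.
    try:
        value = int(text.strip())
    except ValueError:
        raise InvalidInput(f"not a whole number: {text!r}") from None
    if not 1 <= value <= 1000:  # range chosen arbitrarily for the example
        raise InvalidInput(f"quantity out of range: {value}")
    return value


def read_with_retry(path, attempts=3, delay=0.1):
    # Level 3: a reasonable system failure (say, a file briefly locked by
    # another process) gets a bounded number of retries. Assumes attempts >= 1.
    last_error = None
    for attempt in range(attempts):
        try:
            with open(path, "r", encoding="utf-8") as f:
                return f.read()
        except FileNotFoundError:
            raise  # a missing file is not transient, so retrying is pointless
        except OSError as exc:
            last_error = exc
            if attempt < attempts - 1:
                time.sleep(delay)
    raise last_error


def require_files(paths):
    # Level 5: if something required is gone, refuse to run at all.
    missing = [p for p in paths if not os.path.exists(p)]
    if missing:
        raise SystemExit(f"required files missing: {missing}")
```

Calling `require_files` once at startup, before touching any user data, keeps the "you're broken, get out" decision in one place.  Level 4 is left out on purpose: as argued above, it is often not worth much code.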

And that’s my view.  I’m sure some will disagree, but that’s fine.  There is little value in attempting to recover from an unrecoverable scenario that is unlikely to ever happen in reality and that, if it does happen, will almost certainly be accompanied by other failures.  That time is likely better spent on other things that will have a higher value to your consumers.
