06.19.09

Robustness

Posted in General development at 12:39 am by Kyoryu

 

Robustness is one of those things that we can chase forever.  Many developers think that “robustness” means never crashing.  A more experienced developer will realize that there are many, many things worse than crashing.  Continuing to run while in an invalid state is a much worse option, as it opens up the possibility of corrupted data – a far, far worse problem than a simple crash.

Even past that, we have to look at error conditions that can occur, what compensating actions we can take, and what the impact to the user is.

There seem to be a few general levels of robustness in applications.

  1. In cases where no system failure occurs, and all input data is correct, the system should work.  This is the basic level of correctness.  Now, the catch here is knowing what the system should do for any set of valid input…
  2. User input should be appropriately validated and sanitized to prevent failure.  Again, sometimes you can’t just nicely recover, and the only thing you can do is throw an exception or other error code.  That’s fine.
  3. The program should continue to work in case of reasonable system failures – a file being open unexpectedly, a remote system not being available.
  4. The program recovers in the case of extreme failures – out of memory, full hard drive, hard drive unexpectedly removed.  In many cases, catching these failures may not be worth the effort.  It is unlikely that you can do any reasonable recovery, so do the minimal recovery needed to avoid corrupting data, and then get out.  If you don’t know that you can even do minimal recovery, just fail and hope for the best.
  5. In the case of users undermining the system by deleting files you require, I don’t know that it’s even worth bothering.  If something you require is gone, you’re broken.  Don’t even try to run, exit as quickly as you possibly can to prevent data loss in the future.  This scenario is no different than somebody deliberately deleting files from the Windows directory.

And, that’s my view.  I’m sure some will disagree, but that’s fine.  Attempting to recover from an unrecoverable scenario that is unlikely to ever happen in reality and that, if it does happen, will almost certainly be accompanied by other failures, has little value.  It is likely that the time could be spent on other things that will have a higher value to your consumers.

06.12.09

How difficult is it to change your code?

Posted in Uncategorized at 2:43 pm by Kyoryu

Change happens.  We’re not perfect.  We don’t know everything.  We invariably learn something about the project we’re doing while we’re doing it.

So, code that can easily be changed is better than code that can’t be changed.  But how do we know how easy it is to change code?

Difficulty of changing code is best measured at the class/interface level.

If you have a lot of classes that are each easily changed, you will be able to change your code easily.  If you have a few classes that are each difficult to change, it will be difficult to change your code.  Even if the total work done is the same.

The primary measure of difficulty of changing code is the number of consumers it has.

The more things that know about your code, the harder it is to change.  This is a reason why the universally-loved concept of “the one place that does all X” almost always fails.

It is easier to change an interface that has one consumer than one with three consumers.  It is easier to change an interface with three consumers than one with twenty consumers.  And if you have one hundred consumers, forget about it.

By consumers, I do not necessarily mean applications or individuals; I mean the classes that refer to a given class.

It’s easier to change internal code than public-facing code.

This is a restatement of the first point.  Published, public-facing code has an effectively unbounded, and unknowable, number of consumers, making it nearly impossible to change.

The more implementation details you expose, the harder it is to change your code.

Public-facing code should, as much as possible, not leak implementation details.  It should reflect the user-facing concepts that are being exposed, and not the implementation details of those concepts.

Not exposing raw types is a good way to do this as well.  If you have a user ID that’s an int right now, it may be somewhat painful to change it to a long later.  However, if you wrap the int in a UserID class, changing it to use a long or even a GUID will become much, much easier.
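As a rough sketch of the wrapping idea, here is what such a class could look like in Python.  The `parse` helper and the `greeting` consumer are made up for illustration; the point is only that consumers hold a `UserID` and never touch the raw value, so the representation can change in one place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserID:
    """Opaque user identifier; consumers never look at the raw value."""

    # Assumption for this sketch: an int today. It could become a longer
    # integer or a GUID string later without touching any consumer.
    _value: int

    @classmethod
    def parse(cls, text: str) -> "UserID":
        # The only place that knows how an ID is encoded as text.
        return cls(int(text))

    def __str__(self) -> str:
        return str(self._value)


def greeting(user: UserID) -> str:
    # Hypothetical consumer: depends on UserID, not on int.
    return f"Hello, user {user}"
```

Equality and hashing come from the dataclass, so a `UserID` can still be compared or used as a dictionary key without exposing what is inside it.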

The more scenarios you support, the harder it is to change

If you support a large number of scenarios, it is almost inevitable that assumptions needed for one will spill over into others.

The more well-defined your code is, the easier it is to change

If your code has well-defined inputs and outputs, and doesn’t have side effects, it is much easier to change.  While the public entry points may remain difficult to change if they have many consumers, any internal details can be changed arbitrarily, and the correctness of the final code can be verified.

On the other hand, if the behavior of the code is not well-defined, then changing it can become extremely difficult, as consumers may be relying upon existing behavior that is either in error, undocumented (and so likely to change if you touch the implementation), or simply a side effect of the “real” work being done.

06.09.09

Checked Exceptions

Posted in Uncategorized at 11:57 pm by Kyoryu

Interview with Anders Hejlsberg

This seems to be somewhat of a controversial subject.

On the one hand, we have Java, which forces exceptions to be caught and potentially rethrown.  This is, certainly, something of a pain.

On the other hand, C# doesn’t require anything, and any method can potentially throw any kind of exception.

I can see the points on both sides.  Nothing is uglier than a bunch of arbitrary try/catch statements in code that do nothing more than rethrow exceptions.  And just blindly swallowing exceptions is even worse.

On the other hand, not really knowing what a method might throw in C# can be really, really annoying at times.

“Let’s start with versioning, because the issues are pretty easy to see there. Let’s say I create a method foo that declares it throws exceptions A, B, and C. In version two of foo, I want to add a bunch of features, and now foo might throw exception D. It is a breaking change for me to add D to the throws clause of that method, because existing caller of that method will almost certainly not handle that exception.”

Well, that’s certainly reasonable.  But, I have to wonder if it’s the right answer?  If you add a bunch of functionality to a class, is it perhaps better to make some new ReallySpiffyFoo class that contains the new functionality, and leave the existing class as it is?

Then again, I’m not a huge fan of growing classes over time – in most cases, I believe you’re better off leaving a well-defined class as-is except for bug fixes, and putting new functionality into a new class (which might internally use the old one).

“Now, each time you walk up the ladder of aggregation, you have this exponential hierarchy below you of exceptions you have to deal with. You end up having to declare 40 exceptions that you might throw. And once you aggregate that with another subsystem you’ve got 80 exceptions in your throws clause. It just balloons out of control.”

Another reasonable point.  However, I’d tend to believe that in a case like that, you’ve got a bigger design issue at play.  Why in the world would a business object throw a FileNotFoundException or the like?  At most, it should throw something like a CouldNotLoadDataException.  The fact that the data was to be loaded from a file is completely irrelevant at that level.
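One way to picture this is exception translation at the layer boundary.  The sketch below is in Python rather than Java or C#, and `load_customer_name` is a made-up business-level function; it assumes the data happens to live in a file, and it hides that fact from callers.

```python
class CouldNotLoadDataException(Exception):
    """Raised by the business layer when its data cannot be loaded."""


def load_customer_name(path: str) -> str:
    # Hypothetical loader: today the storage is a file, but callers
    # should not need to know or care about that.
    try:
        with open(path, encoding="utf-8") as f:
            return f.readline().strip()
    except OSError as err:
        # Translate the storage-level failure (file not found, permission
        # denied, ...) into the one exception this layer declares.
        # The original error stays attached as __cause__ for debugging.
        raise CouldNotLoadDataException("customer data unavailable") from err
```

With this shape, each layer declares a handful of exceptions in its own vocabulary, so the list does not grow as subsystems are aggregated.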

I also suspect that Anders is looking at this mostly from the viewpoint of a language and framework developer.  As a framework developer, he expects code he writes to be called by other people, and they can certainly look up what exceptions are being thrown.  That’s reasonable.

However, if I’m using an interface as an extensibility point, it’s a slightly different story.  Now I’m importing someone else’s code into my application, and I have no idea of what it might throw when I call it.  If that doesn’t sound scary, I don’t know what would.  At this point my options are either to let my app crash when I make any arbitrary call, or to catch Exception directly.  Neither of those is, in my mind, a really good solution.

What I’d like to see is defined exceptions, but not necessarily checked exceptions.  I’d like to know what exceptions a method may throw, but I don’t want to necessarily be forced to catch them.  In my mind, the exceptions you throw are effectively part of your API, especially when looked at from role-based interfaces for extensibility rather than header-style interfaces.

If I define an operation in an interface, I’m basically saying that I expect to be able to make this call, with certain parameters, and get a certain type of result back.  As part of that, saying that I expect to throw (or will throw) certain exceptions is part of the definition of my API.

What I don’t see much value in is checked exceptions as in Java.  To me, there is absolutely no value in putting in boilerplate code to just rethrow exceptions that I’ve caught just to satisfy a compiler restriction.  But, knowing what exceptions can be thrown is, to me, extremely valuable.

05.14.09

“Big Design” Up Front vs. Big “Design Up Front”

Posted in General development at 1:14 pm by Kyoryu

Design is always a touchy subject.  There are those who believe everything should be designed out beforehand, and those who believe you should design as you go.

I’m pretty firmly in the latter camp, as I’ve never seen the former plan actually work.  But, there are a few caveats to that.

I don’t believe in the idea of designing every class, method, and interface before you start coding.  Many of those details will become obvious as part of coding, or improvements will be found, and having to add committees and approval processes to making changes (especially if they’re internal only) seems like a really bad idea.  When I talk about big design up front, this is generally what I’m referring to – Big “Design Up Front.”

On the other hand, you need to do some level of design up front.  I think the XP folks call this the system metaphor.  You need to know the big pieces in your design, and what the general flow of the data is.  If it’s a distributed application, who connects to who?

Specific technologies don’t need to be a part of this conversation.  If you know that process A will send data to process B when the data is ready, then how that takes place is mostly an implementation issue.  The important decisions are things like whether A connects to B or vice versa (especially if distributed), and whether A pushes data or B pulls it via polling.  Even in a single process, where are your component boundaries, and are they really boundaries?  What’s your threading model?

These are the kinds of decisions that you have to make early, as they shape the system as a whole.  These are the big pieces of design that need to be hammered out.  I’d call this “Big Design” Up Front, and I’m firmly in favor of it.

04.06.09

Letting go

Posted in General development at 10:12 pm by Kyoryu

One of the hardest things in development is learning to let go.  It’s something most developers fall victim to – you get some idea for a system that will simply fix ALL of the problems, walk the car, and wash the dog!

And then you find some use case that your system doesn’t quite cover.  So you fudge around the use case, or scope it out. And so on and so forth.  And you end up with some nasty piece of code that barely works, is horribly mangled to the point of unmaintainability, and that nobody wants to deal with, ever.

The problem here is letting go.  As developers, we are in the job of creating solutions for problems.  Any piece of code is a solution to a problem.  And most developers are pretty smart, and hate admitting that they’re wrong.

But sometimes we are wrong.  And when our use cases (the problem) start conflicting with our code (the solution), it should be the code that loses.  We should tailor our solutions to the problems we are presented, not the other way around.

When we’re wrong, we have to let go of our wrong solution, and learn to do it quickly and easily and without ego.  And that can be very hard.

01.29.09

Can you know this?

Posted in Uncategorized at 2:10 pm by Kyoryu

One of the questions I like to ask when designing software is simple – “can I know this?”

For instance, when dealing with data across a network, you might decide that you need to know that your local copy of the data is up-to-date before you reduce a particular value by 5.  So the question, in this case, is “can I know that my data is up to date?”

And the answer for that is, generally, no.  To do so requires implementing some sort of locking mechanism, and ask the database folks how easy that is.  Hint:  It’s easy in the trivial case, but quickly becomes difficult.  Another hint:  Two words – ‘deadlock’ and ‘livelock.’

As developers, we tend to believe that we can know everything about a system.  We tend to believe that every problem is, essentially, solvable.  We hate admitting that we can’t get the answer to something.

But sometimes, we just can’t.  Sometimes, the answer to something is dependent on so many other factors that are outside of our control, and that we can’t measure, that there is no way to answer the question with 100% accuracy.

When faced with problems like this, I try to follow up the initial question of “can we know this” with a few more questions:  “what do we know,” “what don’t we know,” “who knows this,” and “what is it we really want to know or do?” This will often suggest a better solution to the problem than one which requires unknowable information.

For instance, in our initial example, we don’t know if our data is up to date, because we don’t know if someone else has updated the data since our last refresh.  And we can’t know that, because it will take an amount of time for any updates to reach us – the best we can ever do is say that we know what the data looked like at some point in the past.

But, what we do know is different.  We do know that we want to decrease the value by 5.  And we know that in most cases, there’s an authoritative data source somewhere.  This suggests a solution – instead of us modifying the data locally, send a request to the source of the data not to set the value to a specific amount (what we believe the current value is, minus 5), but rather to decrease the amount by 5.  Because the data source should always have the current value, it will know what to do to decrease the current value by 5.

If we don’t ask these questions, we can easily start down the road of trying to know the unknowable – being so set that we’re going to have our local machine get the latest value and set the new value to that minus 5 that we do all sorts of crazy research into synchronization and locking mechanisms.  Generally, this results in madness.  Every solution that covers some percentage of cases leaves others broken, and you can end up chasing your tail trying to patch the corner cases, or dealing with issues that are only there in the first place because of how you’re dealing with the problem – for instance, locking issues like I discussed earlier.
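The difference between “set the value” and “decrease the value” can be sketched in a few lines of Python.  `AuthoritativeStore` is a hypothetical stand-in for whatever the real data source is; everything runs in one process here, purely to show the shape of the request.

```python
class AuthoritativeStore:
    """Hypothetical stand-in for the authoritative data source."""

    def __init__(self, value: int) -> None:
        self._value = value

    def read(self) -> int:
        return self._value

    def decrease(self, amount: int) -> int:
        # The source applies the change to its own current value, so the
        # caller never needs to know what that value is.
        self._value -= amount
        return self._value


store = AuthoritativeStore(100)

stale_copy = store.read()   # client A's local copy: 100
store.decrease(20)          # client B changes the value in the meantime

# Client A asks for "minus 5" instead of "set to stale_copy - 5".
store.decrease(5)
```

Had client A sent “set the value to 95” based on its stale copy, client B’s change would have been silently lost; the relative request gives the right answer without the client ever learning the current value.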

Occam’s Razor

Posted in General development at 1:45 pm by Kyoryu

While most people quote Occam’s Razor as “the simplest thing is most likely correct,” the actual quote is “do not multiply entities needlessly.”

I’m not sure that this is good programming advice.  I do, however, think it’s an accurate description of how most developers behave.  Typically, a developer will create as few discrete entities as possible.  They will use one class rather than two.  They will create a single large interface rather than multiple small interfaces.  They will create a single large function rather than break it down into multiple, smaller functions.

It doesn’t seem to be a matter of typing, or of saving characters.  The behavior seems to suggest that developers will prefer a single, very large method to two smaller methods, even if the two smaller methods add up to fewer lines of code.

This probably boils down to perceived overhead – creating an ‘x’ may be perceived as managerial type overhead, as opposed to “lines of code” which are real work.  If so, it would suggest that the less overhead that’s required to create an entity, the more likely it is that multiple entities will be created.

This is something to keep in mind when designing APIs, user experiences, languages, or other tools that you expect others to use.

01.13.09

These things are not the same

Posted in Uncategorized at 6:55 pm by Kyoryu

“I can do X with Y.”

“X is Y.”

That’s all.

12.03.08

Interfaces vs. Classes

Posted in Uncategorized at 8:01 pm by Kyoryu

As a follow-up to the last post, consider this:

Classes map to nouns.  They say what something is (a Person, for instance).

Interfaces map more closely to adjectives – they describe the noun (specifically, what the noun can do).

So, you wouldn’t have the class Person implement IPerson.  That’s basically saying that a person is person-y, which doesn’t make a lot of sense.

A person might implement IFeedable, IHirable, etc.

One thing that occurs to me as I write this post is that interfaces don’t describe what their nouns can do so much as they describe what can be done to the nouns.  This gets into the idea I have that most languages are very good at describing incoming messages to an object, but don’t do a very good job at describing outgoing messages.  But, that’s a topic for another day.

If this is true, then interfaces and classes serve very different purposes.  An interface shouldn’t be simply seen as the ultimate abstract base class.

If an interface is used to describe what can be done to a class (a Person might be IHirable), then in many cases, a class may implement multiple interfaces.  If that’s the case, then each interface should consist only of the actions that are atomic with that concept: actions that only make sense together, and don’t ever make sense without each other.

For instance, IHirable obviously needs some sort of Hire() method.  And, it might make sense for it to have a Fire() method.  It doesn’t make sense (legal reasons and internal politics aside) to be able to hire someone without being able to fire them.  And, similarly, you can’t fire someone that wasn’t able to be hired in the first place.  So, Fire() should go in IHirable, as it’s tightly coupled to Hire().

What about Pay()?  It doesn’t make sense that you can hire someone and not pay them, so it might make sense to put it in IHirable.  But, there are a lot of people that you pay that you don’t hire, and in fact, can’t hire.  The pizza delivery guy, the water company, etc.  So, maybe Pay just belongs in IPayable, which IHirable can extend.
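Here is a small Python sketch of that split, using `typing.Protocol` as a stand-in for C#-style interfaces.  The interface names come from the discussion above; `Person`, `WaterCompany`, and `settle` are illustrative only.

```python
from typing import Protocol


class IPayable(Protocol):
    def pay(self, amount: int) -> None: ...


class IHirable(IPayable, Protocol):
    # Hire and fire only make sense together; pay is inherited.
    def hire(self) -> None: ...
    def fire(self) -> None: ...


class Person:
    """Satisfies IHirable (and therefore IPayable)."""

    def __init__(self) -> None:
        self.employed = False
        self.received = 0

    def hire(self) -> None:
        self.employed = True

    def fire(self) -> None:
        self.employed = False

    def pay(self, amount: int) -> None:
        self.received += amount


class WaterCompany:
    """Satisfies IPayable only: you pay it, but you cannot hire it."""

    def __init__(self) -> None:
        self.received = 0

    def pay(self, amount: int) -> None:
        self.received += amount


def settle(bills: list[tuple[IPayable, int]]) -> None:
    # Needs only the "can be paid" aspect, so it accepts both kinds.
    for payee, amount in bills:
        payee.pay(amount)
```

Code that only pays things asks for IPayable and never learns about hiring, which is what keeps each interface small.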

Here’s a statement I’ll put forth, but I’m not backing 100% yet:

Having an interface with the same name as a class is a design smell.

 

10.10.08

Interfaces – necessary, but not sufficient

Posted in General development at 5:41 am by Kyoryu

Using an interface seems like one of those rules.  Everybody knows they should do it, because it makes your code more abstracted and… stuff.

However, if you have a gigantic class with a ton of methods, or methods that are very specific to its implementation, then simply providing an interface that mirrors the public methods of the class is of little value.  To successfully swap one implementation for another, you would need to understand the behavior of the first implementation so well that you could accurately mimic it – and, frankly, that’s not very likely unless you’re the one to write the first implementation anyway (and probably not too likely even then).  While you’ve avoided implementation coupling, you’ve got a kind of conceptual coupling in its place.  This is doubly true if the interface specifies that it returns objects that implement another interface (likely as thick as the first).

So, using interfaces as a kind of header doesn’t really help us too much in this case.  We’re still realistically tied to an implementation, and now we have interface versioning issues to deal with (which, for C# at least, are worse than class versioning issues, as adding a member to a class does not break backwards compatibility – but it does for an interface).

That doesn’t mean that I’m against interfaces.  In fact, I love interfaces.  I just think that there are better ways to use them than as sim-headers.

Interfaces should be used to define questions that, as the class you’re writing, you want to ask of your dependencies.  This is a bit of an inversion – typically, interfaces are defined from the POV of the class implementing them, not the class using them.  But, by controlling the interfaces you use, there’s less chance of them breaking and causing major headaches throughout your codebase.

Interfaces should also be as small as possible, and represent a single aspect of what you can do with an object.  IEnumerable<> is a great example – it only lists things that you need to do to enumerate a collection.  And because of that, it can be very stable.  The more things an interface does, the more likely it is to need to change, and the more code that will be broken when it does.

So, if you’re using interfaces in this way, how do you implement them?  Especially if the class you’re dealing with didn’t own the interface to begin with?  This isn’t too hard – write a small adapter class that implements the interface, and calls the underlying object in the appropriate way.  This has the added advantage of keeping all your dependent code in one spot, making it much, much easier to fix if the dependency ever changes underneath you (assuming that you don’t control it).
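As a minimal sketch of the adapter approach, in Python: the consuming class owns a tiny interface that asks one question, and an adapter maps it onto a dependency it does not control.  Every name here (`Clock`, `VendorTimeService`, `fetch_timestamp_ms`, `SessionTracker`) is invented for the example, and the vendor class returns a fixed value so the sketch is deterministic.

```python
from typing import Protocol


class Clock(Protocol):
    """The one question SessionTracker wants to ask of its dependency."""

    def seconds_since_epoch(self) -> float: ...


class VendorTimeService:
    """Hypothetical third-party class that we do not control."""

    def fetch_timestamp_ms(self) -> int:
        return 1_500  # fixed value, for illustration only


class VendorClockAdapter:
    """Implements our Clock interface on top of the vendor class."""

    def __init__(self, service: VendorTimeService) -> None:
        self._service = service

    def seconds_since_epoch(self) -> float:
        # All knowledge of the vendor API lives in this one spot.
        return self._service.fetch_timestamp_ms() / 1000


class SessionTracker:
    """Consumer: knows only about Clock, never about the vendor."""

    def __init__(self, clock: Clock, started_at: float) -> None:
        self._clock = clock
        self._started_at = started_at

    def elapsed(self) -> float:
        return self._clock.seconds_since_epoch() - self._started_at
```

If the vendor renames a method or switches units, only `VendorClockAdapter` needs to change; `SessionTracker` and the `Clock` interface stay as they are.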
