01.29.09
Can you know this?
One of the questions I like to ask when designing software is simple – “can I know this?”
For instance, when dealing with data across a network, you might decide that you need to know that your local copy of the data is up-to-date before you reduce a particular value by 5. So the question, in this case, is “can I know that my data is up to date?”
And the answer to that is, generally, no. Knowing it would require some sort of locking mechanism, and you can ask the database folks how easy that is. Hint: It’s easy in the trivial case, but quickly becomes difficult. Another hint: Two words – ‘deadlock’ and ‘livelock.’
As developers, we tend to believe that we can know everything about a system. We tend to believe that every problem is, essentially, solvable. We hate admitting that we can’t get the answer to something.
But sometimes, we just can’t. Sometimes the answer depends on so many factors that are outside of our control, and that we can’t measure, that there is no way to answer the question with 100% accuracy.
When faced with problems like this, I try to follow up the initial question of “can we know this” with a few more questions: “what do we know,” “what don’t we know,” “who knows this,” and “what is it we really want to know or do?” These will often suggest a better solution to the problem than one which requires unknowable information.
For instance, in our initial example, we don’t know if our data is up to date, because we don’t know if someone else has updated the data since our last refresh. And we can’t know that, because it will take an amount of time for any updates to reach us – the best we can ever do is say that we know what the data looked like at some point in the past.
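To make the staleness problem concrete, here is a minimal sketch of what can go wrong when two clients each work from their own copy. The `Server` class and its `read` and `write` methods are hypothetical stand-ins for whatever actually holds the data; nothing here is a real API.

```python
class Server:
    """Hypothetical authoritative store holding a single integer value."""

    def __init__(self, value):
        self.value = value

    def read(self):
        return self.value

    def write(self, new_value):
        # Blindly overwrites whatever is there with the caller's number.
        self.value = new_value


server = Server(100)

# Two clients refresh their local copies at about the same time.
copy_a = server.read()
copy_b = server.read()

# Each computes "what I believe the value is, minus 5" from its own copy.
server.write(copy_a - 5)
server.write(copy_b - 5)  # overwrites client A's update

print(server.value)  # 95, although two decrements of 5 were intended
```

Neither client did anything wrong with the information it had. Each simply could not know that its copy had gone stale before its write arrived.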
But what we do know is different. We know that we want to decrease the value by 5. And we know that in most cases, there’s an authoritative data source somewhere. This suggests a solution: instead of modifying the data locally, send a request to the source of the data. The request should not set the value to a specific amount (what we believe the current value is, minus 5); it should ask the source to decrease the value by 5. Because the data source should always have the current value, it will know what to do to decrease the current value by 5.
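As a rough sketch of that idea, assuming a hypothetical `DataSource` class with a `decrease` method (both names are made up for illustration), the clients send the change they want instead of the result they computed:

```python
import threading


class DataSource:
    """Hypothetical authoritative source that applies relative changes."""

    def __init__(self, value):
        self._value = value
        # Internal to the source: serializes updates in one place.
        # Clients never see or hold this lock.
        self._lock = threading.Lock()

    def decrease(self, amount):
        # The source applies the change to whatever the value is right now.
        with self._lock:
            self._value -= amount
            return self._value

    def current(self):
        with self._lock:
            return self._value


source = DataSource(100)

# Two clients each ask for "decrease by 5" without knowing the current value.
clients = [threading.Thread(target=source.decrease, args=(5,)) for _ in range(2)]
for client in clients:
    client.start()
for client in clients:
    client.join()

print(source.current())  # 90: both decrements were applied
```

The clients never need to know whether their copy is fresh, because they never send a value derived from it. In a SQL database the same idea is usually written as `UPDATE ... SET value = value - 5` rather than reading the value, subtracting locally, and writing the result back.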
If we don’t ask these questions, we can easily start down the road of trying to know the unknowable. We become so set on having our local machine get the latest value and write back that value minus 5 that we do all sorts of crazy research into synchronization and locking mechanisms. Generally, this results in madness. Every solution that covers some percentage of cases leaves others broken, and you can end up chasing your tail trying to patch the corner cases, or dealing with issues that exist only because of how you’re approaching the problem – for instance, the locking issues I mentioned earlier.