01.29.09
Posted in General development at 1:45 pm by Kyoryu
While most people quote Occam’s Razor as “the simplest thing is most likely correct,” the actual quote is “do not multiply entities needlessly.”
I’m not sure that this is good programming advice. I do, however, think it’s an accurate description of how most developers behave. Typically, a developer will create as few discrete entities as possible. They will use one class rather than two. They will create a single large interface rather than multiple small interfaces. They will create a single large function rather than break it down into multiple, smaller functions.
It doesn’t seem to be a matter of typing, or of saving characters. Developers seem to prefer a single, very large method to two smaller methods, even if splitting it in two would result in fewer total lines of code.
This probably boils down to perceived overhead – creating an ‘x’ may be perceived as managerial type overhead, as opposed to “lines of code” which are real work. If so, it would suggest that the less overhead that’s required to create an entity, the more likely it is that multiple entities will be created.
This is something to keep in mind when designing APIs, user experiences, languages, or other tools that you expect others to use.
01.13.09
Posted in Uncategorized at 6:55 pm by Kyoryu
“I can do X with Y.”
“X is Y.”
That’s all.
12.03.08
Posted in Uncategorized at 8:01 pm by Kyoryu
As a follow-up to the last post, consider this:
Classes map to nouns. They say what something is (a Person, for instance).
Interfaces map more closely to adjectives – they describe the noun (specifically, what the noun can do).
So, you wouldn’t have the class Person implement IPerson. That’s basically saying that a person is person-y, which doesn’t make a lot of sense.
A person might implement IFeedable, IHirable, etc.
One thing that occurs to me as I write this post is that interfaces don’t describe what their nouns can do so much as they describe what can be done to the nouns. This gets into the idea I have that most languages are very good at describing incoming messages to an object, but don’t do a very good job at describing outgoing messages. But, that’s a topic for another day.
If this is true, then interfaces and classes serve very different purposes. An interface shouldn’t be simply seen as the ultimate abstract base class.
If an interface is used to describe what can be done to a class (a Person might be IHirable), then in many cases, a class may implement multiple interfaces. If that’s the case, then each interface should consist only of the actions that are atomic with that concept: actions that only make sense together, and never make sense without each other.
For instance, IHirable obviously needs some sort of Hire() method. And, it might make sense for it to have a Fire() method. It doesn’t make sense (legal reasons and internal politics aside) to be able to hire someone without being able to fire them. And, similarly, you can’t fire someone that wasn’t able to be hired in the first place. So, Fire() should go in IHirable, as it’s tightly coupled to Hire().
What about Pay()? It doesn’t make sense that you can hire someone and not pay them, so it might make sense to put it in IHirable. But, there are a lot of people that you pay that you don’t hire, and in fact, can’t hire. The pizza delivery guy, the water company, etc. So, maybe Pay just belongs in IPayable, which IHirable can extend.
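As a rough illustration of that split, here is a sketch in Python rather than C#. IPayable, IHirable, and Person come from the text; WaterCompany, settle_accounts, and the balance and employed fields are made up for the example.

```python
from abc import ABC, abstractmethod


class IPayable(ABC):
    """Anything that can receive money: employees, vendors, utilities."""

    @abstractmethod
    def pay(self, amount: float) -> None: ...


class IHirable(IPayable):
    """Hiring implies paying, so IHirable extends IPayable."""

    @abstractmethod
    def hire(self) -> None: ...

    @abstractmethod
    def fire(self) -> None: ...


class Person(IHirable):
    def __init__(self, name: str) -> None:
        self.name = name
        self.employed = False
        self.balance = 0.0

    def hire(self) -> None:
        self.employed = True

    def fire(self) -> None:
        self.employed = False

    def pay(self, amount: float) -> None:
        self.balance += amount


class WaterCompany(IPayable):
    """Payable, but not hirable (hypothetical example class)."""

    def __init__(self) -> None:
        self.balance = 0.0

    def pay(self, amount: float) -> None:
        self.balance += amount


def settle_accounts(payees: list[IPayable], amount: float) -> None:
    # Depends only on IPayable, so it accepts anything that can be paid.
    for payee in payees:
        payee.pay(amount)
```

Code that only pays people never has to know that hiring exists, which is the point of keeping Pay() out of IHirable itself.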
Here’s a statement I’ll put forth, but I’m not backing 100% yet:
Having an interface with the same name as a class is a design smell.
10.10.08
Posted in General development at 5:41 am by Kyoryu
Using an interface seems like one of those rules. Everybody knows they should do it, because it makes your code more abstracted and… stuff.
However, if you have a gigantic class with a ton of methods, or methods that are very specific to its implementation, then simply providing an interface that mirrors the public methods of the class is of little value. To successfully swap one implementation for another, you would need to understand the behavior of the first implementation so well that you could accurately mimic it – and, frankly, that’s not very likely unless you’re the one to write the first implementation anyway (and probably not too likely even then). While you’ve avoided implementation coupling, you’ve got a kind of conceptual coupling in its place. This is doubly true if the interface specifies that it returns objects that implement another interface (likely as thick as the first).
So, using interfaces as a kind of header doesn’t really help us too much in this case. We’re still realistically tied to an implementation, and now we have interface versioning issues to deal with (which, for C# at least, are worse than class versioning issues, as adding a member to a class does not break backwards compatibility – but it does for an interface).
That doesn’t mean that I’m against interfaces. In fact, I love interfaces. I just think that there are better ways to use them than as sim-headers.
Interfaces should be used to define questions that, as the class you’re writing, you want to ask of your dependencies. This is a bit of an inversion – typically, interfaces are defined from the POV of the class implementing them, not the class using them. But, by controlling the interfaces you use, there’s less chance of them breaking and causing major headaches throughout your codebase.
Interfaces should also be as small as possible, and represent a single aspect of what you can do with an object. IEnumerable<> is a great example – it only lists things that you need to do to enumerate a collection. And because of that, it can be very stable. The more things an interface does, the more likely it is to need to change, and the more code that will be broken when it does.
So, if you’re using interfaces in this way, how do you implement them? Especially if the class you’re dealing with didn’t own the interface to begin with? This isn’t too hard – write a small adapter class that implements the interface, and calls the underlying object in the appropriate way. This has the added advantage of keeping all your dependent code in one spot, making it much, much easier to fix if the dependency ever changes underneath you (assuming that you don’t control it).
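A minimal sketch of the adapter idea, in Python rather than C#. Every name here (INameLookup, Greeter, ThirdPartyDirectory, DirectoryNameLookup) is hypothetical; ThirdPartyDirectory stands in for a class you do not control.

```python
from abc import ABC, abstractmethod


class INameLookup(ABC):
    """The one question Greeter wants to ask of its dependency."""

    @abstractmethod
    def name_for(self, user_id: int) -> str: ...


class Greeter:
    """Owns INameLookup and depends on nothing else."""

    def __init__(self, names: INameLookup) -> None:
        self._names = names

    def greet(self, user_id: int) -> str:
        return f"Hello, {self._names.name_for(user_id)}!"


class ThirdPartyDirectory:
    """Stand-in for an external class with its own, wider API."""

    def __init__(self) -> None:
        self._records = {1: {"first": "Ada", "last": "Lovelace"}}

    def fetch_record(self, key: int) -> dict:
        return self._records[key]


class DirectoryNameLookup(INameLookup):
    """Adapter: the only place that knows ThirdPartyDirectory's API."""

    def __init__(self, directory: ThirdPartyDirectory) -> None:
        self._directory = directory

    def name_for(self, user_id: int) -> str:
        record = self._directory.fetch_record(user_id)
        return f"{record['first']} {record['last']}"
```

If fetch_record ever changes shape, only DirectoryNameLookup needs to be touched; Greeter and its interface stay as they are.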
06.20.08
Posted in General development at 5:51 pm by Kyoryu
“You can’t turn a pig’s ear into a silk purse.”
“It’s not a global, it’s a singleton!”
Description:
An “acceptable” design pattern is placed on top of a concept that is generally avoided
Symptoms:
- Problems associated with a known poor development practice start cropping up.
- Problem areas are defended by spouting off the name of the design pattern that they superficially resemble.
- Patterns are used when the problem that the pattern solves is not demonstrated, but for tangential reasons.
Examples:
The most common form of silk-pursing is singletonitis. Too often, globals are wrapped up in singletons, because that somehow makes them “okay.” Inappropriate use of Service Providers seems to be the next version of singletonitis, in that it is often used to propagate globals rather than the actual purposes (extensibility, etc.)
Usage of try/catch/finally to simulate goto is another example.
Silk-pursing is related to cargo-cult programming. In both cases, a useful pattern is used inappropriately. Cargo-cult programming usually involves adding patterns/structures/algorithms/etc. for no apparent reason whatsoever. Silk-pursing is different in that the pattern being abused is applied solely for the purpose of hiding a practice that is frowned upon.
Silk-pursing may seem like gold-plating, but it is different. Gold-plating involves putting extra, unnecessary features into code. Silk-pursing simply hides poor development practices.
Silk-pursing may or may not be a deliberate attempt to conceal. In many cases, developers will actively believe that because they are using something they’ve heard is a beneficial practice, that what they are doing is actually better.
Fixing:
Treat the code as if it were the underlying development practice – treat silk-purse singletons as globals, etc.
Educate developers on the purpose of the design patterns that are being abused.
Educate developers on the fact that concealing a bad practice doesn’t make it any better.
Educate developers on ways to design code that doesn’t involve using the poor practice.
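To make "treat silk-purse singletons as globals" concrete, here is a small Python sketch. Settings and both greet functions are invented for illustration; the point is only that the singleton version behaves exactly like a mutable global, while the last function shows one way to avoid it by passing the dependency in.

```python
class Settings:
    """A 'singleton' that is really a global in disguise (hypothetical)."""

    _instance = None

    def __init__(self) -> None:
        self.greeting = "Hello"

    @classmethod
    def instance(cls) -> "Settings":
        # Lazily create the one shared instance.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance


def greet_with_singleton(name: str) -> str:
    # Reaches into shared state; the dependency is invisible to callers.
    return f"{Settings.instance().greeting}, {name}"


def greet(name: str, settings: Settings) -> str:
    # Same behavior, but the dependency is visible in the signature
    # and each caller can supply its own Settings object.
    return f"{settings.greeting}, {name}"
```

Any code anywhere that writes to Settings.instance().greeting changes what greet_with_singleton returns, which is the usual complaint about globals.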
05.23.08
Posted in General development at 3:04 pm by Kyoryu
Dependencies and coupling seem to cause the greatest pain in software development. I think it’s useful to look at the types of dependencies that can exist.
Contained Dependencies
A contained dependency is a dependency that a class has, but which is not communicated externally. The class uses the contained object, and so is dependent upon it, but does not propagate the dependency. This is the most benign dependency: if the dependency breaks, the class may break, but it cannot (directly) break other objects.
Direct Dependency
A direct dependency is a dependency which is exposed directly by the class, either in a return value, a parameter, or a base class. Direct dependencies are worse than contained dependencies, as they can directly break classes that use the class under discussion, both in terms of compilation and functionality.
Indirect Dependency
An indirect dependency is a dependency exposed by another dependency. This is worse than direct dependencies, as this is how dependencies propagate, causing the system to become brittle.
Hidden Dependency
A hidden dependency is arguably the worst kind of dependency. A hidden dependency is a contained dependency that can cause side effects, causing other components to fail. Globals are, generally, hidden dependencies.
05.14.08
Posted in Uncategorized at 1:04 pm by Kyoryu
One of the things that’s been floating around in my mind is the difference between object types and value types. Conceptually, it’s the difference between a dialog and an integer or string. In my mind, I understand the intent, but I dislike categorizing by intent, as that leads to fuzzy definitions that eventually become meaningless.
The other way to define value types is by storage – stack vs. heap. I really don’t like this, though, as one of the advantages of value types is that it’s safe to pass them by reference. I would say a string in C# is a value type, even though it’s a class (and therefore passed by reference).
So, here’s a set of criteria to determine if something is a value type or not. These criteria don’t specify whether a particular type could be implemented as a value type, only whether or not they are a value type as implemented.
- The type must be immutable. Any methods called on the type cannot change its members. If a modification is desired, the method should create a new object of the same type, and return that object.
- The type must only reference other value types.
If your type follows these two rules, it is guaranteed to be side-effect free.
Rule #2 might be relaxed, if the reference type that the type holds is guaranteed to never be exposed (so nothing can ever modify it), and the object never modifies it itself.
Edit a year later: Another exemption is that a type can be a value type if it never acts upon any reference types that it contains. A list can be a value type, then, as it doesn’t actually act on any of its contents.
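Here is what a type meeting both rules might look like, sketched in Python with a frozen dataclass. Money and its fields are hypothetical; the post does not name a specific type.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Money:
    """Rule 1: immutable. Rule 2: holds only immutable values (int, str)."""

    amount: int  # in cents
    currency: str

    def add(self, other: "Money") -> "Money":
        if other.currency != self.currency:
            raise ValueError("currency mismatch")
        # A "modification" returns a new object and leaves self untouched.
        return replace(self, amount=self.amount + other.amount)
```

Because nothing can change a Money after construction, it does not matter how many places hold a reference to the same instance.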
04.26.08
Posted in General development at 3:11 pm by Kyoryu
Ahhh, probably the most controversial subject in development. I don’t know of any single issue that is more likely to get people riled up, either for or against it.
My experience is pretty simple. I’ve never done pair programming “full-time.” But, like most programmers, I’ve done it at times – working with another programmer over a problem.
When I’ve done that, I’ve generally found that a few things happened:
- My knowledge increased
- Hopefully, the knowledge of the other guy increased
- We both understood the system better
- We generally produced better quality code than I was used to seeing from either of us, individually
- We remained more highly focused
There have been a number of studies done on pair programming. Most of them use similar methodologies, and reach similar conclusions. I’ll dig up references later.
In general, they take developers or students, and divide them into two groups. One of the groups will pair up and work on a project, while the other group will approach the project individually.
In general, these projects are rather small. In one experiment, they were class assignments, over a period of time.
The results generally found were that the pair took less clock-time, but more man-hours to complete the task. Generally, the results from pairs were of higher quality.
One particular study continued the experiment over time, and found that the tax on man-hours dropped from 40% at the beginning, to only about 15% at the end of the experiment.
Now, that’s pretty impressive by itself. If a project would normally take 40 hours for a single developer, and with two you can get it done in about 23 hours (40 hours plus the 15% tax, split between two people) and with higher quality, I think that’s a win.
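The arithmetic behind that is easy to check. This sketch takes the 40% and 15% overhead figures at face value and assumes the pair splits the total man-hours evenly; the function name is made up.

```python
def pair_clock_hours(solo_hours: float, overhead_percent: float) -> float:
    """Clock time for a pair, given solo time and man-hour overhead in percent."""
    total_man_hours = solo_hours * (100 + overhead_percent) / 100
    # Two people work in parallel, so clock time is half the man-hours.
    return total_man_hours / 2
```

For a 40-hour solo task, a 40% overhead gives 56 man-hours (28 clock hours), and a 15% overhead gives 46 man-hours (23 clock hours).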
But, I think that the experiments described are testing the wrong things.
I see the real benefit of pair programming as coming not from initial productivity, but from the ability to sustain productivity over time, and to allow higher levels of scaling.
One effect noted was that for pairs to become highly effective, it took some amount of time for both the developers to get used to pair programming, as well as to get used to each other. When doing an experiment on a small or micro basis (projects taking <1 day to complete), the initial cost of this can easily outweigh any benefits.
Secondly, one advantage that I see is increased knowledge of systems. By pairing, especially if pairs are not static, developers will work on multiple areas of the project, and will gain understanding of the “big picture” of what they’re doing, rather than their isolated area. This should increase overall quality, as well as remove the “hit-by-a-bus” category of risks. When working on small projects that can be completed in under a week of work, this is mostly irrelevant.
Third, by pairing developers, you will remove some level of the communication tax. Given four developers, if they work individually, they must all coordinate. If they are paired off, you now only have two groups that need to coordinate, instead of 4 individuals. Because the “individual” developers worked as exactly that, they had zero communication tax. Comparing paired developers to two developers collaborating as individuals would be an interesting metric.
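One common way to put a number on this is to count pairwise coordination channels. That model is an assumption layered on top of the argument here (no formula is given above), and the function name is made up.

```python
def coordination_channels(units: int) -> int:
    """Number of pairwise channels if every unit coordinates with every other."""
    return units * (units - 1) // 2
```

Under that model, four individuals have 6 channels to maintain, two pairs have 1 between them, and a lone developer has 0.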
Fourth, maintenance is very important. While “passing tests” is an important measure of quality, the ability to maintain and expand code is also extremely important, if hard to measure. If pairing increases quality, then this additional quality should (in theory) allow code to be more easily changed in the future – especially when a larger number of developers have insight into it. Again, on small projects, this benefit is unimportant.
None of these benefits can be measured on toy or small projects.
So, how would I design a pair programming test, in an ideal world?
I think you’d only need to change a few things.
First, projects need to be somewhat longer – at least 40 hours to completion. And that is minimum.
Secondly, to be fair, the test needs to compare equal-sized groups of developers to each other, working individually or in pairs. In my experience, I’m not sure that adding a single additional developer actually increases productivity, but this is a more realistic comparison.
And last, ideally you’d want to scale out even further – four developers or more, working in pairs, against a team of the same size working as individuals. Generally, in a realistic environment, the question is not “should we hire twice as many developers and have them pair?” The typical question is “we have this many developers – should they work alone, or in pairs?”
I honestly don’t know how these results would turn out. But, I suspect that they’d turn out very well for pair programming, especially if combined with other practices (such as TDD).
Posted in General development at 1:43 am by Kyoryu
In skiing, there’s a concept called the “fall line.” The fall line just means down. But it doesn’t mean “towards the bottom of the hill,” it just means what is immediately down from the exact location you’re at – if you dropped a ball, or poured some water, which way would it go?
This is important in skiing, because you have to stand perpendicular to the fall line if you don’t want to keep moving.
There’s also a fall line in development. The fall line in development is simply what is the easiest thing to do at the time to solve the immediate problem. This can also vary by individual knowledge level.
In C, if you want to add some function, the fall line is just to declare it before you need it, and go on your merry way. If you need it multiple places in the same file, put a declaration at the top of the file. Only if you actually need it in another file is there any reason to expose it via header.
C++ is kind of opposite in this aspect – if you want something to be a class member, it has to be included in the header. So a lot of little internal methods that, in C, may have been completely hidden from the world become exposed, at least to the extent that they’re in your header. Yes, you can do things like separate implementation classes, but that’s, again, adding more work.
Because of this, I firmly believe that in any kind of API development, you want the easy thing to be the safe thing to do. If something takes more effort to do, people will avoid it unless they absolutely need to do it.
Sometimes, the easy thing to do in a language, or application, isn’t the right thing to do. Sometimes it’s the wrong thing to do. And the typical answer to that is to add process, and force people to do the right thing.
I’m going to suggest that all process can do is add inefficiency, and by doing so, change the fall line of development.
That’s not necessarily a bad thing – if the fall line leads you to write bad code, then adding inefficiencies to gain later benefits can be very useful!
But, there are two things to keep in mind:
First, if people don’t understand the purpose of the process, they’ll just follow the steps of the process while continuing, essentially, the same underlying behaviors. Imposing a process alone will generally not change the mindset of anybody. If you’re really looking to get a change in behavior, you’ll have to do that through education.
Secondly, the reason that process works is because it adds inefficiencies, and people will avoid inefficiencies. The more heavyweight a process is, the less it will be used. If your checkin process takes a day to navigate, then people will avoid checkins, and do them in large batches. And that might partially defeat the purpose of your checkin process in the first place.
You should always use the minimum process that you can, and use process deliberately to add inefficiencies. The goal of a process should never be to get people to do what they should do, but rather to get people to not do things that they shouldn’t.
I’m a big TDD proponent. I would never suggest that somebody institute a TDD process that was mandatory. If you don’t “get” TDD, then all you’ll do is write bad tests, and make the unit test suite less usable to me, without even getting any of the benefits.
Instead, I’d make a process (probably as part of the build) that ran the appropriate suites, and make some kind of check to make sure they were either run prior to checkin, or immediately afterwards.
Yes, I want people to write tests. But the best way to get them to do that is by example, and showing them how useful the tests can be. What I don’t want is for people to break the tests, rendering them useless. And, I don’t want poorly written tests that are likely to fail due to configuration issues, or take forever to run.
If I make users manually run tests, it’s another process that they have to do – they’ll put it off as long as they can. And if there’s a break, they’ll probably have worked well past the original error, and have to rework a lot of code in order to get everything to work again. They’ll end up hating the unit tests, and the quality of the suite will degrade.
If the tests are in another directory, developers will have to sync two directories, and switch directories to build/run tests. Again, this will make them run them less often.
By making the tests run as part of the build process, I make it hard to break the tests. If the tests break, you don’t get a successful build. Now, you have to circumvent the build to get anything done. If it’s your change that broke the tests, it’s probably easier to just fix your code, since you’ll have to do it anyway.
And, if the tests are part of the build, slow or failure prone tests will become painful for everyone – as long as it’s easier to move the tests to the appropriate suite than it is to circumvent the build, they’ll do that.
03.19.08
Posted in Uncategorized at 11:55 pm by Kyoryu
And this time I’m going to drift into theory-land for a bit, and I’m not particularly fond of theory-land. I prefer "getting-stuff-done-land," which is quite often located on another continent.
Conceptually, objects are these things that sit in space and receive and spit out messages. And if we look at our User class in that way, it kind of makes sense. We define the behavior of the User object as receiving a message to set its name, and when it does so, it should then emit a message that it should be saved.
And this is a pretty convenient way to think of objects. It also isn’t modeled well by most object-oriented languages.
C++-derived languages handle relationships of "is-a" and "has-a" types very well. But, what we really want here is a "talks-to-a" relationship. There’s no real first-class way to do that.
Also, they handle incoming messages very well, by means of the public interface of an object. But, outgoing messages are a little tougher. Sure, you can do all sorts of things with message receivers and emitters, but realistically, people are going to use the built-in constructs of the language. So, if we want to model objects as message receivers (easy) and emitters (not easy), we should do it in as basic a way as possible.
And now we get back to "getting-stuff-done-land." Looking at our User class from the past two entries, we can look at the interface we define as the outgoing messages that the object can emit. Viewed in that way, it makes sense that the object would own that interface – who else would? And why would an outgoing message from a user know anything about a database?
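The User class from the earlier entries is not shown here, so this is a reconstruction under assumptions: a Python sketch in which User owns an interface describing its outgoing messages. IUserOutbox, save_requested, and RecordingOutbox are invented names.

```python
from abc import ABC, abstractmethod


class IUserOutbox(ABC):
    """Outgoing messages a User can emit; it says nothing about databases."""

    @abstractmethod
    def save_requested(self, user: "User") -> None: ...


class User:
    def __init__(self, name: str, outbox: IUserOutbox) -> None:
        self._name = name
        self._outbox = outbox

    @property
    def name(self) -> str:
        return self._name

    def set_name(self, name: str) -> None:
        # Incoming message: set the name.
        self._name = name
        # Outgoing message: this user should be saved.
        self._outbox.save_requested(self)


class RecordingOutbox(IUserOutbox):
    """One possible receiver; a database-backed one would look the same to User."""

    def __init__(self) -> None:
        self.saved: list[str] = []

    def save_requested(self, user: User) -> None:
        self.saved.append(user.name)
```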
There are other ways to model an outgoing message, and they all pretty much boil down to function pointers.
First, you can use virtual protected methods to define messages, and then override them in derived classes to translate them to the incoming messages that are desired by another component/object. This method has the advantages of looking more traditional, but the disadvantage of not being able to mix-and-match receivers at runtime.
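A sketch of the overridable-method approach in Python, where a leading underscore plays the role of "protected." LoggedUser and _on_save_requested are hypothetical names.

```python
class User:
    def __init__(self, name: str) -> None:
        self._name = name

    def set_name(self, name: str) -> None:
        self._name = name
        self._on_save_requested()

    def _on_save_requested(self) -> None:
        """Outgoing message; the base class has no receiver, so it does nothing."""


class LoggedUser(User):
    """Translates the outgoing message into a call on a specific receiver."""

    def __init__(self, name: str, log: list) -> None:
        super().__init__(name)
        self._log = log

    def _on_save_requested(self) -> None:
        self._log.append(f"save {self._name}")
```

The receiver is fixed by which subclass gets instantiated, which is the mix-and-match limitation mentioned above.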
Secondly, you can use events and delegates. Combined with lambda expressions in C# 3, this can be an effective way to define outgoing messages. It has the advantage of not requiring additional classes, but the disadvantage is that you have to hook up each event individually. Also, events can provide multiple dispatch, but that can become wonky if you want to remove individual anonymous delegates, so I’m counting that as a neutral.
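Python has no built-in event keyword, so this sketch approximates a C#-style event with a plain list of callables. The save_requested attribute name is an assumption.

```python
from typing import Callable


class User:
    def __init__(self, name: str) -> None:
        self._name = name
        # Stand-in for an event: zero or more subscribed handlers.
        self.save_requested: list[Callable[["User"], None]] = []

    @property
    def name(self) -> str:
        return self._name

    def set_name(self, name: str) -> None:
        self._name = name
        # Iterate over a copy so handlers may unsubscribe while being called.
        for handler in list(self.save_requested):
            handler(self)
```

Hooking up a receiver is one line, such as user.save_requested.append(lambda u: saved.append(u.name)), but removing that lambda later requires having kept a reference to it, which is the wonkiness with anonymous delegates.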