“Software components enable practical reuse of software ‘parts’ and amortization of investments over multiple applications. There are other units of reuse, such as source code libraries, designs or architectures. Therefore to be specific, software components are binary units of independent production, acquisition and deployment that interact to form a functioning system. Insisting on independence and binary form is essential to allow for multiple independent vendors and robust integration”
Clemens Szyperski’s book goes on to examine different component systems like CORBA, DCOM, JavaBeans and more.
I’d say that in most people’s minds, the goal of component software is to approximate the Lego experience. We want to assemble building blocks to make something. Of course there’s a difference: we also think of ourselves as Lego block creators, able to whip up a new component at will. This is probably the first mistake. Designing components is 10 times harder than using them. I’ll extend that to say that designing component frameworks is 10 times harder than designing components.
My approach to software engineering is that for most problems the highest priority is maintainability of the code. As such, my radar is always watching for complexity that has little to do with the problem and a lot to do with the plumbing. In my experience, most component frameworks look simple at first but hide a ton of complexity that is all about the plumbing. It’s relatively easy to use a COM component and not hard to build one, but things get messy very soon after that. COM has a lot of rules, which is another way of saying that COM has a lot of patterns and anti-patterns that are easy to get wrong.
When people confuse object oriented languages with component architectures, things get even worse. If the size of the rule book is an indication of the potential for problems, then C++ surely takes the cake. No C++ developer ever truly knows all the rules, and there is often dissent about what the rules are in the first place. C++ works best when the development team has access to all the source, can debug through every class, and someone is thinking about the whole.
The C++ problem is actually at the root of why I think most component architectures are flawed. Many of them have roots in object oriented programming environments like C++. Most of them have the word “object” in their name somewhere. Designers wanted to extend object oriented programming concepts and beliefs into a Lego compositional model of software.
Remember when Windows DLLs were considered components?
Anyone who has written code for a while knows two things: we all pine for a Lego composition model and it doesn’t exist yet.
Object oriented programming, modularization and encapsulation are all code organization conventions intended to make large code projects easier to manage. The shared theme is a good one: we can do great things if we can abstract messy details away.
Component software is at its worst when developers start to believe they can upgrade components in place without affecting dependent applications or components. I wish I could explain in formal terms what experience has shown me over and over again: the more complex the interactions between components, or the more complex the change, the more likely it is to have unexpected consequences. Consumers of a component consume not just the published interface but the real-life behaviour. Every bug and timing characteristic is part of the contract between components, and every security fix, upgrade and improvement runs the risk of changing that.
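To make that concrete, here’s a toy sketch (hypothetical names, Python purely for illustration) of a consumer that quietly depends on behaviour that was never part of the published interface:

```python
# Hypothetical component: the published contract for find_users is only
# "returns the matching user ids" -- ordering is left unspecified.
def find_users_v1(records, name):
    # v1 implementation detail: scanning in insertion order happens
    # to return ids in ascending order.
    return [uid for uid, n in records if n == name]

def find_users_v2(records, name):
    # An "equivalent" in-place upgrade: same interface, same published
    # contract, but the internal scan order changed.
    return [uid for uid, n in reversed(records) if n == name]

records = [(1, "ann"), (2, "bob"), (3, "ann")]

# A consumer that came to rely on the unpublished ordering:
print(find_users_v1(records, "ann")[0])  # 1
print(find_users_v2(records, "ann")[0])  # 3 -- the "compatible" upgrade broke it
```

Nothing in the documented interface changed, yet the consumer’s behaviour did; the bug-for-bug behaviour was the real contract.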
I’ve been bloodied and humbled by my own efforts in this space. I’ve worked on frameworks, components and applications and learned some hard lessons along the way. The outcome of this is what I call Michael’s rules of component software:
The goal of components is to allow isolated and stable modules of software that were created independently to be composed to create a solution.
- Testable on their own
- As useful as they are stable
I believe that there are only three good ways to compose components:
Files. When two components communicate via a file, the file format becomes the interface. It’s super easy to test and debug, and it’s very hard to introduce time-sensitive behaviour. Obviously a lot depends on the file format. Simple text files are better. Sending code (even when disguised as XML) via files is cheating.
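A minimal sketch of the idea (the file name and line-oriented format are invented for illustration): the producer and consumer share no code at all, only an agreement about the text format.

```python
import os
import tempfile

# Component A: writes its result as a simple line-oriented text file.
def producer(path, items):
    with open(path, "w") as f:
        for item in items:
            f.write(f"{item}\n")

# Component B: knows nothing about A except the file format.
def consumer(path):
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

path = os.path.join(tempfile.mkdtemp(), "handoff.txt")
producer(path, ["alpha", "beta"])
print(consumer(path))  # ['alpha', 'beta']
```

Either side can be rewritten, debugged or tested in isolation by inspecting the file with any text editor.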
Command lines. Sometimes files aren’t quite enough. You need the “answer” now, or you find yourself adding complexity to the file format just to parameterize what is clearly a request. Command lines let you extend the file model, they preserve the process boundary and, again, they are pretty easy to test in isolation. The standard Unix command set represents a very stable framework of components upon which many applications have been built.
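A sketch of that composition style, assuming a Unix-like system with the standard sort command available: the calling code and the command share nothing but text on stdin/stdout and the process boundary.

```python
import subprocess

# Compose with an independent component the Unix way: our text goes in
# on stdin, the component's answer comes back on stdout. The text
# format is the only shared contract.
words = "banana\napple\ncherry\n"
result = subprocess.run(
    ["sort"],            # the standard Unix sort, used as a component
    input=words,
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.splitlines())  # ['apple', 'banana', 'cherry']
```

The same component is just as easy to exercise by hand from a shell, which is a large part of why these pieces stay testable.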
REST over HTTP. When multiple clients are accessing the same shared resource, command lines just won’t do. This happens most often over a network but can also happen on the same machine. REST is the expression of a lot of real-world experience with network and server components. You have to get very formal about how state moves around. Although REST doesn’t require HTTP, there are rarely good reasons to invent a different protocol and plenty of problems when you do.
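A toy sketch using only Python’s standard library (the resource path and its content are invented for illustration): the only coupling between the two sides is the URL, the HTTP verb and the JSON representation.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# A toy resource store; clients never see this dict, only its
# JSON representation over HTTP.
STORE = {"/greeting": {"text": "hello"}}

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        resource = STORE.get(self.path)
        if resource is None:
            self.send_response(404)
            self.end_headers()
            return
        body = json.dumps(resource).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

# Bind to an ephemeral port and serve in the background.
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/greeting"
with urllib.request.urlopen(url) as resp:
    data = json.loads(resp.read())
print(data)  # {'text': 'hello'}
server.shutdown()
```

Any number of clients, in any language, can hit the same resource; all the state transfer is explicit in the request and response.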
There are exceptions to this rule. There are legacy protocols like SMTP that have been sufficiently baked that there’s no point in “fixing” them. There are new components like memcached that warrant dedicated protocols instead of HTTP (although I believe that memcached is still RESTful). The simple approach is to be very deliberate when departing from the above rules.
Updated July 8th, 2009
I think it’s time to add XMPP to the list as #4, but with a lot of caution. The scenarios where XMPP is truly needed are likely to be quite complicated, and more likely to result in distributed code than in truly separate components.