August 31, 2003

why distributed computing?

Why is distributed computing such a big deal? My focus here is enterprise computing systems, not internet-based systems. The dream of distributed objects and "n-tier" architectures was to have all of these services floating around. I was a huge proponent; it's probably what made me want to be a programmer. Today I'm asking: why? WHY do they need to be distributed?

Why can't they be co-located? Can't we design things to be distributed because there's an actual reason behind it, rather than politics?

The over-complexification of software is continuing its takeover of our enterprise systems. Why can't we take the simple solution to problem XYZ -- write ten stored procedures, wrap them with a SOAP service, and call it a day?
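For concreteness, here's a minimal sketch of what that "simple solution" might look like in Java: a thin service class that delegates all the business logic to a stored procedure via JDBC. The procedure name, its parameters, and the JDBC URL are all hypothetical, invented purely for illustration; the class would then be exposed as a SOAP endpoint by whatever tooling your app server provides.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;

// A minimal sketch: one thin service method wrapping one stored procedure.
// The procedure name (update_order_status) and its parameters are hypothetical.
// All of the business logic lives in the database; this class is just plumbing
// that the app server's SOAP tooling can expose as an endpoint.
public class OrderService {

    private final String jdbcUrl;

    public OrderService(String jdbcUrl) {
        this.jdbcUrl = jdbcUrl;
    }

    public void updateOrderStatus(long orderId, String status) throws Exception {
        Connection conn = DriverManager.getConnection(jdbcUrl);
        try {
            // JDBC escape syntax for invoking a stored procedure.
            CallableStatement call = conn.prepareCall("{call update_order_status(?, ?)}");
            call.setLong(1, orderId);
            call.setString(2, status);
            call.execute();
            call.close();
        } finally {
            conn.close();
        }
    }
}
```

That's the whole architecture: one tier of logic, one interoperable front door, nothing floating around.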

I see way too many systems that require physical component separation if they're going to get any logical separation. (A client I'm dealing with now has this problem: dozens of CORBA services, an EJB cache, an MQSeries-based bus, a JMS-based bus somewhere else, and pieces of C++, Java, and .NET code everywhere. All on, literally, hundreds of physical servers costing millions of dollars. The amount of hardware being squandered is staggering for what piddly performance they get.)

Greg Pfister, author of In Search of Clusters, had a "standard litany" of reasons to "go distributed". Scalability was the big buzzword, though I rarely see any systematic analysis of scalability beyond a few load-test scripts on one particular island in the distributed system - rarely on the end-to-end distributed system itself. Pfister's main point is still correct: the reason we don't have scalable & reliable clusters is that the clustering software still sucks. And yet we think our enterprise developers can do a better job than the cluster vendors. In our enterprises, we only have clustering software on the "islands". We don't leverage it in a larger sense by recombining our logical components into cluster nodes and designing the distributed system in a truly co-ordinated manner. (What's fascinating about this is that Gartner actually predicted this trend -- that we'll cobble together our own approaches rather than take a vendor's clustering approach to a large scale.)

So here's my theory: many developers and architects need that physical split to wrap their heads around logical separation of concerns.

We've seen this kind of mental block for years in other areas of software: the need for physical libraries vs. logical modules. Or the inability of people to see data as "set oriented", insisting instead on pointer-based access or a cursor that walks the data one record at a time. There are other examples.
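To make the set-oriented point concrete, here's a sketch of the same task done both ways. The table, columns, and the interest-posting scenario are invented for illustration; the contrast is the point. The first method hands the database one set-oriented statement; the second is the cursor habit, dragging each row across the wire, computing in the client, and writing each row back.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

// Illustrative only: the "accounts" table and its columns are hypothetical.
public class InterestPosting {

    // Set-oriented: one statement. The database plans and executes
    // the operation over the whole set at once.
    public static void postInterestSetOriented(Connection conn) throws Exception {
        Statement stmt = conn.createStatement();
        stmt.executeUpdate(
            "UPDATE accounts SET balance = balance * 1.05 WHERE type = 'SAVINGS'");
        stmt.close();
    }

    // Record-at-a-time: the cursor habit -- fetch each row, compute in the
    // client, write each row back. Same result, vastly more round trips.
    public static void postInterestRowAtATime(Connection conn) throws Exception {
        Statement query = conn.createStatement();
        ResultSet rs = query.executeQuery(
            "SELECT id, balance FROM accounts WHERE type = 'SAVINGS'");
        PreparedStatement update = conn.prepareStatement(
            "UPDATE accounts SET balance = ? WHERE id = ?");
        while (rs.next()) {
            update.setDouble(1, rs.getDouble("balance") * 1.05);
            update.setLong(2, rs.getLong("id"));
            update.executeUpdate();
        }
        update.close();
        rs.close();
        query.close();
    }
}
```

The row-at-a-time version isn't wrong, exactly; it's the same mental block in miniature: needing to touch each piece physically before trusting the logical operation.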

All of these islands based on "skill sets" pop up here and there. I've seen it happen with CORBA: you have a C++ island here, a Java island there, etc. Now we have a .NET island somewhere else, a MOM island that bridges to CORBA, and an EJB cache that talks CORBA and MOM, with a bridge on the MOM that talks XML. All of these little "skill set islands" get deployed out into the enterprise without any attention paid to the fundamental problems of such an approach: tremendous complexity in debugging, latencies between each component that limit scale, and reliability concerns, because each island can fail independently and has its own recoverability mechanisms. The most planning I've seen is a "hot standby" plus a DR site per island; usually it's just a cron job that restarts the failed process. This is staggering! Though perhaps this "distributed shanty town" actually is what enterprises really want. I guess time will tell, if the reliability problems bite them hard enough economically. Thus far, I'm skeptical. Look at the virus problems currently endemic on the web, especially with certain e-mail systems. What are people doing about them?

My feeling is that we're going to see service outages and tremendous scale problems in many enterprise systems. I hope I'm wrong. I hope that the reliability concerns I see are really just an overreaction to the changing economics of software development, and the "shanty town" approach is really just the emergence of a new "ecosystem".

Now Microsoft has adopted this model of the "interoperable island" as their way of touting Windows in the enterprise, which is a good thing in a way. In the past, you were locked in the Microsoft world and couldn't talk to anyone else. Today you're still locked in, but now they're happy to let you talk to others.

So... as Gerry Weinberg once said, once you solve problem #1, you're the one responsible for promoting problem #2. Problem #1 was eliminating platform and language religion from distributed interoperability. Problem #2 (in my mind, today) is pushing distributed computing onto systems that don't need to be distributed. Thus non-technical reasons, such as divergent skill sets and developers' inability to make the logical/physical mental split, became the rationale behind such designs.

We deserve it, I guess.

Maybe this is a good thing: the evolution of a "city plan"-based approach to IT architecture, in Gartner's terms. The manager in me thinks it's probably a good thing. The technologist in me is frightened, because the only way I've seen end-to-end systems problems solved is by someone who could see the whole picture -- forest and trees -- and fix the problems locally. That's a rare quality... and something I see lacking: true architectural guidance that doesn't wind up being the hated "design cops".

Posted by stu at August 31, 2003 08:56 AM