January 28, 2006

SOA's technical landmarks

I think there's a lot of curiosity about what has led people towards SOA as a preferred architectural style for distributed computing. Market and business factors, especially SOA's focus on IT governance, are likely the primary reasons, but in my opinion there are also big, solid technical reasons for the shift.

I think the technical reasoning is three-fold. Firstly, SOA recognizes the most applicable facets of object-orientation and re-uses them for the systems-wide case. Services are definitely not distributed objects, but they retain a few basic facets of the general object-oriented paradigm: the primacy of extensible message passing with all of its implications, and the importance of focusing on designing interactions between objects (instead of their internals) when trying to construct an evolvable, growable, and interoperable system. Alan Kay, Smalltalk's father, dropped this nugget of insight 8 years ago:

I'm sorry that I long ago coined the term "objects" for this topic because it gets many people to focus on the lesser idea.

The big idea is "messaging" -- that is what the kernal of Smalltalk/Squeak is all about (and it's something that was never quite completed in our Xerox PARC phase). The Japanese have a small word -- ma -- for "that which is in between" -- perhaps the nearest English equivalent is "interstitial".

The key in making great and growable systems is much more to design how its modules communicate rather than what their internal properties and behaviors should be. Think of the internet -- to live, it (a) has to allow many different kinds of ideas and realizations that are beyond any single standard and (b) to allow varying degrees of safe interoperability between these ideas.
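A minimal sketch of what "extensible message passing" can look like in practice. The QuoteService class, the message kinds, and the field names are all hypothetical, invented here for illustration; the point is only that the receiver is defined by the messages it understands, tolerates fields it has never seen, and answers unknown messages explicitly instead of breaking.

```python
from typing import Any, Callable, Dict

Message = Dict[str, Any]


class QuoteService:
    """Hypothetical service defined only by the messages it accepts."""

    def __init__(self) -> None:
        # Message kind -> handler. New kinds can be added without
        # changing existing senders.
        self._handlers: Dict[str, Callable[[Message], Message]] = {
            "quote-request": self._quote,
        }

    def receive(self, message: Message) -> Message:
        handler = self._handlers.get(message.get("kind", ""))
        if handler is None:
            # An unknown message is answered explicitly, not treated as a crash.
            return {"kind": "not-understood", "about": message.get("kind")}
        return handler(message)

    def _quote(self, message: Message) -> Message:
        # Read only the fields this version knows; extra fields are ignored,
        # so newer senders can add data without breaking this receiver.
        # The price is a placeholder value, not real data.
        return {"kind": "quote-reply", "symbol": message["symbol"], "price": 42.0}
```

A sender that adds a "currency" field to a "quote-request" still gets a normal reply, which is the kind of growability the quote is describing.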

The second reason SOA is so important is that it recognizes the lesson of a long-fought, hard-won (and still not decided) battle: distributed computing is fundamentally different from local computing. To me, the watershed paper in this debate, now a classic, is Sun Microsystems Laboratories' 1994 paper A Note on Distributed Computing. I recall the debates in 1996 on the (sadly defunct) dist-obj mailing list about the importance of this paper, and how it shattered a number of the (then prevalent) CORBA and DCOM assumptions. Its major point was that distributed system endpoints require explicit boundaries to deal with the fundamental differences in latency, reliability, availability, concurrency, and memory access when moving from local computing to distributed computing.
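To make "explicit boundaries" concrete, here is a small sketch, not taken from the paper. The call_remote helper and the RemoteResult type are hypothetical names; the assumption is simply that a remote operation is wrapped so the caller must supply a timeout and must look at a result that can say "this failed", instead of pretending the call is a local function.

```python
import concurrent.futures
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RemoteResult:
    """Outcome of a call that crossed a process or network boundary."""
    ok: bool
    value: Optional[str] = None
    error: Optional[str] = None


def call_remote(operation: Callable[[], str], timeout_s: float) -> RemoteResult:
    """Run `operation` as if it were remote: bounded wait, explicit failure."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(operation)
    try:
        return RemoteResult(ok=True, value=future.result(timeout=timeout_s))
    except concurrent.futures.TimeoutError:
        # Latency is part of the interface: the caller decided how long to wait.
        return RemoteResult(ok=False, error="timeout")
    except ConnectionError as exc:
        # Partial failure: the other side may be down while we are still up.
        return RemoteResult(ok=False, error=f"unreachable: {exc}")
    finally:
        # Do not block on a call that is still in flight.
        pool.shutdown(wait=False)
```

A local method call has no equivalent of the "timeout" or "unreachable" outcomes, which is why hiding the boundary behind an ordinary-looking method signature tends to go wrong.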

SOA doesn't have any explicit approaches to dealing with the above, other than recognizing that you have to. A service is the combination of an implementation, an interface, and a contract; the contract contains the "rules of engagement". A contract is a mapping of service implementations to standard, well-understood "policies" for interaction - the message exchange patterns, the availability, reliability, latency, and expected volume characteristics, and how these policies are realized through the service interface.
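One way to picture a contract is as plain data that sits next to the interface. The sketch below is an assumption about how that might look, not a standard: the Contract and Observation types, their field names, and the thresholds are all made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Contract:
    """Hypothetical 'rules of engagement' published alongside an interface."""
    exchange_pattern: str        # e.g. "request-response" or "one-way"
    min_availability: float      # fraction of time the service is reachable
    max_latency_ms: int          # slowest acceptable response
    max_requests_per_sec: int    # expected volume ceiling


@dataclass(frozen=True)
class Observation:
    """What was actually measured for a service over some period."""
    availability: float
    latency_ms: int
    requests_per_sec: int


def violations(contract: Contract, seen: Observation) -> List[str]:
    """Return the names of the policies that the observation breaks."""
    problems: List[str] = []
    if seen.availability < contract.min_availability:
        problems.append("availability")
    if seen.latency_ms > contract.max_latency_ms:
        problems.append("latency")
    if seen.requests_per_sec > contract.max_requests_per_sec:
        problems.append("volume")
    return problems
```

Even a check this crude is only useful once the policies are written down explicitly, whether a person or a program ends up enforcing them.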

Explicit contracts and policies, even if they aren't automated, are useful because they guide people to the correct usage of both legacy technology and newer technology. Progress towards automated policy enforcement will be slow, as we're still mired in the muck of yesterday: SOAP/WSDL's RPC heritage, MOM's proprietary transport and fixed-message-format heritage, and Java Remote Method Invocation (RMI), which in practice missed important aspects mentioned in the paper, such as dealing with concurrency and interoperability. And that's not to mention the myriad security, reliability, and availability standards and facilities out there.

Finally, SOA acknowledges the importance of shared data semantics for interoperability. A lot of the work in the data warehousing community is important here, for it was the first real-world attempt to integrate disparate systems under a common umbrella. Building practical enterprise canonical data models is absolutely necessary to ensure interoperability in SOA. The point is not to create a universal model for all audiences; the point is that groups of services that hope to interoperate must have an explicit mapping between their interface's representation and semantics and some other canonical representation and semantics. This may involve deterministic mappings, as would be the case with most transformation technologies, but it also may involve probabilistic mappings, as would be the case with search technologies or data cleansing/matching engines.
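A small sketch of the two kinds of mapping, under stated assumptions: the source field names (cust_nm, ctry), the canonical field names, and the 0.6 threshold are invented for illustration, and the standard library's difflib string similarity stands in for a real data cleansing or matching engine.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional, Tuple

# Deterministic mapping: one system's field names -> canonical field names.
# Both sides of this table are hypothetical.
FIELD_MAP = {"cust_nm": "customer_name", "ctry": "country_code"}


def to_canonical(record: Dict[str, str]) -> Dict[str, str]:
    """Rename known fields to their canonical names; drop unmapped fields."""
    return {FIELD_MAP[key]: value for key, value in record.items() if key in FIELD_MAP}


def best_match(
    name: str, known: List[str], threshold: float = 0.6
) -> Optional[Tuple[str, float]]:
    """Probabilistic mapping: return the most similar known name and its
    similarity score, or None if nothing scores at or above the threshold."""
    scored = [
        (candidate, SequenceMatcher(None, name.lower(), candidate.lower()).ratio())
        for candidate in known
    ]
    if not scored:
        return None
    candidate, score = max(scored, key=lambda pair: pair[1])
    return (candidate, score) if score >= threshold else None
```

The deterministic mapping always gives the same answer for the same input, while the probabilistic one returns a score that somebody has to decide how much to trust, which is why the two usually need different handling in a contract.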

Posted by stu at 10:30 AM

January 14, 2006

The open source hype cycle

John Mark Walker wrote an interesting article on O'Reilly's OnLamp site, entitled There is No Open Source Community. His argument, in a nutshell, is that many people view "open source" as an ideologically-led community, but really, it's not. The economies of scale in the software industry, made possible by the internet, are what are pushing open source forward. I responded to him on Slashdot, and am adapting that response here.

The first thing I'll note is that in a recent (mid-October 2005?) Gillmor Gang, I remember that Doc Searls made a very similar comment -- "there is no open source community". Sure, there are communities, but they're a loose federation at best. There's no driving agenda, no cabal guiding the efforts.

The second thing is that I generally agree with the article, though I think he takes the economic arguments a bit too far. Classical economics has a major bullshit quotient; it's a useful analytical tool but is usually over-applied. I do agree that OSS would not be where it is without the Internet, but that could be said of most things in the tech world, so it's somewhat of a banal point. Slightly more interesting, I think OSS wouldn't be where it is today without the influx of both public and private capital (VCs & public companies). Most full-time contributors on popular projects are on a corporate payroll, which is either funded through complementary products (hardware, consulting, support) or is just a capital sink until the company figures out how to make money with it.

I have my own view on the role of ideology in promoting open source. It's a strawman, but it seems to be the pattern I'm seeing.

There is no core group of ideologues that really matters anymore. Perens and ESR did good things to hype OSS in the late 1990's, but I don't think they're doing much now to increase its hype. Today, the hype cycle is fed by a large group of in-the-trenches developers who are ideologues because they don't get much personal value out of their jobs and are trying to attach themselves to a larger cause. They're frustrated with the proprietary software they're forced to use that just doesn't work the way they want it to (regardless of whether their way is actually better). This leads mostly to pro-OSS postings on blogs and websites, like Slashdot, TheServerSide.com, O'Reilly Network, or whatnot.

These posts, along with the developers' voice on projects, eventually influence thought leaders inside and outside their companies who are looking for the next trend to exploit. Joe Developer will promote the OSS-solution-du-jour for his project, and explain its wonders to his team leads and the public, mostly based on cool-factor and some anecdotal statements about its productivity. Examples abound, such as Ruby on Rails, or MySQL + PHP, or the plethora of Java frameworks.


Comment: I'm not challenging that these tools actually make life better at times, but I am concerned with two things. First, the influence is usually based purely on a narrow "professional lens" -- I'm a developer, I only care about developer values, and I choose tools that make me feel more productive or cool, regardless of consequences outside my area of expertise. Business factors (which often are also architectural factors) are rarely considered. In this, I agree with Mr. Walker. Secondly, there is so much chaos and splintering in the market due to OSS development that quality is suffering. People are going "meta" and developing more and more tools for themselves instead of using old, proven tools that have lost the cool-factor, or might be proprietary.


To continue the story, these in-the-trenches IT or ISV developers influence their team leads, who, in smaller companies with less bureaucratic oversight on licensing / legal concerns, influence their directors, and open source gets used on a project. Successes are bound to occur, especially if the requirements are modest, the performance demands light, and the availability requirements loose. Pundits and bloggers pick up on these modest successes and run with them, claiming that all infrastructure software -- operating systems, databases, application servers -- will inevitably be open source.

Comment: My point is not that OSS can't deliver complex, highly available, high-performance software; it's that such high-profile successes certainly require more research, planning, and investment. As an example, look at ZDNet's blogs some time -- or the Gillmor Gang podcast. They get paid to be provocative, no question, but they've been on a path for over a year now suggesting that all software will become a service, and behind the scenes it will be all open source. They're looking at Google as an example of this, brushing over the tremendous braintrust required to design, build, and maintain that infrastructure. To paraphrase Jamie Zawinski, open source is free only if your time has no value.


Anyhow, executives and investors read these articles and blogs, and start questioning what's going to happen to Oracle, SAP, Microsoft. And they may invest in open source startups as a hedge. And some of those in-the-trenches developers may actually quit and go work for an OSS startup, further feeding the hype cycle.

That's my strawman of how ideology affects the software market: it creates a perception of strength that isn't actually there, yet such dissonance is a needed starting seed of all new business models and markets, so I can't really fault it. But there will be a backlash. Open source that makes business sense will thrive; that which doesn't will remain a niche. I don't foresee a complete overthrow of the proprietary software market... I tend to agree with the blended open source approach of BEA (my employer). But beyond us, Oracle in particular is so damn huge now, and they've made a huge bet that companies will turn to large single-source providers of software infrastructure and applications. I can't think they're completely wrong, even if I don't entirely agree with that model.

Posted by stu at 12:30 PM