Thursday 18 September 2008

Building Cathedrals from Bazaars

I wrote this as part of my work for Cloudsmith. It is a follow-up to my earlier posting of a few weeks ago.

In summary, Cloudsmith lets you browse and find useful bundles of software components which work together – software playlists – and then download ones of interest. Each one can contain components from different software repositories, and Cloudsmith knows where to go, and how to get to them.


Eric S. Raymond wrote a seminal paper in 1997, The Cathedral and the Bazaar, contrasting how Linux emerged from a loosely structured, highly collaborative community or "bazaar" with the traditional approach to developing software (open source or proprietary), in which a select group of cathedral-builders controlled every aspect of design and technology.

Most engineers strive to build at least one great "building" during their career: a monument, a shrine, a testament to their skill. Today, even "cathedrals" are made from parts found at the bazaars: a huge and growing marketplace for open source components, in which thousands of developers promote parts that many other developers combine into new products. Many bazaars (projects and communities such as the Eclipse Foundation, the Apache Foundation, Google Code and SourceForge) support and publish the efforts of component development teams. Popular components turn up in multiple bazaars, sometimes as identical copies, other times with subtle variations.

Among the challenges that development teams, and their colleagues in product management and product marketing, face when operating within this new ecosystem are:

* What range of components is currently available? Which bazaars have them; what is their status and quality; how popular are they; where can updates and fixes be found; and so on.

* What works with what? What components, and combinations of components, are available? How do the pieces all fit together, and which bazaars have them?

* How popular is this combination of components compared to that alternative one? How do we know when and if we should update a selection of components, as new versions of the constituent parts emerge?

* How can we build playlists which combine components we built ourselves with components found in public bazaars, which change in ways we don't control? How can we move to the new version of a public component without breaking what we already have? And how can we keep what we found in the bazaar from getting so intertwined with what we built that we can no longer separate them? What is the best strategy to manage change, when your organisation and your team are increasingly mixing public software components with your proprietary assets?

* Who is going to support us when we use some unique combination which we assembled from public bazaars? Is there anyone out there doing something similar we can learn from?

* We fix and extend components we find in the bazaar, and sometimes create entirely new component playlists of our own. How do we share our work with other developers in our organisation or (assuming our corporate policy allows it) contribute things back to the bazaar for the public good? And assuming we've shared it, how do we know who is using it, and for what?

It is, of course, no longer just an issue of providing a stable, managed foundation on which you and your colleagues can build. Because of the multiplicity of software licensing policies, corporate awareness has heightened, reaching all the way to the audit committees of publicly quoted companies. Knowing if, when and how public software assets are being used inside a corporation has become a major concern.

The ability to tailor software should be its value rather than its risk. But in today's world, isn't software componentisation paradoxically slower than it could be, precisely because of the changes, improvements and proliferation offered by the community?

Eric Raymond describes how extremely useful software can result from open collaboration, despite the absence of a clear lead architect directing the project. Today's software repositories illustrate this principle on a grand scale - they are collections of really good and useful components developed, published, maintained and extended, sometimes by individuals and sometimes by organized teams of collaborators, in a process that can seem almost anarchic compared to conventional internal development.

As bazaars of developed and contributed software components have matured, the complexity of fitting together appropriate combinations has increased, as has the difficulty of ensuring that things do not break as each component is maintained.

One example is Eclipse, which is a common integration platform for many components. The recent Ganymede release lists nine application frameworks, six toolsets for embedded and device development, six toolsets for enterprise development, five language IDEs, and five aspects of its rich client platform. All of these elements can, in principle, be used in any combination of choice, although seven different official Ganymede packages are listed. Forty-five additional project downloads are also listed, and nine different distributions from member organisations are promoted. It shows an impressive level of community momentum and collective activity, but which of all of the alternatives do you really need for your particular project?

Actually, it is even more complex, because each bazaar stacks up components from its own shelves with components it finds in other bazaars. And you are often building not just one cathedral, but several based on a common set of blueprints. Perhaps you want to develop using the Seam rich client Java toolkit? Then you might need a playlist of the Eclipse Classic IDE, JBoss Tools, Seam Core, JBoss AS, and PostgreSQL (with thanks to Stefan Daume for suggesting this particular playlist). But to do so, you may need to visit the Eclipse, JBoss, Seam and Postgres bazaars to put this all together -- unless you happen to find somebody else who has already done this for you. If you want to build an email spam filter, then maybe a playlist of MySQL, qpsmtpd, my qpsmtpd custom modules, PHP pages (status), and Open Flash Chart run-time files might be just the job (with thanks to Bjorn Freeman-Benson for this playlist).
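To make the idea concrete, here is a minimal sketch of a playlist as a data structure, using the Seam example above. It only illustrates the concept and is not how Cloudsmith represents playlists; the `Component` class, the `bazaars_to_visit` function, and the bazaar assigned to each component are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One entry in a playlist (hypothetical structure)."""
    name: str
    bazaar: str  # assumed home repository, for illustration only


# The Seam playlist named in the text. Which bazaar hosts which
# component is an assumption, not something the post states.
SEAM_PLAYLIST = [
    Component("Eclipse Classic IDE", "Eclipse"),
    Component("JBoss Tools", "JBoss"),
    Component("Seam Core", "Seam"),
    Component("JBoss AS", "JBoss"),
    Component("PostgreSQL", "Postgres"),
]


def bazaars_to_visit(playlist):
    """Group component names by the bazaar assumed to host them."""
    by_bazaar = defaultdict(list)
    for component in playlist:
        by_bazaar[component.bazaar].append(component.name)
    return dict(by_bazaar)


if __name__ == "__main__":
    for bazaar, names in bazaars_to_visit(SEAM_PLAYLIST).items():
        print(f"{bazaar}: {', '.join(names)}")
```

Even for five components, the grouping shows four separate bazaars to visit, which is the assembly effort a shared playlist would save.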

Finding out what software components are available is a modest challenge: you can use raw Google, or Google CodeSearch, or Koders, or Krugle, or Codase, or something similar. The more significant challenge is finding out what works with what else to form a useful playlist; how to get hold of the right version of each of these pieces from each of the bazaars concerned; how popular this specific playlist of components is; and how to get notified if any of the pieces are subsequently changed. If you want to be civic-minded, you might also want to find out how best to contribute original or derivative works back to the rest of your organisation or the community at large.

Our industry is maturing: we should soon reach the same level of professional practice as our colleagues in other engineering disciplines, such as electronics hardware and civil engineering. There is now - perhaps at long last - a substantial number of re-usable, well-engineered components available to all of us, being extended and improved on a daily basis. We should all be able to build cathedrals, and other artifacts, from the components we find. But the vast range of components, coupled with the fluidity of the material - software - with which we work, has presented our industry with some new challenges which are not as apparent in other engineering disciplines.

1 comment:

Charlie Federman said...

Hey Chris,

Great, and insightful post. I agree with many of the points and linked to it at