Filed under Complexity, Emergent Design.

My particular area of interest in software these days is the importance of levels of abstraction above the raw code. In Java, the most natural place for these to manifest themselves is through the package structure (though this is certainly not the only possibility).

Recently I used Structure101 to do some analysis on the evolution of the findbugs code-base, and was rather stunned at what I saw. Here is the root dependency graph for the first public release (0.7.2) back in March 04.

This diagram shows us the top-level packages in the code-base and the interactions between them (the numbers beside the arrows denote the number of code-level references).
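To make the arrow weights concrete, here is a minimal sketch of how class-level references could be rolled up into package-level edges. It is not how Structure101 does it; the Ref type, the class names and the "first name segment is the top-level package" rule are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class PackageGraph {
    /** One code-level reference between two classes (hypothetical input type). */
    static final class Ref {
        final String fromClass;
        final String toClass;

        Ref(String fromClass, String toClass) {
            this.fromClass = fromClass;
            this.toClass = toClass;
        }
    }

    /** Assumption: the first segment of the class name is its top-level package. */
    static String topLevelPackage(String className) {
        int dot = className.indexOf('.');
        return dot < 0 ? "(default)" : className.substring(0, dot);
    }

    /** Counts references per "from -> to" package pair, ignoring references inside one package. */
    static Map<String, Integer> edgeWeights(List<Ref> refs) {
        Map<String, Integer> weights = new TreeMap<>();
        for (Ref r : refs) {
            String from = topLevelPackage(r.fromClass);
            String to = topLevelPackage(r.toClass);
            if (from.equals(to)) {
                continue; // internal references are not drawn as arrows
            }
            weights.merge(from + " -> " + to, 1, Integer::sum);
        }
        return weights;
    }
}
```

Feeding in two references from a findbugs class to ba classes would give a single arrow "findbugs -> ba" with weight 2, which is the kind of number shown beside each arrow.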

With just a little knowledge about what findbugs does (it is a static analysis tool that scans bytecode for potential bugs), it is easy to reason about how this code-base is architected:

  • graph is a re-usable (baggage-less) data structure to model the control flow within a method body
  • visitclass wraps a bytecode parser with a visitor pattern to shield the parser implementation from the interpretation of the parser callbacks
  • ba (bug analyzer?) is the bit where specific rules (policies, strategies, …) are implemented
  • findbugs is the controller that drives the interactions between the other components

The image below again shows the top-level breakout, but this time several releases later in Oct 04 (0.8.6).

Although the code-base has grown significantly, it is still absolutely possible to reason about the architecture and where the new pockets of code (annotations, xml, config, etc.) fit into the “big picture”. The only apparent blemish is that io is now disconnected (perhaps dead code).

The first significant imperfection creeps in in April 05 (0.8.7).

The relationship between config and findbugs has become blurred. Since both packages are dependent on each other, it is no longer clear that either of these in isolation represents a meaningful or useful abstraction, and it may make more sense to think about the relevant “component” as being the union of the two.
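Spotting this kind of 2-way dependency is mechanical once the package graph is available. A minimal sketch, assuming the graph is given as a map from each package to the set of packages it references (the map contents in the usage note are illustrative, not taken from the real release):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

class MutualDependencies {
    /** Returns every pair of packages that depend on each other, as "a <-> b" with a before b. */
    static List<String> findPairs(Map<String, Set<String>> deps) {
        List<String> pairs = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : deps.entrySet()) {
            String a = entry.getKey();
            for (String b : entry.getValue()) {
                // Only look at the pair from its alphabetically first member,
                // so each mutual dependency is reported once.
                boolean reverseExists = deps.getOrDefault(b, Collections.emptySet()).contains(a);
                if (a.compareTo(b) < 0 && reverseExists) {
                    pairs.add(a + " <-> " + b);
                }
            }
        }
        Collections.sort(pairs);
        return pairs;
    }
}
```

With findbugs referencing config and ba, and config referencing findbugs, the only pair reported is "config <-> findbugs".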

Skip forward just a month…

… and the confusion has spread (0.8.8). The filter package has been added but it too has a 2-way dependency with the findbugs package, so it seems reasonable to say that the whole world of controller and config (incl. filtering) is in essence a blob where the individual packages do not really contribute anything in isolation.

There is also a rogue dependency here from ba to findbugs. This is clearly contrary to the original architectural intent. The weight is just 2, so this would have been very easy to reverse out had it been spotted.
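One way such a dependency could be spotted early is an automated rule check in the build. The sketch below is an assumption about how that might look, not something the findbugs team had in place: edges are keyed as "from -> to" strings with their reference counts, and the forbidden set is written by hand.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class DependencyRules {
    /** Returns each forbidden edge that is actually present, together with its weight. */
    static Map<String, Integer> violations(Map<String, Integer> edgeWeights, Set<String> forbidden) {
        Map<String, Integer> found = new TreeMap<>();
        for (String edge : forbidden) {
            Integer weight = edgeWeights.get(edge);
            if (weight != null && weight > 0) {
                found.put(edge, weight);
            }
        }
        return found;
    }
}
```

Declaring "ba -> findbugs" as forbidden and running this against the 0.8.8 graph would report that edge with weight 2, at a point where reversing it out is still cheap.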

If we fast-forward a year (1.0.0), however, we see that this rogue dependency has become entrenched (the weight has increased from 2 to 99).

Moreover, more and more packages are being pulled into the tangle such that it is hard or impossible to talk about these as meaningful entities in their own right. For example, what is the point of a util package if it contains code that depends on the findbugs package?

Nevertheless, we can still see evidence of meaningful architectural decisions. For example, the bcel and asm packages are presumably wrappers for the BCEL and ASM bytecode engineering libraries that, together with the classfile package, enable an element of plug&play in terms of which library actually gets used for the analysis.

However, moving on to Nov 07 (1.3.0)…

… we see that these too have been sucked into the tangle. From now on, it seems, all testing, deployment etc. will need to include both.

And here is the most recent snapshot from September 08 (1.3.5):

This diagram doesn’t help any more – nearly all the higher-level abstractions appear to have eroded away. Moreover, a peek under the hood reveals that there is a large code-level tangle involving 43% of all the classes and spanning 33 packages – this implies that the interdependency has become deeply entrenched in the code. Shame…
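For readers wondering what a “tangle” means in graph terms, a reasonable reading is a strongly connected component with more than one member: a set of items that can all reach each other through dependencies. The sketch below uses Tarjan's algorithm to find them. That choice is mine for illustration (I am not claiming it is what Structure101 runs), and the graph in the usage note is made up.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class Tangles {
    private final Map<String, Set<String>> graph;
    private final Map<String, Integer> index = new HashMap<>();
    private final Map<String, Integer> lowLink = new HashMap<>();
    private final Deque<String> stack = new ArrayDeque<>();
    private final Set<String> onStack = new HashSet<>();
    private final List<Set<String>> tangles = new ArrayList<>();
    private int counter = 0;

    private Tangles(Map<String, Set<String>> graph) {
        this.graph = graph;
    }

    /** Returns every strongly connected component that has more than one node. */
    static List<Set<String>> find(Map<String, Set<String>> graph) {
        Tangles t = new Tangles(graph);
        // Include nodes that only ever appear as dependency targets.
        Set<String> nodes = new TreeSet<>(graph.keySet());
        for (Set<String> targets : graph.values()) {
            nodes.addAll(targets);
        }
        for (String node : nodes) {
            if (!t.index.containsKey(node)) {
                t.visit(node);
            }
        }
        return t.tangles;
    }

    // Recursive Tarjan step; fine for a sketch, very deep graphs would need an iterative version.
    private void visit(String v) {
        index.put(v, counter);
        lowLink.put(v, counter);
        counter++;
        stack.push(v);
        onStack.add(v);
        for (String w : graph.getOrDefault(v, Collections.emptySet())) {
            if (!index.containsKey(w)) {
                visit(w);
                lowLink.put(v, Math.min(lowLink.get(v), lowLink.get(w)));
            } else if (onStack.contains(w)) {
                lowLink.put(v, Math.min(lowLink.get(v), index.get(w)));
            }
        }
        if (lowLink.get(v).equals(index.get(v))) {
            // v is the root of a component: pop everything above it on the stack.
            Set<String> component = new TreeSet<>();
            String w;
            do {
                w = stack.pop();
                onStack.remove(w);
                component.add(w);
            } while (!w.equals(v));
            if (component.size() > 1) {
                tangles.add(component);
            }
        }
    }
}
```

If findbugs depends on config, filter and ba, while config and filter both depend back on findbugs, this returns one tangle containing config, filter and findbugs; ba stays outside it because nothing it uses leads back. Dividing the size of the tangles by the total number of nodes gives the kind of percentage quoted above.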

For a quick view of the full history, I did up a little animated gif showing the “progression” through all 27 releases. If you are interested in something meatier, see the “Structure101 in a Nutshell Part 1” presentation on the Headway Take a Tour page.

15 Responses to “Software erosion in pictures – Findbugs”

  1. Zack

    Perhaps the start of a source visualization findbugs-style project? 🙂

    Good stuff, what did you use to create the graph images?

  2. Emeric

    I agree that cyclic dependencies are a bad smell, except perhaps when there is a large dependency and a small backward dependency.

    But what are your arguments for saying that 2 or 3 interdependent packages are a “blob”, which you imply is bad?
    Don’t these packages deserve a name of their own and to be understood in isolation, even if they have cyclic dependencies?
    I think they initially deserve the separation, even with cyclic dependencies.

    May I ask your opinion on the possibility of refactoring findbugs, in the future, with Java Modules and friend packages?

  3. Ian Sutton

    @Rahul
    That is very likely overstating things (though smiley duly noted). The impact of this kind of erosion will be predominantly on the internal development – I would speculate that their rate of change has degraded over time and the level of coupling at the class level leads to more side effects (though tests can counteract that in many cases). But this is not the same as saying that they should tear it up and start again.

    There is a paper on this subject in IEEE Software, looking at a couple of other open source Java code-bases. Paraphrasing liberally, the thrust here is that there were recurring phases (called “structural epochs”) in the code-base evolution where development of new features largely ground to a halt while the team was striving to rescue the architecture (combat erosion). In the cases observed, however, these efforts were only moderately successful, and were generally more about moving excessive complexity around rather than actually reducing it…

  4. Otavio Ferreira

    Hello Ian. I really like your post, it’s very well set out.

    I’ve been studying dependencies among modules for a while, and I must say that I really understand the problem you’re presenting here, as well as your suggestions that would make the architecture depicted above better. BTW, Structure101 seems to be a very interesting piece of software indeed, good stuff!

    Ian, you might be interested in reading two other posts I wrote for the MIH SWAT blog, namely Object Orientation’s Worst Enemy and Dependency Inversion: Killing Gorillas and Butterflies, as they talk about the same subject. Any thoughts?

    I kept our conversation going on Good software modularity. What exactly is it?.

    Let’s keep in touch. Cheers!

  5. Raoul Duke

    very awesome post.

    …up until watching the animated GIF. the cube rotation thing does not add didacticism, it pretty much destroys it. please, please, please, meditate deeply upon your sin and post one that just fades between stages, or something like that. 😉

  6. Ian Sutton

    @Raoul Duke
    Thanks for the compliments and the beautifully couched criticism 🙂
    Between you and me, I’m crud at graphics (perhaps that’s why I like tools that draw things for me). But I had another go and perhaps this one is better.
    Hardest bit is the juggling between trying to have something which is quick and snappy (< 1 min) but at the same time useful – animated gif may have been a bad call there. So I also did yet another version, but this time as swf (with a pause button).

  7. Raoul Duke

    @revisions

    cool. at least one important difference between you and me is that you actually *do* things; thanks for being somebody who really does stuff!

    (some day down the road in your future copious free time i wonder if you could animate the graphs morphing along, so there’s no fade even? i’d hazard to guess that there is some code lying around somewhere on the web to let one do something like that, although i certainly don’t know where.

    or maybe simply having them lined up vertically statically one after another in a long strip of images would let people scroll the document to easily compare? like, if 2 (or 3?) fit on a screen at a time? maybe at 3 then they’d have to be horizontal since most monitors are wider than they are tall.)

  8. David Hovemeyer

    You’re right – the architecture of FindBugs is not pretty. I’ve occasionally thought about doing a reimplementation, but it would be a huge amount of work.


Trackbacks/Pingbacks

  1.  Sean Coughlan - thecentric » Blog Archive » Software erosion in pictures
  2.  The parable of the two watchmakers | Deconstructing Software
  3.  Software erosion and package tangles « sutts on software
