Saturday 15 October 2011

Handling Software Design Complexity

I decided to start my first real post with a subject I consider to be at the very core of software development, but which is rarely discussed.  To get started, I ask you this question:

What is the main problem facing software development?

Ask a programmer, a software project manager, a systems analyst, a system architect, an academic or almost anyone who knows anything about software development and you will get various answers along the lines of:
  •  insufficient analysis and poorly understood requirements
  •  poorly documented requirements
  •  poor communication, especially of requirements
  •  poor estimates
  •  unrealistic deadlines (often due to poor estimates)
  •  inadequate testing (often due to unrealistic deadlines)
  •  changing requirements
  •  unmaintainable software (often due to frequently changing requirements)
(Please post any more below, if you think I have missed something important.)

If you look at all these answers you can see that the root cause is simply complexity.  The complexity of the original problem makes it hard to understand and communicate to others, which has the knock-on effect of poor estimates, and so on.  Coping with change only magnifies the complexity.

The significant problem of unmaintainable software is, I believe, also due to complexity.  This can be, and often is, caused by large amounts of duplicate and even unnecessary code.  However, the complexity may just be due to the design of the software, or simply to the design not being well understood - after all, complexity is just a matter of understanding.

Coping with Complexity

So how do we cope with complexity?  In brief, and in general, we use a fundamental principle that has been used for thousands of years - divide and conquer.  An example of its importance is the division of labour, without which the rise of human civilisation would not even have been possible.

How can we use the divide and conquer principle in software development?  You have probably already guessed that the answer is we already do so, in a myriad of ways.  Thousands, if not millions, of people have been involved in creating all the things that go towards someone creating a piece of software - from the people who created the operating system, compiler, run-time libraries, etc, through to the hardware, all the way to your mum, who made the breakfast that keeps your neurons firing.

The principle of divide and conquer is also at the core of the actual design of software.  In fact just about everything you have read, or ever will read, about creating software is concerned with the best way to break down a problem into manageable sub-problems.  This goes by many names: information hiding, modules, black boxes, separation of concerns, encapsulation, decoupling, cohesion, modularity, abstraction.  It's also the primary concern of modular programming, orthogonality, abstract data types, component-based software engineering, and object-oriented programming.  In a way it is even the basis of the most important paradigm in the history of programming, that of structured programming.
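
To make one of these names concrete, here is a minimal sketch of information hiding (in Python, chosen purely for illustration - the class and method names are invented for this example).  The component exposes a simple interface; how it stores its data is a hidden detail that could change without affecting any caller:

```python
class Stack:
    """A small component with a simple, well-understood interface.

    Callers only see push, pop and len; the list used internally is an
    implementation detail that could be swapped for something else
    (e.g. a linked list) without breaking any calling code.
    """

    def __init__(self):
        self._items = []  # hidden representation; the underscore signals "private"

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def __len__(self):
        return len(self._items)


s = Stack()
s.push(1)
s.push(2)
top = s.pop()  # 2
```

The point is not the stack itself but the division: the sub-problem "how values are stored" is solved once, behind the interface, and everyone else only needs to understand the interface.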

Of course, there are other ways to deal with complexity, but divide and conquer is by far the most important.  Before I look at some others though, I need to back-track a bit to consider the nature of complexity.

Understanding

The human brain is very powerful but it is limited in that it can only process a small amount of information at a time.  Luckily, due to its ability to learn and memorize, it can accomplish many tasks, just not all at the same time.  (Notwithstanding individual differences, which may have a slight sexual bias, I think the male and female brain are essentially the same in this regard.)

My definition of the complexity of a problem is how hard it is to understand.  Given a complex problem we can often divide it into sub-problems, so we don't need to understand the whole problem at once.  That is, given simple enough sub-problems the brain need only understand each individually, as well as how they are combined to form the main problem.

So even if a problem is too complex for an individual to solve all at once, if it can be broken down into simple sub-problems, each of which can be solved one at a time, then the whole problem can eventually be solved.  In fact, different sub-problems may even be solved by different individuals.  Again, this is the principle of divide and conquer.
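
The classic algorithmic illustration of this is merge sort: sorting a whole list at once is hard, but the problem divides into two half-sized sub-problems that can be solved independently (even by different individuals, or processors), and the sub-solutions are then cheap to combine.  A minimal sketch in Python:

```python
def merge_sort(xs):
    # Base case: a list of 0 or 1 items is already sorted.
    if len(xs) <= 1:
        return xs

    # Divide: split the problem into two half-sized sub-problems.
    mid = len(xs) // 2
    left = merge_sort(xs[:mid])    # conquer each half independently
    right = merge_sort(xs[mid:])

    # Combine: merge the two sorted halves into one sorted list.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```

No step here ever requires understanding the whole list at once - each call only needs to understand its own small piece and how the pieces recombine.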

I mentioned above that there are other ways to deal with complexity.  Though I am not an expert on this I can think of a few:

  •  simplification, or removing information superfluous to the problem
  •  analogy, or finding a similar problem that is easier to conceptualize
  •  symbols - since the brain is highly visually oriented, diagrams and notations may assist understanding
  •  commonality, or recognizing sub-problems that have the same solution

Conclusion

In brief, by far the biggest problem faced in creating software is complexity.  The main way we deal with complexity is the principle of divide and conquer.  And the main approach to creating software is (using divide and conquer) to create small, well-defined components that each perform one simple task and expose a simple, well-understood interface.

This is much easier said than done, since the best way to split a problem into sub-problems is not always obvious.  To some extent you have to obtain an understanding of the overall problem to be able to recognize the sub-problems and how to minimize the coupling between the different parts.  Moreover, some problems seem to be non-decomposable, though I believe that there are few if any problems that are inherently so.  Understanding of techniques and previous experience with analyzing similar problems is important for anyone designing software.

The study of the best way to create small, well-defined components (ie the best way to do the dividing) is the principal concern of software designers, from system architects down to lowly programmers.  It is the raison d'être for almost all the various programming techniques, concepts, processes, paradigms, etc that are continually being invented.

Friday 14 October 2011

Introduction

In 1993 the UTS (University of Technology, Sydney) started offering a part-time, post-graduate course in SQA (software quality assurance).  I had been working as a programmer for ten years but doing this course really opened my eyes to a whole lot of things I was not even aware of.

One of those things was software development methodologies, since one-sixth of the course was devoted to that very subject.  Of course, I had encountered some of the highly detailed waterfall-based methodologies in the past, but the reality was that nobody really used them and all the development I had done (in both small and large organizations) had been a little bit structured (waterfall) but mainly ad-hoc.

This was why I was very interested in the so-called iterative methodologies that were appearing at that time.  Being a team-leader for a small project at the time I actually tried very hard to use Barry Boehm's "spiral model", which I believe became the basis (or one of the bases) for IBM's Rational Unified Process (RUP).

Unfortunately, my experiment with methodologies turned out to be disastrous and the project I was working on was cancelled, for which I take full responsibility.  The problem was probably more my inexperience than anything, but at the time my feeling was that Fred Brooks might have been right when he said there is "no silver bullet".

The Blog

Anyway, a lot has happened since then (in particular, agile methodologies), so I have decided to create this blog on different ideas and experiences I have since had about software development.  (I already have a blog on the development of my freeware/shareware product HexEdit, but there was a lot of stuff that I could not squeeze into that context.)

So, I think this blog will mainly be about development methodologies.  It will probably deviate into other areas of software development like architecture and other aspects of software quality assurance.