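In his book A Philosophy of Software Design, Prof. John Ousterhout defines complexity as anything related to the structure of a software system that makes it hard to understand and modify. In practice, it is what a developer experiences when trying to achieve a goal in the code. Complexity manifests itself in three forms: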
- Change amplification: a small change to one part of the code requires large changes in the rest of the system
- Cognitive load: a developer needs to know a whole lot before they can contribute to the code base
- Unknown unknowns: it is not obvious which pieces of code need to be modified in order to complete a task
Prof. Ousterhout defines the complexity of a system as the sum of the complexity of its components, each weighted by the fraction of time developers spend working on that component. Thus, even if a system contains an extremely complex component, that component contributes little to the overall complexity of the system if it rarely needs to change.
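The weighted sum described above can be sketched in a few lines of Python. This is only an illustration: the component names, complexity scores, and time fractions below are invented, and the book does not prescribe any particular way of measuring them.

```python
# Hypothetical components: (complexity score, fraction of developer time).
# The names and numbers are invented for illustration only.
COMPONENTS = {
    "legacy_parser": (9.0, 0.02),   # very complex, almost never touched
    "request_router": (3.0, 0.58),  # simple, touched constantly
    "billing_rules": (5.0, 0.40),
}


def contributions(components):
    """Return each component's weighted contribution: complexity * time fraction."""
    return {name: c * t for name, (c, t) in components.items()}


def system_complexity(components):
    """Sum the weighted contributions of all components."""
    return sum(contributions(components).values())


if __name__ == "__main__":
    ranked = sorted(contributions(COMPONENTS).items(), key=lambda kv: kv[1])
    for name, value in ranked:
        print(f"{name}: {value:.2f}")
    print(f"total: {system_complexity(COMPONENTS):.2f}")
```

With these made-up numbers, the most complex component (legacy_parser) contributes the least to the total, because developers spend almost no time in it.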
Complexity is caused by dependencies and obscurity. A dependency exists when a piece of code cannot be understood or modified in isolation. Obscurity occurs when important information is not obvious.
To combat complexity, Prof. Ousterhout offers several useful guidelines:
- Strategy over tactics: complexity is incremental and accumulates over time as a result of tactical programming. While tactical programming yields short-term productivity, thinking strategically pays off relatively quickly and has a higher return on investment.
- Modular design: decomposing a system into a collection of relatively independent modules results in a simpler system. A clear interface tells developers everything they need to know in order to use a module, without having to understand its implementation.
- Deep modules result in simpler systems. The author argues that the interface of a module contributes to the complexity of the system, so this cost has to be traded off against the functionality the module provides. A deep module offers a lot of functionality behind a small interface.
- The UNIX I/O mechanism (open, read, write, lseek, close) is an example of a deep module: a simple interface that hides a large amount of functionality. In contrast, Java's I/O requires composing three separate classes to read a serialized object from a file (FileInputStream, BufferedInputStream, ObjectInputStream). The author argues that the most common case should be served by a simple interface, rather than catering too much to the uncommon cases.
- Avoiding information leakage between modules is another way to reduce complexity. As an example, consider two classes that both have knowledge of a particular file format. If the file format changes, both classes need to change. Reorganizing the modules so that such knowledge is localized in one place is a good way to avoid leakage.
- Adjacent layers that offer similar abstractions are another source of complexity. For a design element to provide a net gain against complexity, it must eliminate some complexity that would be present in its absence.
- When trading off the complexity of the interface against the complexity of the implementation, it is good practice to pull complexity downwards. That is, it is more important to have a simple interface than a simple implementation.
- Defining the semantics of functions so that there is little need for exceptions and special cases also reduces complexity. As an example, Java String's substring method throws an exception if the provided indexes are out of range. This forces the calling code to add checks and error handling, driving up overall complexity. In contrast, Python's list slicing returns an empty list when the slice bounds fall outside the list, which results in relatively simple calling code.
- Designing a system twice, writing comments that complement the code, choosing good names, and staying consistent are other useful tools to counter complexity.
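The point about exceptions and special cases in the guidelines above can be sketched in Python. The function strict_substring below is a hypothetical helper written for this illustration; it only imitates a bounds-checked substring and is not Java's actual implementation. The two preview functions show what the calling code looks like under each style.

```python
def strict_substring(text: str, begin: int, end: int) -> str:
    """Hypothetical bounds-checked substring: raises if the range is invalid."""
    if begin < 0 or end > len(text) or begin > end:
        raise IndexError(
            f"range [{begin}, {end}) is outside a string of length {len(text)}"
        )
    return text[begin:end]


def preview_strict(text: str, limit: int) -> str:
    """Caller of the strict API: must clamp the bound to avoid the exception."""
    end = min(max(limit, 0), len(text))
    return strict_substring(text, 0, end)


def preview_lenient(text: str, limit: int) -> str:
    """Caller of built-in slicing: out-of-range bounds are clamped for us."""
    return text[: max(limit, 0)]


if __name__ == "__main__":
    print(preview_strict("hello", 10))
    print(preview_lenient("hello", 10))
    print([1, 2, 3][5:10])  # a slice entirely past the end of the list
```

Note that this applies to slicing only: indexing a single element past the end of a Python list (for example items[5] on a three-element list) still raises IndexError.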
In conclusion, Prof. Ousterhout outlines the sources of complexity in software systems and provides several useful ways to identify and counter it. These guidelines may require more work at the beginning of a project, but they pay off relatively quickly and result in simpler, more maintainable, and more flexible systems.