Complexity, Coupling and Cohesion

Denys Poltorak
Published in ITNEXT
Mar 14, 2024


Any software system that we encounter is very likely to be too complex to comprehend all at once: the human mind cannot keep track of a large number of entities and their relations. It tends to simplify reality by building abstractions. As soon as we define the many shiny pieces of metal, glass and rubber as a ‘car’, we can talk about ‘highways’, ‘parking lots’ and ‘passengers’. We live in a world of abstractions which we create. In the same way, the software we write is built of services, processes, files, classes and procedures: modules that conceal the swarm of bits and pieces we are powerless against. Let’s reflect on that.

Concepts and Complexity

Any system comprises concepts — notions defined in terms of other concepts. For example, if you are implementing a phonebook, you deal with first and second names, numbers, sorting and search, which one must always keep in mind for any phonebook-related development task — just because requirements for the phonebook are described in terms of those concepts and their relations.

In the code, high-level concepts are embodied as services, modules or directories, while lower-level concepts map to classes, API methods or source files.

Concepts are important because it is their number (or the number of the corresponding classes and methods) that defines the complexity of a system, that is, the cognitive load its developers face. If programmers grasp in detail the behavior of the component they work on, they tend to become extremely productive and are often able to find simple solutions for seemingly complex tasks. Otherwise, development is slow and requires extensive testing because people are unsure of how their changes affect the system’s behavior.

Figure 1: Complexity correlates with the number of entities.

Modules, Encapsulation and Bounded Context

Let’s return to our example. As you implement the phonebook, you find out that sorting and search are way more complex than you originally thought. Once you prepare to enter the international market, you are in deep trouble. Some telephony providers send 7-digit numbers, others use 10 digits, and still others use 13 digits (with either “+” or “0” as the first character). German has “ß”, which is treated as equivalent to “ss”, while Japanese mixes several scripts in the same text. Once you start reading standards, implementing all the weird behavior and responding to user complaints, you feel that your phonebook implementation is drowning in the unrelated logic of foreign alphabets full of special cases. You need encapsulation.

Enter modules. A module wraps several concepts, effectively hiding them from external users, and exposes a simplified view of its contents. Introducing modules splits a complex system into several, usually less complex, parts.

Figure 2: Dividing a system into modules, bounded contexts highlighted.

Several points in the diagram are worth noting:

  • Modules create new concepts for their public APIs.
  • The API entry points add to the complexity of both the owner module and its clients.
  • The total number of concepts in the system has increased (from 18 to 22) but the highest complexity in the system has dropped (from 18 to 15).

Here we see how introducing modularity applies the divide and conquer approach to lessen the cognitive load of working on any part of a system at the cost of a small increase in the total amount of work to be done.
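The arithmetic behind these observations can be sketched in a few lines of Python. The module names and concept counts below are made up for illustration and do not reproduce the numbers in Figure 2. The counting rule is an assumption drawn from the bullet points above: an API concept counts once in the system total, but adds to the complexity of both its owner and each client.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    """A toy module: a count of private concepts plus named API concepts."""
    name: str
    internal: int                                  # concepts hidden inside
    api: set[str] = field(default_factory=set)     # concepts it exposes
    uses: set[str] = field(default_factory=set)    # foreign API concepts it calls


def complexity(module: Module) -> int:
    """Concepts a developer of this module has to keep in mind."""
    return module.internal + len(module.api) + len(module.uses)


def total_concepts(modules: list[Module]) -> int:
    """Every concept in the system, each API concept counted once."""
    return sum(m.internal + len(m.api) for m in modules)


# Hypothetical numbers, not taken from the figure.
monolith = [Module("phonebook", internal=10)]
split = [
    Module("engine", internal=6, uses={"compare_names"}),
    Module("language_support", internal=4, api={"compare_names"}),
]

print(total_concepts(monolith), max(map(complexity, monolith)))  # 10 10
print(total_concepts(split), max(map(complexity, split)))        # 11 7
```

In this toy system the split adds one concept overall (the new API entry point), yet the largest context anyone has to hold in mind shrinks from 10 concepts to 7.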

In our phonebook example, the peculiarities of locale-aware string comparison and alphabetical sorting of contact names (including case sensitivity) are better kept behind a simple string comparison interface. That relieves the programmer of the phonebook engine of the complexity of supporting foreign languages.
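A minimal sketch of such an interface might look as follows. The function names are hypothetical, and the comparison is a deliberate simplification: Unicode normalization plus case folding stands in for real locale-aware collation, which would normally come from a dedicated library.

```python
import unicodedata
from functools import cmp_to_key


# --- language support module (hypothetical) ---
def _collation_key(text: str) -> str:
    """Private detail: normalize Unicode forms, then fold case."""
    return unicodedata.normalize("NFKD", text).casefold()


def compare_names(left: str, right: str) -> int:
    """Public API: negative, zero or positive, in the style of strcmp."""
    a, b = _collation_key(left), _collation_key(right)
    return (a > b) - (a < b)


# --- phonebook engine ---
def sort_contacts(names: list[str]) -> list[str]:
    """The engine only knows that two names can be compared."""
    return sorted(names, key=cmp_to_key(compare_names))


print(compare_names("Strauß", "STRAUSS"))  # 0
print(sort_contacts(["bob", "Alice", "Weiß", "weiss", "Carol"]))
# ['Alice', 'bob', 'Carol', 'Weiß', 'weiss']
```

The engine never mentions “ß”, case folding or normalization. If the language support module later switches to a proper collation library, `sort_contacts` does not change.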

Modules represent bounded contexts [DDD]: areas of knowledge about a system that operate with distinct sets of terms. In the case of the phonebook, collation and case sensitivity do not matter for the phonebook engine, as they are defined only in the context of language support. On the other hand, matching a contact by number is not defined in the language support module, as that term exists only in the phonebook engine. It is the complexity of the current bounded context that a programmer struggles with.

Apart from dividing a problem into simpler subproblems, modules open the path to a few extra benefits:

  • Code reuse. A well-written module that implements something generic may be used in multiple projects.
  • Division of labor. Once a system is split into modules and each module is assigned a programmer, development is efficiently parallelized.
  • High-level concepts. Some cases allow for merging several concepts of the original problem into higher-level aggregates, further reducing the complexity:

Figure 3: Merged two API concepts of the green module.

For example, the original definition of a phonebook contained first name and second name. Once we separate the language support into a dedicated module, we may find out that locales differ in the way they represent contacts: some (USA) use ‘first name + second name’ while others (Japan) need ‘second name + first name’. To abstract that detail away, we should introduce a new concept of full name which joins the first and second names in a locale-specific way. Such a change actually simplifies some of the phonebook’s representation logic and code as it replaces two concepts with one.
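As a rough illustration, the merge could look like the sketch below. The locale codes, the `_FAMILY_NAME_FIRST` table and the `Contact` type are hypothetical, and real name formatting has many more special cases than a two-way switch.

```python
from dataclasses import dataclass

# Hypothetical table owned by the language support module.
_FAMILY_NAME_FIRST = {"ja_JP"}


def full_name(first: str, second: str, locale: str) -> str:
    """Join the two name parts in the order the locale expects."""
    if locale in _FAMILY_NAME_FIRST:
        return f"{second} {first}"
    return f"{first} {second}"


@dataclass(frozen=True)
class Contact:
    """The engine stores a single name concept instead of two."""
    name: str
    number: str


def make_contact(first: str, second: str, number: str, locale: str) -> Contact:
    return Contact(full_name(first, second, locale), number)


print(make_contact("John", "Smith", "555-0100", "en_US").name)   # John Smith
print(make_contact("Taro", "Yamada", "555-0101", "ja_JP").name)  # Yamada Taro
```

Everything downstream of `make_contact` (display, sorting, search) deals with one `name` field and never needs to know which part came first.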

Coupling and Cohesion

We need to learn a couple of new concepts in order to use modules efficiently:

Coupling is a measure of the number (density) of connections between modules relative to the modules’ sizes.

Cohesion is a measure of the number (density) of connections inside a module relative to the module’s size.
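To make the two definitions concrete, here is one possible way to put numbers on them. The normalization is an assumption made for this sketch (connections divided by the number of concepts involved), not a standard metric, and the concept graph is invented.

```python
# Hypothetical system: concepts grouped into modules, plus undirected
# connections between individual concepts.
modules = {
    "engine": {"contact", "number", "search", "storage"},
    "language": {"collation", "case_folding", "alphabet"},
}
connections = [
    ("contact", "number"), ("contact", "search"), ("contact", "storage"),
    ("number", "search"), ("search", "storage"),
    ("collation", "case_folding"), ("collation", "alphabet"),
    ("case_folding", "alphabet"),
    ("search", "collation"),  # the only link that crosses a module boundary
]


def cohesion(name: str) -> float:
    """Connections inside a module, relative to its size."""
    members = modules[name]
    inside = sum(1 for a, b in connections if a in members and b in members)
    return inside / len(members)


def coupling(first: str, second: str) -> float:
    """Connections between two modules, relative to their combined size."""
    m1, m2 = modules[first], modules[second]
    between = sum(
        1 for a, b in connections
        if (a in m1 and b in m2) or (a in m2 and b in m1)
    )
    return between / (len(m1) + len(m2))


print(cohesion("engine"))    # 1.25
print(cohesion("language"))  # 1.0
print(round(coupling("engine", "language"), 3))  # 0.143
```

Both modules have about one internal connection per concept, while only one connection in nine crosses the boundary between them.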

The rule of thumb is to aim for low coupling and high cohesion, meaning that each module should encapsulate a cluster of related (intensely interacting) concepts. This is how we have split the system in figures 2 and 3. Now let’s see what happens if we violate the rules:

Figure 4: The upper modules are tightly coupled.

Splitting a cohesive module (a cluster of concepts that interact with each other) yields two strongly coupled modules. We did get the division we wanted, except that each of the new modules is nearly as complex as the original one, meaning that we now face two hard tasks instead of one. Also, the system’s performance may be poor, as communication between modules is rarely optimal and we’ve got too much of it.

Figure 5: The lower module has low cohesion.

What happens if we put several clusters of concepts in the same module? Nothing too evil for small modules — the module gets higher complexity than each of its constituents, but lower than their sum. In practice, multiple unrelated functions are often gathered in a ‘utils’ or ‘tools’ file or directory to alleviate operational complexity.

Development and Operational Complexity

What we discussed above is structural or development complexity — the number of concepts and rules inside a bounded context. However, we also need to understand operations and components of the system as a whole, leading to operational or integration complexity:

  • Does this new requirement fit into an existing module or does it call for a dedicated one?
  • Which libraries with known security vulnerabilities do we use?
  • Is there any way to cut our cloud services cost?
  • 1% of requests time out. Would you please investigate that?
  • My team needs to implement this and that. Do we have something fit for reuse?
  • What the **** is that global variable about?
  • Do we really need this code in production?
  • I need to change the behavior of that shared component a little bit. Any objections?

When there are hundreds or thousands of modules deployed, nobody knows the answers. It is like needing to get something done under Linux: hundreds of tools are pre-installed and thousands more are available as packages, but the only real way forward is to google for your needs first and then try two or three recipes from the search results to see which one fits your setup. Unfortunately, Google does not index your company’s code.

Composition of Modules

A module may encapsulate not only individual concepts, but also other modules. That is not surprising, as an OOP class is also a kind of module: it has public methods and private members. Hiding a module inside another one removes it from the global scope, decreasing the operational complexity of the system: now it is not the system’s architect but the maintainer of the outer module who must keep the inner module in mind. On one hand, that builds a manageable hierarchy in both the organization and the code. On the other hand, code reuse and many optimizations become nearly impossible, as internal modules are hardly known organization-wide:

Figure 6: Composition of modules prevents reuse.

If the functionality of our internal module is needed by our clients, we have two bad options to choose from:

Forwarding and Duplication

Figure 7: Forwarding the API of an internal module.

We can add the API of a module we encapsulate to our public API and forward its calls to the internal module. However, that increases the complexity and lowers the cohesion of our module — now each client of our module is also exposed to the details of the methods of the module we have encapsulated even if they are not interested in using it.
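In code, forwarding might look like the following sketch. `Collator` and `Phonebook` are hypothetical classes, and the comparison is reduced to case folding to keep the example short.

```python
class Collator:
    """Inner module (hypothetical): language-specific comparison."""

    def compare(self, left: str, right: str) -> int:
        a, b = left.casefold(), right.casefold()
        return (a > b) - (a < b)


class Phonebook:
    """Outer module that encapsulates a Collator."""

    def __init__(self) -> None:
        self._collator = Collator()  # hidden from clients
        self._names: list[str] = []

    # The module's own API.
    def add(self, name: str) -> None:
        self._names.append(name)

    def find(self, name: str) -> bool:
        return any(self._collator.compare(name, n) == 0 for n in self._names)

    # Forwarded API: present only because some clients need the inner module.
    def compare_names(self, left: str, right: str) -> int:
        return self._collator.compare(left, right)


book = Phonebook()
book.add("Weiß")
print(book.find("WEISS"))             # True
print(book.compare_names("a", "B"))   # -1
```

`compare_names` has nothing to do with storing or finding contacts, yet every client of `Phonebook` now sees it in the public interface.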

Figure 8: Duplicating an internal module.

Another bad option is to let the clients that need a module that we encapsulate duplicate it and own the copies as their own submodules. This relieves us of any shared responsibility and lets us modify and misuse our internals in any way we like, but it means that the same code now has to be maintained and fixed in several places.

Both approaches, namely keeping all the modules in the global scope and encapsulating utility modules through composition, found their place in history [FSA]. Service-Oriented Architecture was based on the idea of reuse but fell prey to the complexity of its Enterprise Service Bus which had to account for all the interactions (API methods) in the system. In reaction, the Microservices approach turned the tide in the opposite direction: its proponents disallowed sharing any resources or code between services to enforce their decoupling.

References

[DDD] Domain-Driven Design: Tackling Complexity in the Heart of Software. Eric Evans. Addison-Wesley (2003).

[FSA] Fundamentals of Software Architecture: An Engineering Approach. Mark Richards and Neal Ford. O’Reilly Media, Inc. (2020).

This is a chapter from a book I am writing. It is loosely based on A Philosophy of Software Design by John Ousterhout and my article. Any criticism is warmly welcome!


yet another unemployed burnt-out experienced embedded C++ technical lead from Ukraine