This is a living document, last updated in October 2024. Your contributions are welcome!
There are so many buzzwords and best practices out there, but let's focus on something more fundamental. What matters is the amount of confusion developers feel when going through the code.
Confusion costs time and money. Confusion is caused by high cognitive load. It's not some fancy abstract concept, but rather a fundamental human constraint.
Since we spend far more time reading and understanding code than writing it, we should constantly ask ourselves whether we are embedding excessive cognitive load into our code.
Cognitive load is how much a developer needs to think in order to complete a task.
When reading code, you put things like values of variables, control flow logic and call sequences into your head. The average person can hold roughly four such chunks in working memory. Once the cognitive load reaches this threshold, it becomes much harder to understand things.
Let's say we have been asked to make some fixes to a completely unfamiliar project. We were told that a really smart developer had contributed to it. Lots of cool architectures, fancy libraries and trendy technologies were used. In other words, the author had created a high cognitive load for us.
We should reduce the cognitive load in our projects as much as possible.
Intrinsic - caused by the inherent difficulty of a task. It can't be reduced, it's at the very heart of software development.
Extraneous - created by the way the information is presented. Caused by factors not directly relevant to the task, such as a smart author's quirks. Can be greatly reduced. We will focus on this type of cognitive load.
Let's jump straight to the concrete practical examples of extraneous cognitive load.
We will refer to the level of cognitive load as follows:
🧠: fresh working memory, zero cognitive load
🧠++: two facts in our working memory, cognitive load increased
🤯: working memory overflow, more than 4 facts
Our brain is much more complex and unexplored, but we can go with this simplistic model.
if val > someConstant // 🧠+
&& (condition2 || condition3) // 🧠+++, prev cond should be true, one of c2 or c3 has to be true
&& (condition4 && !condition5) { // 🤯, we are messed up by this point
...
}
Introduce intermediate variables with meaningful names:
isValid = val > someConstant
isAllowed = condition2 || condition3
isSecure = condition4 && !condition5
// 🧠, we don't need to remember the conditions, there are descriptive variables
if isValid && isAllowed && isSecure {
...
}
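The same idea as a small runnable sketch in Python. The `Request` fields and the `MAX_PAYLOAD` limit are hypothetical, made up only to give the three conditions something concrete to check:

```python
from dataclasses import dataclass

MAX_PAYLOAD = 1024  # hypothetical limit, plays the role of someConstant


@dataclass
class Request:
    # Every field here is an illustrative assumption, not a real API.
    payload_size: int
    is_admin: bool
    has_token: bool
    uses_tls: bool
    is_blocked: bool


def should_process(req: Request) -> bool:
    # Each named condition can be read once and then forgotten.
    is_valid = req.payload_size <= MAX_PAYLOAD
    is_allowed = req.is_admin or req.has_token
    is_secure = req.uses_tls and not req.is_blocked

    # The final check reads like a sentence instead of a puzzle.
    return is_valid and is_allowed and is_secure
```

The reader of the last line only needs the three names, not the five underlying facts.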
if isValid { // 🧠+, okay nested code applies to valid input only
if isSecure { // 🧠++, we do stuff for valid and secure input only
stuff // 🧠+++
}
}
Compare it with the early returns:
if !isValid
return
if !isSecure
return
// 🧠, we don't really care about earlier returns, if we are here then all good
stuff // 🧠+
We can focus on the happy path only, thus freeing our working memory from all sorts of preconditions.
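A minimal Python sketch of the early-return style. The `greeting_for` function and the shape of the `user` dictionary are assumptions chosen for illustration:

```python
from typing import Optional


def greeting_for(user: Optional[dict]) -> str:
    # Guard clauses: handle each precondition and leave immediately,
    # so nothing about it has to be remembered further down.
    if user is None:
        return "Hello, guest"
    if not user.get("active", False):
        return "Account disabled"

    # Happy path: if we got here, the user exists and is active.
    return f"Welcome back, {user['name']}"
```

For example, `greeting_for({"name": "Ann", "active": True})` takes neither guard and falls through to the happy path.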
We are asked to change a few things for our admin users: 🧠
AdminController extends UserController extends GuestController extends BaseController
Ohh, part of the functionality is in BaseController, let's have a look: 🧠+
Basic role mechanics got introduced in GuestController: 🧠++
Things got partially altered in UserController: 🧠+++
Finally we are here, AdminController, let's code stuff! 🧠++++
Oh, wait, there's SuperuserController which extends AdminController. By modifying AdminController we can break things in the inherited class, so let's dive in SuperuserController first: 🤯
Prefer composition over inheritance. We won't go into detail - there's plenty of material out there.
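As one possible illustration of what composition could look like here, the Python sketch below replaces the controller chain with a single controller that is handed a policy object. `RolePolicy`, `Controller` and the permission names are hypothetical and not taken from any real framework:

```python
from typing import Iterable


class RolePolicy:
    """Hypothetical helper that answers one question: may this role do X?"""

    def __init__(self, permissions: Iterable[str]) -> None:
        self._permissions = frozenset(permissions)

    def allows(self, action: str) -> bool:
        return action in self._permissions


class Controller:
    """A single controller; role differences live in the injected policy."""

    def __init__(self, policy: RolePolicy) -> None:
        self._policy = policy

    def handle(self, action: str) -> str:
        if not self._policy.allows(action):
            return "forbidden"
        return f"done: {action}"


# Roles become data passed in, not levels in a class hierarchy.
guest = Controller(RolePolicy({"read"}))
admin = Controller(RolePolicy({"read", "write", "delete"}))
```

To change what admins can do, we read one class and one set of permissions, rather than walking up and down a hierarchy.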
Method, class and module are interchangeable in this context
Mantras like "methods should be shorter than 15 lines of code" or "classes should be small" turned out to be somewhat wrong.
Deep module - simple interface, complex functionality
Shallow module - interface is complex relative to the small functionality it provides
Having too many shallow modules can make it difficult to understand the project. Not only do we have to keep in mind each module's responsibilities, but also all their interactions. To understand the purpose of a shallow module, we first need to look at the functionality of all the related modules. 🤯
Information hiding is paramount, and we don't hide as much complexity in shallow modules.
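To make the contrast concrete, here is a rough Python sketch. Both `JsonParserWrapper` and `load_settings` are invented for illustration; the only real API used is the standard `json` module:

```python
import json


# Shallow: the interface is about as big as the work it hides.
# The caller learns a new name and still handles every edge case.
class JsonParserWrapper:
    def parse(self, text: str) -> object:
        return json.loads(text)


# Deep: one small entry point hides parsing, malformed input,
# unexpected shapes, unknown keys and default values.
def load_settings(text: str, defaults: dict) -> dict:
    """Return settings for the keys in `defaults`, falling back to defaults."""
    try:
        raw = json.loads(text) if text.strip() else {}
    except json.JSONDecodeError:
        raw = {}
    if not isinstance(raw, dict):
        raw = {}
    return {key: raw.get(key, default) for key, default in defaults.items()}
```

A caller of `load_settings` does not need to know about any of the failure modes handled inside it.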
I have two pet projects, both of them roughly 5K lines of code. The first one has 80 shallow classes, whereas the second one has only 7 deep classes. I hadn't maintained either of these projects for a year and a half.
Once I came back, I realised that it was extremely difficult to untangle all the interactions between those 80 classes in the first project. I would have to rebuild an enormous amount of cognitive load before I could start coding. On the other hand, I was able to grasp the second project quickly, because it had only a few deep classes with a simple interface.
The best components are those that provide powerful functionality yet have simple interface.
John K. Ousterhout
The UNIX I/O interface is very simple. It has only five basic calls:
open(path, flags, permissions)
read(fd, buffer, count)
write(fd, buffer, count)
lseek(fd, offset, referencePosition)
close(fd)
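Python's `os` module exposes thin wrappers around these five calls, so the whole interface can be exercised in a few lines. The `roundtrip` helper and the file name `demo.bin` are illustrative choices:

```python
import os
import tempfile


def roundtrip(data: bytes) -> bytes:
    """Write data to a scratch file, rewind, and read it back."""
    tmp_dir = tempfile.mkdtemp()
    path = os.path.join(tmp_dir, "demo.bin")

    # open(path, flags, permissions)
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.write(fd, data)               # write(fd, buffer, count)
        os.lseek(fd, 0, os.SEEK_SET)     # lseek(fd, offset, referencePosition)
        return os.read(fd, len(data))    # read(fd, buffer, count)
    finally:
        os.close(fd)                     # close(fd)
        os.remove(path)
        os.rmdir(tmp_dir)
```

Nothing in this code has to know about file systems, caching or device drivers; all of that sits behind the five calls.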