1OPS - One Operation Per Statement
The 1OPS principle is the single principle that has had the biggest impact in my entire coding career. I gave some hints about it in a previous post, but it was suggested to me that it might warrant a post of its own to go more in depth.
Now, according to my Immutability Rules, I will update the previous one with a pointer to this one.
Principle
The basic idea is very simple :
Only One Operation per Statement
Rationale for the rule
The rationale is very simple and based on the fact that :
Naming things is what helps understanding.
A statement is the building block of code. It is a cohesive entity. It usually takes the form dst = statement(src_1, ... , src_N) [1], where statement() can be very complex. As soon as you combine multiple operations in that function, the compiler will create automatic anonymous temporary variables.
That automatic creation mechanism is what hinders quick understanding: when you read a statement, you need to keep all the automatic anonymous temporary variables in your mind while trying to understand the code. Therefore avoiding them is very helpful.
The 1OPS idea is the same idea that says functions shall do only one single thing and have a meaningful name.
In the same vein, anonymous functions are also discouraged, and I'm predicting some downturn in lambda functions. They do have their use, but right now they are still too hyped and we don't yet have enough feedback from painful maintenance. And that is the whole point of this article: unlocking very simple maintenance. I do not focus on the short-term gains that modern syntactic sugar gives me. Instead I focus on the long-term readability gains that this rule gives me.
The rule explained
In order of importance:
- A single operation per statement
- Explicit temporary variable names
- Scope change is an operation
- Unit/encoding change is an operation
Single operation per statement
Every statement consists of only one operation. It can have as many parameters as needed, but it needs to be a single operation: for example, an addition with 2 parameters, an AND logic operation with 3 parameters, or an OR bitwise operation with 5 parameters. The same operation can appear multiple times in one statement.
good_example = var1 + var2;
good_example = var1 && var2 && var3;
good_example = MASK_1 | MASK_2 | MASK_3 | MASK_4 | MASK_5;
bad_example = var1 + var2 - var3;
bad_example = MASK_1 | MASK_2 & MASK_5;
If you need to think to remember the operation order, it is too complex. A single operation type, by contrast, always executes from left to right [2].
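As a minimal sketch, the two bad examples above could be split so that each intermediate result gets its own named statement. The function names and the temporary names (sum, difference, common_bits, combined) are hypothetical, chosen only for illustration, and the masks are passed as parameters to keep the block self-contained.

```c
#include <stdint.h>

/* Hypothetical rewrite of `var1 + var2 - var3`:
 * one operation per statement, each result named. */
int add_then_subtract(int var1, int var2, int var3) {
    int sum = var1 + var2;
    int difference = sum - var3;
    return difference;
}

/* Hypothetical rewrite of `MASK_1 | MASK_2 & MASK_5`.
 * In C, & binds tighter than |, so the AND happens first.
 * Splitting the statement makes that order explicit. */
uint32_t combine_masks(uint32_t mask_1, uint32_t mask_2, uint32_t mask_5) {
    uint32_t common_bits = mask_2 & mask_5;
    uint32_t combined = mask_1 | common_bits;
    return combined;
}
```

The reader no longer has to recall operator precedence: the order of the statements is the order of evaluation.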
Explicit names for every operation outcome
Every operation result is assigned to a temporary variable that has an explicit and meaningful name. This makes it easy to follow what is done, as it reduces the content that a reader needs to keep in mind.
uint32_t bswap(uint32_t i) {
return (i >> 24) | ((i >> 8) & 0x0000FF00) | ((i << 8) & 0x00FF0000) | (i << 24);
}
uint32_t bswap_1ops(uint32_t i) {
uint8_t i3 = i >> 24;
uint8_t i2 = i >> 16;
uint8_t i1 = i >> 8;
uint8_t i0 = i >> 0;
uint32_t r0 = i3 << 0;
uint32_t r1 = i2 << 8;
uint32_t r2 = i1 << 16;
uint32_t r3 = (uint32_t)i0 << 24; // cast: i0 promotes to signed int, and shifting into its sign bit is undefined
uint32_t r = r0 | r1 | r2 | r3;
return r;
}
Both get compiled to the same binary code, but the second one is much easier for a newcomer to read. It also nicely conveys the types of the intermediate variables.
bswap:
mov eax, edi
bswap eax
ret
bswap_1ops:
mov eax, edi
bswap eax
ret
Scope is Meaningful
Assignments to temporary variables don’t count towards the 1OPS, but assignments towards ones with a different scope do. Ideally the variable names are different.
int player_x;
int player_y;
void update(int dx, int dy) {
int new_player_x = player_x + dx;
int new_player_y = player_y + dy;
int clamped_player_x = middle(MIN_X, new_player_x, MAX_X);
int clamped_player_y = middle(MIN_Y, new_player_y, MAX_Y);
player_x = clamped_player_x;
player_y = clamped_player_y;
}
Here player_x is the global variable. And the algorithm is very simple to understand:
- We compute the new value
- then clamp it
- and finally update the global variable.
This makes the changes at the end of the function almost atomic. Every temporary variable is locally scoped to the function. Scope changes happen in only two places: at the beginning when reading from the globals, and at the end when writing to them.
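The middle() helper is not defined in this post. A minimal sketch, assuming it returns its second argument clamped between the first and the third (and assuming the lower bound is not greater than the upper bound), could look like the following. The min_of() and max_of() helpers are hypothetical names introduced for this example.

```c
/* Hypothetical helper: the smaller of two values. */
int min_of(int a, int b) {
    if (a < b) {
        return a;
    }
    return b;
}

/* Hypothetical helper: the larger of two values. */
int max_of(int a, int b) {
    if (a > b) {
        return a;
    }
    return b;
}

/* Assumed behaviour of middle(): clamp `value` to [low, high].
 * Assumes low <= high. Each step is a single named operation. */
int middle(int low, int value, int high) {
    int at_least_low = max_of(low, value);
    int clamped = min_of(at_least_low, high);
    return clamped;
}
```

With these definitions, middle(MIN_X, new_player_x, MAX_X) returns new_player_x unchanged when it is already within bounds, and the nearest bound otherwise.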
A function can then evolve from pure function that has no side effects to a mutable function that changes global state variables. Note that changing a member of a struct is also a scope change such as:
struct player {
int x, y;
};
void update(struct player *p, int dx, int dy) {
p->x += dx;
p->y += dy;
}
void update_with_clamping(struct player *p, int dx, int dy) {
int new_player_x = p->x + dx;
int new_player_y = p->y + dy;
int clamped_player_x = middle(MIN_X, new_player_x, MAX_X);
int clamped_player_y = middle(MIN_Y, new_player_y, MAX_Y);
p->x = clamped_player_x;
p->y = clamped_player_y;
}
Here update() doesn't need to have all those temporaries, as it is very simple. Whereas update_with_clamping() does have some logic and therefore needs named temporaries.
Going one step further, avoid using pointers for simple structures and pass them by value instead, which makes them immutable in terms of the API. It works because the compiler optimizes the extra memory copies out when it makes sense (usually upon inlining).
struct player {
int x, y;
};
struct player update_with_clamping(struct player p, int dx, int dy) {
int new_player_x = p.x + dx;
int new_player_y = p.y + dy;
int clamped_player_x = middle(MIN_X, new_player_x, MAX_X);
int clamped_player_y = middle(MIN_Y, new_player_y, MAX_Y);
struct player new_player = p;
new_player.x = clamped_player_x;
new_player.y = clamped_player_y;
return new_player;
}
int main(void) {
struct player p;
// ...
p = update_with_clamping(p, dx, dy);
// ...
}
The trick is to initialize the whole struct from the previous version with struct player new_player = p; so that fields added later are handled everywhere.
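To illustrate why the full copy matters, here is a sketch in which a hypothetical health field has been added to the struct. The struct name player_v2, the health field and the move_right() function are all assumptions made for this example. The function body never mentions the new field, yet the copy carries it over.

```c
/* Hypothetical later version of the struct with one extra field. */
struct player_v2 {
    int x, y;
    int health; /* hypothetical field added after the function was written */
};

/* Hypothetical function written before `health` existed.
 * It only knows about x, but the whole-struct copy keeps
 * every other field intact, including ones added later. */
struct player_v2 move_right(struct player_v2 p, int dx) {
    int new_x = p.x + dx;
    struct player_v2 new_player = p; /* copies x, y and health */
    new_player.x = new_x;
    return new_player;
}
```

Had the function built new_player field by field instead, every such function would need an update each time the struct grows.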
The compiler is good at optimizing, but is often limited by compilation units. So this trick doesn't work well if multiple .c files are used, unless -flto is leveraged.
Units & Encoding are Meaningful
For every quantity that has a unit, always suffix with that unit. For every variable that has an encoding, always suffix with that encoding.
The following code now looks obviously wrong:
double delay_ms = 1000;
double frequency_hz = 1 / delay_ms;
It should be fixed as:
double delay_ms = 1000;
double frequency_hz = 1000 / delay_ms;
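Since a unit change counts as an operation, one possible way to write the same computation is to give the conversion its own statement. This is only a sketch: the MS_PER_S constant and the function name are hypothetical, introduced for illustration.

```c
/* Hypothetical conversion constant: milliseconds per second. */
static const double MS_PER_S = 1000.0;

/* Hypothetical helper: the unit change (ms -> s) and the
 * inversion (s -> Hz) are two separate, named operations. */
double frequency_hz_from_delay_ms(double delay_ms) {
    double delay_s = delay_ms / MS_PER_S;
    double frequency_hz = 1.0 / delay_s;
    return frequency_hz;
}
```

Each suffix now matches the unit of the value it holds, so a mismatch such as assigning a value in ms to a name ending in _s stands out on reading.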
It also works with encoding.
my $value = 1000;
my $value_hex = sprintf "%x", $value;
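The same naming convention carries over to C. As a rough sketch, the following mirrors the Perl snippet using the standard snprintf() function. The format_hex() wrapper and its parameter names are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical wrapper: encodes `value` as lowercase hexadecimal text
 * into `value_hex`. Returns what snprintf() returns, i.e. the number
 * of characters the full encoding needs, excluding the terminator. */
int format_hex(unsigned int value, char *value_hex, size_t value_hex_size) {
    int written = snprintf(value_hex, value_hex_size, "%x", value);
    return written;
}
```

The _hex suffix tells the reader that the buffer holds text in a specific encoding, not a number that can be used in arithmetic.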
Some History - Name change from 1SLOC to 1OPS
I got some remarks that a statement can span multiple lines, which would allow multiple operations. As we saw, this is actually discouraged. Therefore, I renamed the rule to 1OPS.