
6.24 Side-effects and Order of Evaluation of Operands [SAM]

6.24.1 Applicability to language

The vulnerability as described in ISO/IEC TR 24772-1:2019 clause 6.24 exists in C++.

The evaluation of an expression includes: (i) its value computation; and (ii) its side-effects. The value computation yields the value returned by the expression, e.g., the value of 3 * 2 + 1 is 7. The side-effects of an expression are the changes it makes to the state of the execution environment, such as modifying an object.

For example, consider:

int i = 2;
int j = i++;

the value of i++ is 2 and the side-effects are the writing of 3 to i and the initialization of j.

Within an expression, one must ensure an object is stored only once to avoid undefined behaviour [EWF], e.g.,

i = i++ + 5; // undefined behaviour (before C++17)

or

k = i++ + i--; // undefined behaviour in all versions of C++

and expressions modifying objects can only read the object to determine the value to be stored (e.g., ++i requires reading the value), i.e., other accesses are undefined behaviour, e.g.,

my_array[i] = i++; // undefined behaviour (before C++17)

Starting with C++17, the evaluation order of an expression involving
overloaded operators preserves the sequenced before behaviour of the
built-in operator:

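```{.cpp}
my_array[i] = i++;
my_array[i++] = i++;
```

For example, suppose i is 10 before the second statement, my_array[i++] = i++, is executed:

1. The right operand i++ is evaluated first: its value is 10 and i becomes 11.
2. The left operand my_array[i++] is evaluated next: it designates my_array[11] and i becomes 12.
3. The assignment stores 10 in my_array[11].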

This occurs because the right operand is sequenced before the left operand, and the assignment is sequenced after the value computation of both operands and before the value computation of the assignment expression [C++17, Clause 8.18 [expr.ass], para. 1]. Since this is the built-in operator, the statement my_array[i] = i++ can be thought of as:

1. Compute the value of the right-hand side, i++ (e.g., an integer value).
2. Apply the side-effect of i++.
3. Compute the value of the left-hand side, my_array[i] (e.g., a memory address).
4. Apply the side-effect of the assignment.

In general, one should follow commonly-stated C/C++ advice of never reading from and writing to the same object within an expression to avoid potential vulnerabilities. Often breaking the expression into separate statements achieves clear and clean semantics, e.g.,

++i;
my_array[i] = i;

or

my_array[i] = i;
++i;

makes it unambiguous what the value of i is during the array assignment and eliminates the possibility of vulnerabilities.
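The two alternatives above can be sketched as small functions. This is only an illustration: the helper names `store_then_advance` and `advance_then_store`, the array size of 16, and the use of `std::array` are assumptions made for the example, not part of the original text.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: writes the current value of i into element i,
// then increments i in a separate statement.
std::size_t store_then_advance(std::array<int, 16>& my_array, std::size_t i) {
    my_array[i] = static_cast<int>(i);  // i is only read in this statement
    ++i;                                // i is only modified in this statement
    return i;
}

// Hypothetical helper: increments i first, then writes the new value of i
// into the element at the new index.
std::size_t advance_then_store(std::array<int, 16>& my_array, std::size_t i) {
    ++i;                                // i is only modified in this statement
    my_array[i] = static_cast<int>(i);  // i is only read in this statement
    return i;
}
```

With i equal to 10 on entry, the first helper stores 10 in element 10 and the second stores 11 in element 11; both return 11. Neither result depends on the language version or on any sequencing rule within a single expression.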

In addition, it is important to note that overloading an operator disables short-circuiting behaviour (e.g., that of the built-in Boolean operators): an overloaded operator is a function call, so all of its operands are evaluated before the operator function itself is called. Similarly, before C++17, overloading the comma operator disables the guaranteed order of evaluation.

The C++ built-in (two-argument) Boolean operators (e.g., && and ||) are short-circuiting, i.e., if the value of an earlier (from left-to-right) operand of an operation determines the result of the operation, then the remaining operands are not evaluated.

Typically, this allows one to write code such as:

  int *p;
  // ...
  if (p != nullptr && *p != 0) {
    /* do something */
  }

i.e., if p is nullptr, then *p != 0 is never evaluated, thus avoiding undefined behaviour. Only when p is not nullptr is *p != 0 evaluated. It must be stressed that this only applies to the built-in && and || operators: user-defined operator overloads are functions and always evaluate all operands first.
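The difference between the built-in and an overloaded && can be observed with a counter. The following is a minimal sketch; the type `Flag`, the counter `right_evaluations`, and the functions `right_operand`, `with_builtin`, and `with_overload` are hypothetical names introduced only for this illustration.

```cpp
// Hypothetical wrapper type used to select the user-defined operator&&.
struct Flag {
    bool value;
};

// User-defined operator&&: an ordinary function, so both operands are
// evaluated before its body runs. No short-circuiting takes place.
bool operator&&(Flag lhs, Flag rhs) {
    return lhs.value && rhs.value;  // built-in && on two bool values
}

int right_evaluations = 0;  // counts how often the right operand is evaluated

Flag right_operand() {
    ++right_evaluations;
    return Flag{true};
}

// Built-in &&: the right operand is skipped when left is false.
bool with_builtin(bool left) {
    return left && right_operand().value;
}

// Overloaded &&: the right operand is evaluated even when left is false.
bool with_overload(bool left) {
    return Flag{left} && right_operand();
}
```

Calling `with_builtin(false)` leaves `right_evaluations` unchanged, whereas calling `with_overload(false)` increments it, although both return false.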

Consequently, should one want to always evaluate all operands of a Boolean expression, one should not write code like this:

bool x = foo() && bar();

where foo() and bar() are functions that return something convertible to bool. In this expression, if foo() returns false, then bar() will never be executed; only when foo() returns true will bar() be executed. Similarly for ||:

bool y = foo() || bar();

i.e., only when foo() returns false will bar() be executed; if foo() returns true, then bar() will never be executed. Thus, if both foo() and bar() are required to be executed, then execute them in separate statements first, e.g.,

bool foo_result = foo();
bool bar_result = bar();
bool x = foo_result && bar_result;
bool y = foo_result || bar_result;
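As a rough illustration of the difference, the sketch below counts how often each function runs. The bodies of `foo` and `bar`, the counters `foo_calls` and `bar_calls`, and the wrappers `short_circuit` and `evaluate_both` are assumptions made for this example only.

```cpp
int foo_calls = 0;  // number of times foo() has run
int bar_calls = 0;  // number of times bar() has run

// Assumed bodies: foo() returns false so that && short-circuits.
bool foo() { ++foo_calls; return false; }
bool bar() { ++bar_calls; return true; }

// bar() is not executed here because foo() returns false.
bool short_circuit() {
    return foo() && bar();
}

// Both functions are executed, in this order, before the results are combined.
bool evaluate_both() {
    bool foo_result = foo();
    bool bar_result = bar();
    return foo_result && bar_result;
}
```

Both wrappers return false, but only `evaluate_both` increments `bar_calls`.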

The built-in comma operator evaluates its operands left to right and sequences the left expression before the right. Before C++17, this is not the case for an overloaded comma operator, which follows the rule for a function call.

The order of evaluation of function arguments in C++ is unspecified. Therefore, a side-effect in one argument position can change the result of a different argument, for example:

int i = 0;
int get() {return i;}
void foo(int A, int B);
void bar () {  
  foo (get(), ++i);  
}

In the above example, inside foo, the value of A can be 0 or 1, depending upon which argument is evaluated first. This can be avoided by calling get or incrementing i before the call to foo, i.e., by forcing the order of evaluation.
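One possible way to force the order is to evaluate the arguments in separate statements before the call. In the sketch below, the struct `Args`, the body of `foo` (which simply returns its arguments so they can be inspected), and the names `bar_get_first` and `bar_increment_first` are assumptions introduced for illustration.

```cpp
int i = 0;
int get() { return i; }

// Hypothetical: foo returns its arguments so the caller can inspect them.
struct Args {
    int a;
    int b;
};
Args foo(int A, int B) { return Args{A, B}; }

// get() is called before the increment, so A is the old value of i.
Args bar_get_first() {
    const int a = get();
    ++i;
    return foo(a, i);  // no side-effects remain in the argument list
}

// The increment happens before get(), so A is the new value of i.
Args bar_increment_first() {
    ++i;
    const int a = get();
    return foo(a, i);  // no side-effects remain in the argument list
}
```

Starting from i equal to 0, `bar_get_first` passes A as 0 and B as 1, while `bar_increment_first` passes 1 for both. In each case the outcome no longer depends on the order in which the implementation evaluates the arguments.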

6.24.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can: