6.24 Side-effects and Order of Evaluation of Operands [SAM]

6.24.1 Applicability to language

The vulnerability as described in ISO/IEC TR 24772-1:2019 clause 6.24 exists in C++.

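The evaluation of an expression includes: (i) its value computation; and (ii) its side-effects. The value computation yields the value returned by the expression, e.g., the value of 3 * 2 + 1 is 7. The side-effects of an expression are the changes it makes to the state of the execution environment, e.g., modifying an object or performing input/output.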

For example, consider:

int i = 2;
int j = i++;

Here, the value computation of i++ yields 2, and the side-effects are the writing of 3 to i and the initialization of j.
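
As a minimal sketch of this distinction, the following separates what i++ yields from the state of i afterwards. The names PostIncrementResult, post_increment_demo, and check_post_increment are illustrative and not part of the source text.

```cpp
#include <cassert>

// Hypothetical helper type: what the expression yielded and what it left behind.
struct PostIncrementResult {
    int value_of_expression;  // result of the value computation of i++
    int i_afterwards;         // state of i once the side-effect has been applied
};

PostIncrementResult post_increment_demo(int i) {
    // The value of i++ is the old value of i; the side-effects are the
    // increment of i and the initialization of j.
    int j = i++;
    return {j, i};
}

void check_post_increment() {
    PostIncrementResult r = post_increment_demo(2);
    assert(r.value_of_expression == 2);  // value computation
    assert(r.i_afterwards == 3);         // side-effect on i
}
```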

Within an expression, one must ensure that an object is stored to only once, since unsequenced modifications of the same object are undefined behaviour, e.g.,

i = i++ + 5; // undefined behaviour (before C++17)

or

k = i++ + i\--; // undefined behaviour in all versions of C++

Likewise, an expression that modifies an object may read that object only to determine the value to be stored (e.g., ++i requires reading the value of i); other unsequenced reads of the object are undefined behaviour, e.g.,

my_array[i] = i++; // undefined behaviour (before C++17)

Starting with C++17, the evaluation order of an expression involving
overloaded operators preserves the sequenced before behaviour of the
built-in operator:

```{.cpp}
my_array[i] = i++;
my_array[i++] = i++;
```

For the second statement, say i = 10 before the expression:

evaluate the right-hand side i++: its value is 10 and i becomes 11;

evaluate the left-hand side my_array[i++]: this designates my_array[11], and i then becomes 12;

my_array[11] is assigned 10.

This occurs because the assignment is sequenced after the value computation of the right and left operands and before the value computation of the assignment expression, and the right operand is sequenced before the left operand [C++17, Clause 8.18 [expr.ass], para. 1]. Since this is the built-in operator, the statement my_array[i] = i++; can be thought of as:

Compute the value of the right-hand side: i++ (e.g., an integer value).

Apply the side-effects of i++.

Compute the value of the left-hand side: my_array[i] (e.g., a memory address), which uses the updated value of i.

Apply the side-effects of the assignment.
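
The C++17 rule that an overloaded operator written in operator notation keeps the operand sequencing of the built-in operator can be observed with a small sketch. It assumes a hypothetical type Cell with a user-defined operator=, and helper functions left_operand and right_operand that record when they run; none of these names come from the source. Before C++17 the recorded order is unspecified.

```cpp
#include <cassert>
#include <string>

// Hypothetical type with a user-defined assignment operator.
struct Cell {
    int value = 0;
    Cell& operator=(int v) { value = v; return *this; }
};

std::string trace;  // records the order in which the operands are evaluated
Cell cell;

Cell& left_operand()  { trace += 'L'; return cell; }
int   right_operand() { trace += 'R'; return 10; }

void assign_with_trace() {
    trace.clear();
    // Operator notation: since C++17 the operands are sequenced as for the
    // built-in assignment, i.e., right operand first, then left operand.
    left_operand() = right_operand();
    assert(trace.size() == 2);  // both operands are evaluated in every version
}
```

When compiled as C++17 or later, trace is expected to hold "RL" after assign_with_trace() returns; under earlier standards either order may be recorded.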

In general, one should follow the commonly stated C/C++ advice of never reading from and writing to the same object within a single expression, to avoid potential vulnerabilities. Often, breaking the expression into separate statements achieves clear and clean semantics, e.g.,

++i;
my_array[i] = i;

or

my_array[i] = i;
++i;

makes it unambiguous what the value of i is during the array assignment and eliminates the possibility of vulnerabilities.
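
Both orderings can be wrapped in small functions to make the difference explicit. This is a sketch only: the function names, the fixed array size of 16, and the asserted preconditions are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Increment first, then store: the element at the new index receives the new value.
void increment_then_store(std::array<int, 16>& my_array, int& i) {
    assert(i >= -1 && i < 15);  // assumed precondition: the new index is in range
    ++i;
    my_array[static_cast<std::size_t>(i)] = i;
}

// Store first, then increment: the element at the old index receives the old value.
void store_then_increment(std::array<int, 16>& my_array, int& i) {
    assert(i >= 0 && i < 16);  // assumed precondition: the current index is in range
    my_array[static_cast<std::size_t>(i)] = i;
    ++i;
}
```

Starting from i = 10, the first function stores 11 in my_array[11], whereas the second stores 10 in my_array[10]; in both cases i is 11 afterwards.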

In addition, it is important to note that overloading an operator disables the short-circuiting behaviour of the corresponding built-in operator (e.g., && and ||): all operands of an overloaded operator are evaluated before the operator function is called.
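
The loss of short-circuiting can be made visible by counting operand evaluations. The sketch below assumes a hypothetical wrapper type Flag with a user-defined operator&& and a counter named evaluations; these names are illustrative only.

```cpp
#include <cassert>

// Hypothetical wrapper type whose operator&& is user-defined.
struct Flag {
    bool value;
};

// A user-defined operator&& is an ordinary function call:
// both arguments are evaluated before the body runs.
bool operator&&(Flag lhs, Flag rhs) { return lhs.value && rhs.value; }

int evaluations = 0;  // counts how many operands were evaluated

Flag make_flag(bool b) {
    ++evaluations;
    return Flag{b};
}

bool overloaded_and() {
    evaluations = 0;
    // The left operand is false, yet the right operand is still evaluated.
    return make_flag(false) && make_flag(true);
}

bool builtin_and() {
    evaluations = 0;
    // Built-in &&: the right operand is skipped because the left one is false.
    return make_flag(false).value && make_flag(true).value;
}

void check_overloaded_and() {
    assert(!overloaded_and());
    assert(evaluations == 2);  // both operands were evaluated
    assert(!builtin_and());
    assert(evaluations == 1);  // only the left operand was evaluated
}
```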

The C++ built-in (two-argument) Boolean operators && and ||, as well as <type_traits>’s std::conjunction and std::disjunction operations, are all short-circuiting, i.e., if the value of an earlier (from left-to-right) operand of an operation determines the result of the operation, then the remaining operands are not evaluated.

<!--
Conjunction and disjunction operate at compile time and the short-circuiting is about template
instantiations that might lead to compile errors otherwise. This is not a runtime safety issue. I
suggest dropping that (Peter)
-->

Typically, this allows one to write code such as:

  int *p;
  // ...
  if (p != nullptr && *p != 0) {
    /* do something */
  }

i.e., if p is nullptr, then *p != 0 is never evaluated, thus avoiding undefined behaviour. Only when p is not nullptr is *p != 0 evaluated. It must be stressed that this only applies to the built-in && and || operators: user-defined operator overloads are functions, so all of their operands are always evaluated first.

Consequently, should one want to always evaluate all operands of a Boolean expression, one should not write code like this:

bool x = foo() && bar();

where foo() and bar() are functions that return something convertible to bool. In this expression, if foo() returns false, then bar() will never be executed; only when foo() returns true will bar() be executed. Similarly for ||:

bool y = foo() || bar();

i.e., only when foo() returns false will bar() be executed; if foo() returns true, then bar() will never be executed. Thus, if foo() and bar() are both required to be executed, then execute them in separate statements first, e.g.,

bool foo_result = foo();
bool bar_result = bar();
bool x = foo_result && bar_result;
bool y = foo_result || bar_result;
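
The difference between the two forms can be observed by counting calls. In this sketch, foo() and bar() are hypothetical stand-ins that record each invocation; the counters and the wrapper function names are assumptions made for illustration.

```cpp
#include <cassert>

int foo_calls = 0;
int bar_calls = 0;

// Hypothetical stand-ins for foo() and bar(); each records that it ran.
bool foo() { ++foo_calls; return false; }
bool bar() { ++bar_calls; return true; }

// Short-circuiting form: bar() is skipped because foo() returns false.
bool combined_short_circuit() { return foo() && bar(); }

// Separate statements: both functions run before the results are combined.
bool combined_always_both() {
    bool foo_result = foo();
    bool bar_result = bar();
    return foo_result && bar_result;
}

void check_combined() {
    foo_calls = bar_calls = 0;
    assert(!combined_short_circuit());
    assert(foo_calls == 1 && bar_calls == 0);  // bar() never ran

    foo_calls = bar_calls = 0;
    assert(!combined_always_both());
    assert(foo_calls == 1 && bar_calls == 1);  // both ran
}
```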
<!--
Stephen: My write-up here is lengthy but should help get more terse
wording... but I note this: C++ operator information is in C++17 Clause
8 and Clause 16.5, ... Also per 16.5.1 para 2. unary and binary forms
of the same operator are considered to have the same name so one can
hide another from an enclosing scope. Thus, this is also another
possible vulnerability.
-->

6.24.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can: