The vulnerability as described in ISO/IEC TR 24772-1:2019 clause 6.25 exists in C++.
C++ has several instances of operators which are similar in structure, but different in meaning. Examples of operators in C-based languages that can cause confusion are: `==` and `=`; `&&` and `&`; `||` and `|`; `<`, `<<`, and `<<=`; and `>`, `>>`, and `>>=`.
The typographical similarity can lead to code like the following, where it is unclear if the expression as spelled is actually intended, or if the author has typos in it, meaning a different operator instead:
auto f(unsigned i, unsigned j)
{
    return (i > 1) & (j = 1); // (>>, &&, ==)?
}
The following code in a production phone OS caused the “bricking” of many users' phones:
if (key_data_.has_value() & !key_data_->label().empty())
instead of
if (key_data_.has_value() && !key_data_->label().empty())
or, even clearer, using the alternative operator representation `and` for `&&`:
if (key_data_.has_value() and !key_data_->label().empty())
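The difference between the two spellings can be sketched as follows. Unlike `&&`, the bitwise `&` evaluates both operands unconditionally, so in the faulty version the right-hand side is reached even when the optional is empty. All names here (`label_is_non_empty`, `bitwise_check`, `logical_check`, the `calls` counter) are hypothetical and not taken from the code quoted above; the helper guards its own access so that the demonstration stays free of undefined behaviour.

```cpp
#include <optional>
#include <string>

// Hypothetical helper standing in for `!key_data_->label().empty()`.
// It records that it was called and, unlike the original expression,
// guards the access so that calling it on an empty optional is well defined.
bool label_is_non_empty(const std::optional<std::string>& label, int& calls)
{
    ++calls;
    return label.has_value() and not label->empty();
}

// Bitwise '&': both operands are always evaluated.
bool bitwise_check(const std::optional<std::string>& label, int& calls)
{
    return label.has_value() & label_is_non_empty(label, calls);
}

// Logical 'and': the right operand is skipped when the left one is false.
bool logical_check(const std::optional<std::string>& label, int& calls)
{
    return label.has_value() and label_is_non_empty(label, calls);
}
```

With an empty optional, `bitwise_check` still calls the helper once, whereas `logical_check` never calls it. In the original expression that extra evaluation is a dereference of an empty optional.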
As a general rule, the use of `=`, `+=`, or `-=` in an expression when the operator is not the final assignment to a variable is unsafe, since the assignment operator creates side effects within the expression that are difficult for a human reader to analyze and can have different results depending upon the order of evaluation of terms within the expression.
But even in an assignment expression, swapping the assignment symbol with the operator can itself lead to valid code that was not intended:
int i{42};
i += 22; // i becomes 64
i =+ 22; // i becomes 22
i =- 22; // i becomes -22
C++ provides significant freedom in constructing statements. This freedom, if misused, can result in unexpected results and potential vulnerabilities.
Since the order of evaluation within expressions is only partially defined, sub-expressions with side effects on variables used within the overall expression can result in undefined behaviour [EWF].
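As a minimal sketch of the milder, merely unspecified case, the following shows two side-effecting calls used as arguments of one function call, and a rewrite that makes the order explicit. The names `Counter`, `difference`, `fragile`, and `robust` are hypothetical. The undefined-behaviour case (unsequenced modifications of the same scalar) is deliberately not shown, since such code has no meaningful result to demonstrate.

```cpp
// Hypothetical counter that returns successive values 1, 2, 3, ...
struct Counter {
    int value = 0;
    int next() { return ++value; }
};

int difference(int first, int second) { return first - second; }

// Order-dependent: the two calls to next() are indeterminately sequenced
// (since C++17), so the result is -1 or 1 depending on the implementation.
int fragile(Counter& c)
{
    return difference(c.next(), c.next());
}

// Explicit sequencing: each side effect gets its own statement,
// so the result is always 1 - 2 == -1 for a fresh counter.
int robust(Counter& c)
{
    const int first = c.next();
    const int second = c.next();
    return difference(first, second);
}
```

Giving each side effect its own statement removes the dependence on the implementation's choice of evaluation order.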
The flexibility of C++ can obscure the intent of a programmer. Consider:
int x,y;
/* ... */
if (x = y){
/* ... */
}
A fair amount of analysis may need to be done to determine whether the programmer intended to do an assignment as part of the `if` statement (valid in C++) or whether the programmer made the common mistake of using an `=` (assignment) instead of a `==` (equality).
This confusion can be corrected by moving assignments outside of Boolean contexts. This would change the example code to:
int x,y;
/* … */
x = y;
if (x != 0) {
/* ... */
}
This would clearly state what the programmer meant and that the assignment of y to x was intended.
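Where the value is only needed inside the conditional, one possible alternative (offered as a sketch, not as the recommended form) is the C++17 `if` statement with an initializer, which keeps the initialization and the test visibly separate. The test `x != 0` is what `if (x = y)` actually evaluates. The function name `copy_and_test` and its parameters are hypothetical.

```cpp
// Hypothetical function: copies y and reports whether the copy is nonzero.
bool copy_and_test(int y, int& out)
{
    // The initialization and the comparison are spelled separately,
    // so neither can be mistaken for a mistyped '=='.
    if (const int x = y; x != 0) {
        out = x;
        return true;
    }
    return false;
}
```

Because `x` is declared `const`, an accidental `x = 0` in the condition would not compile.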
Additional confusion occurs in the use of the logical `&&` or `||` operators and the bitwise `&` or `|` operators. The compiler will implicitly convert arithmetic expressions to `bool` for operands of the logical operators. Similarly, operands of `bool` type will be promoted to `int` for operands of the bitwise operators (see Conversion Errors [FLC]).
It may not be clear whether the programmer intended to use the logical operator `&&` or bitwise operator `&` instead:
unsigned f(unsigned i, unsigned j)
{
    return (i > 0) & j;
}
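To illustrate that the choice of operator changes the result and not just the style, here is the same shape as `f` written once with each operator. The names `with_bitand` and `with_and` are hypothetical.

```cpp
// Bitwise version: (i > 0) is promoted to the integer 0 or 1,
// which is then ANDed bit by bit with j.
unsigned with_bitand(unsigned i, unsigned j)
{
    return (i > 0) & j;
}

// Logical version: j is converted to bool (nonzero means true)
// before the logical AND is applied.
bool with_and(unsigned i, unsigned j)
{
    return (i > 0) and j;
}
```

For `i == 1` and `j == 2`, the bitwise version yields `0` because `1 & 2` has no bits in common, while the logical version yields `true`.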
Using the alternative tokens `and` / `or` in lieu of `&&` and `||` reduces the possibility of confusion. Similarly, `a not_eq b` is preferable to `a != b` since the latter is easily confused with the equally valid expression `a |= b`.
Programmers can easily get in the habit of inserting the `;` statement terminator at the end of statements. However, inadvertently doing this can drastically alter the meaning of code, even though the code is valid as in the following example:
int a,b;
/* … */
if (a == b); // the semi-colon will make the following code always execute
{ /* ... */
}
Because of the misplaced semi-colon, the code block following the if will always be executed. In this case, it is extremely likely that the programmer did not intend to put the semi-colon there.
Unary `+` on a variable is (almost) a no-op, and is possibly a mistype of `++`. A unary `-` on a variable will switch its sign, unless applied to a variable of an unsigned type, in which case the result is the value subtracted from 2^n where n is the number of bits in the unsigned type.
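The following sketch checks these statements at compile time for `unsigned int`. It also shows two details that are easy to overlook: the subtraction from 2^n is performed modulo 2^n (so negating zero gives zero), and unsigned types narrower than `int` are first promoted to `int`, which is why unary plus is only "almost" a no-op. The function names `negate` and `negate_narrow` are hypothetical.

```cpp
#include <limits>
#include <type_traits>

// For unsigned int, -x is 2^n - x, reduced modulo 2^n.
static_assert(-1u == std::numeric_limits<unsigned>::max());
static_assert(-0u == 0u);

// Unary plus applies integral promotion: the operand is unsigned char,
// but the result has type int.
static_assert(std::is_same_v<decltype(+static_cast<unsigned char>(1)), int>);

// Negating an unsigned int wraps around instead of becoming negative.
unsigned negate(unsigned x) { return -x; }

// An unsigned char is promoted to int before negation,
// so here the result is an ordinary negative int.
int negate_narrow(unsigned char x) { return -x; }
```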
The language does not impose any restrictions on semantics of overloaded operators. This can cause (potentially generic) code to behave in completely unobvious ways, when such types with “unusual” operator semantics are used.
For example, the boost.spirit library allows code like the following to create parser rules:
r = real_p >> *(ch_p(',') >> real_p); // rule that accepts a comma-separated list of real numbers
This library uses C++ operator overloads to create an embedded domain-specific language for grammar rules, allowing the specification of parser rules as C++ expressions.
When overloaded, related operators such as a compound assignment and its base operator are no longer guaranteed to keep the behavioral relationship that they have for built-in types. For example, `a += b` is not guaranteed to behave like `a = a + b`, or to be defined at all.
Similarly for overloaded relational operators, for `a == b`, there is no guarantee that `a != b` is equivalent to `!(a == b)` if both are overloaded by the user.
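A minimal sketch of how such an inconsistency can arise: the hypothetical type `Version` has `==` and `!=` written independently, and the second one forgets a member. The hypothetical `FixedVersion` shows one conventional way to keep the pair consistent by defining `!=` in terms of `==`.

```cpp
// Hypothetical type whose == and != were written independently.
struct Version {
    int major_part;
    int minor_part;
};

bool operator==(const Version& a, const Version& b)
{
    return a.major_part == b.major_part and a.minor_part == b.minor_part;
}

// Forgets minor_part, so it disagrees with operator== above.
bool operator!=(const Version& a, const Version& b)
{
    return a.major_part != b.major_part;
}

// Same data, but != is derived from ==, so the two cannot drift apart.
struct FixedVersion {
    int major_part;
    int minor_part;
};

bool operator==(const FixedVersion& a, const FixedVersion& b)
{
    return a.major_part == b.major_part and a.minor_part == b.minor_part;
}

bool operator!=(const FixedVersion& a, const FixedVersion& b)
{
    return not (a == b);
}
```

For the values `{1, 0}` and `{1, 1}`, both `a == b` and `a != b` are false for `Version`, while `FixedVersion` reports them as unequal.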
Unless all relational operators for a type are defined either explicitly in a consistent way or implicitly, unexpected results can occur. A user-declared three-way comparison operator (`<=>`) is used by the compiler to synthesize the relational operators consistently. If `operator<=>` is defined as `=default`, the equality comparison operators will also be defined; and if `operator==` with return type `bool` is defined, a corresponding inequality `operator!=` is also defined implicitly.
To avoid the vulnerability or mitigate its ill effects, C++ software developers can:
— Use the avoidance mechanisms of ISO/IEC 24772-1 clause 6.25.5.
— Simplify expressions to aid in code readability and help future maintainers understand the intent and nuances of the code. For example:
— Avoid assignments embedded within other statements and expressions.
— Spell unary operators (e.g., `-`) with a leading blank in expressions to avoid them being misread as combined operators.
— Avoid the use of unary plus, since it is almost always a no-op for built-in types.
— Avoid Boolean operators (`&&`, `||`, `!`) with non-`bool` operands, e.g., operands of numeric types.
— Avoid bit operators (`&`, `|`, `~`, `<<`, `>>`) with anything except operands of non-`bool` unsigned types.
— Consider using alternative tokens for the logical operators, such as `and`, `or`, and `not`, and `not_eq` for the comparison operator `!=`.
— If your code structure requires an empty statement `;`, use an empty code block `{}` instead.
— Prohibit conflicting side-effects in sub-expressions.
— Avoid defining semantics of overloaded operators to deviate from the semantics of these operators for the built-in types.
— Prefer defaulted and synthesized comparison operators over individual overloads to ensure that all of the related comparison operators behave consistently.