ISO/IEC JTC 1/SC 22/WG23 N1012

Date: 2020-11-23

ISO/IEC TR 24772–10
Notes on this document

Effective 23 November 2020, this document is being moved to GitHub. Contact Stephen.michell@maurya.on.ca to gain access.

This document is a draft of guidance for avoiding programming language vulnerabilities in C++.

At this point in time, the following clauses have essentially completed their first pass.

TBD

Participants at meeting 23 November 2020

Stephen Michell

Paul Preney

Peter Sommerlad

Richard Corden

Erhard Ploedereder

Clive Pygott

Michael Wong

Edition 1

ISO/IEC JTC 1/SC 22/WG 23

Secretariat: ANSI

Information Technology — Programming languages — Guidance to avoiding vulnerabilities in programming languages – Part 10 – Vulnerability descriptions for the programming language C++

Document type: International standard

Document subtype: if applicable

Document stage: (10) development stage

Document language: E


Warning

This document is not an ISO International Standard. It is distributed for review and comment. It is subject to change without notice and may not be referred to as an International Standard.

Recipients of this draft are invited to submit, with their comments, notification of any relevant patent rights of which they are aware and to provide supporting documentation.

Copyright notice

This ISO document is a working draft or committee draft and is copyright-protected by ISO. While the reproduction of working drafts or committee drafts in any form for use by participants in the ISO standards development process is permitted without prior permission from ISO, neither this document nor any extract from it may be reproduced, stored or transmitted in any form for any other purpose without prior written permission from ISO.

Requests for permission to reproduce this document for the purpose of selling it should be addressed as shown below or to ISO’s member body in the country of the requester:

ISO copyright office

Case postale 56, CH-1211 Geneva 20

Tel. + 41 22 749 01 11

Fax + 41 22 749 09 47

E-mail copyright@iso.org

Web www.iso.org

Reproduction for sales purposes may be subject to royalty payments or a licensing agreement.

Violators may be prosecuted.

Contents

Foreword
Introduction
1. Scope
2. Normative references
3. Terms and definitions, symbols and conventions
3.1 Terms and definitions
4. Language concepts
5. Avoiding programming language vulnerabilities in C++
6. Specific Guidance for C++ Vulnerabilities
6.1 General
6.2 Type System [IHN]
6.3 Bit Representations [STR]
6.4 Floating-point Arithmetic [PLF]
6.5 Enumerator Issues [CCB]
6.6 Conversion Errors [FLC]
6.7 String Termination [CJM]
6.8 Buffer Boundary Violation [HCB]
6.9 Unchecked Array Indexing [XYZ]
6.10 Unchecked Array Copying [XYW]
6.11 Pointer Type Conversions [HFC]
6.12 Pointer Arithmetic [RVG]
6.13 NULL Pointer Dereference [XYH]
6.14 Dangling Reference to Heap [XYK]
6.15 Arithmetic Wrap-around Error [FIF]
6.16 Using Shift Operations for Multiplication and Division [PIK]
6.17 Choice of Clear Names [NAI]
6.18 Dead Store [WXQ]
6.19 Unused Variable [YZS]
6.20 Identifier Name Reuse [YOW]
6.21 Namespace Issues [BJL]
6.22 Initialization of Variables [LAV]
6.23 Operator Precedence and Associativity [JCW]
6.24 Side-effects and Order of Evaluation of Operands [SAM]
6.25 Likely Incorrect Expression [KOA]
6.26 Dead and Deactivated Code [XYQ]
6.27 Switch Statements and Static Analysis [CLL]
6.28 Demarcation of Control Flow [EOJ]
6.29 Loop Control Variables [TEX]
6.30 Off-by-one Error [XZH]
6.31 Structured Programming [EWD]
6.32 Passing Parameters and Return Values [CSJ]
6.33 Dangling References to Stack Frames [DCM]
6.34 Subprogram Signature Mismatch [OTR]
6.35 Recursion [GDL]
6.36 Ignored Error Status and Unhandled Exceptions [OYB]
6.37 Type-breaking Reinterpretation of Data [AMV]
6.38 Deep vs. Shallow Copying [YAN]
6.39 Memory Leak and Heap Fragmentation [XYL]
6.40 Templates and Generics [SYM]
6.41 Inheritance [RIP]
6.41.1 Applicability to language
6.41.2 Guidance to language users
6.42 Violations of the Liskov Substitution Principle or the Contract Model [BLP]
6.42.1 Applicability to language
6.42.2 Guidance to language users
6.43 Redispatching [PPH]
6.43.1 Applicability to language
6.43.2 Guidance to language users
6.44 Polymorphic variables [BKK]
6.44.1 Applicability to language
6.44.2 Guidance to language users
6.45 Extra Intrinsics [LRM]
6.46 Argument Passing to Library Functions [TRJ]
6.47 Inter-language Calling [DJS]
6.48 Dynamically-linked Code and Self-modifying Code [NYY]
6.49 Library Signature [NSQ]
6.50 Unanticipated Exceptions from Library Routines [HJW]
6.51 Pre-processor Directives [NMP]
6.52 Suppression of Language-defined Run-time Checking [MXB]
6.53 Provision of Inherently Unsafe Operations [SKL]
6.54 Obscure Language Features [BRS]
6.55 Unspecified Behaviour [BQF]
6.56 Undefined Behaviour [EWF]
6.57 Implementation–defined Behaviour [FAB]
6.58 Deprecated Language Features [MEM]
6.59 Concurrency – Activation [CGA]
6.60 Concurrency – Directed termination [CGT]
6.60.1 Applicability to language
6.60.2 Guidance to language users
6.61 Concurrent Data Access [CGX]
6.62 Concurrency – Premature Termination [CGS]
6.63 Protocol Lock Errors [CGM]
6.64 Uncontrolled Format String [SHL]
7. Language specific vulnerabilities for C
8. Implications for standardization
Bibliography
Index

Foreword

ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of the joint technical committee is to prepare International Standards. Draft International Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International Standard requires approval by at least 75 % of the national bodies casting a vote.

In exceptional circumstances, when the joint technical committee has collected data of a different kind from that which is normally published as an International Standard (“state of the art”, for example), it may decide to publish a Technical Report. A Technical Report is entirely informative in nature and shall be subject to review every five years in the same manner as an International Standard.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.

ISO/IEC TR 24772-10 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 22, Programming languages, their environments and system software interfaces.

Introduction

This Technical Report provides guidance for the programming language C++, so that application developers using or considering C++ will be better able to avoid the programming constructs that lead to vulnerabilities in software written in the C++ language and their attendant consequences. This guidance can also be used by developers to select source code evaluation tools that can discover and eliminate some constructs that could lead to vulnerabilities in their software. This report can also be used in comparison with companion Technical Reports and with the language-independent report, TR 24772–1, to select a programming language that provides the appropriate level of confidence that anticipated problems can be avoided.

This part of the Technical Report is intended to be used with TR 24772–1, which discusses programming language vulnerabilities in a language-independent fashion. It is also intended to be used with TR 24772-3, which discusses how the vulnerabilities introduced in TR 24772-1 are manifested in C, which is largely a subset of C++.

It should be noted that this Technical Report is inherently incomplete. It is not possible to provide a complete list of programming language vulnerabilities because new weaknesses are discovered continually. Any such report can only describe those that have been found, characterized, and determined to have sufficient probability and consequence.

Information Technology — Programming Languages — Guidance to avoiding vulnerabilities in programming languages — Vulnerability descriptions for the programming language C++

1. Scope

This Technical Report specifies software programming language vulnerabilities to be avoided in the development of systems where assured behaviour is required for security, safety, mission-critical and business-critical software. In general, this guidance is applicable to the software developed, reviewed, or maintained for any application.

The vulnerabilities described in this Technical Report document the way that the vulnerabilities described in the language-independent TR 24772–1 are manifested in C++.

2. Normative references

The following referenced documents are indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

ISO/IEC 14882:2014 — Programming Languages — C++

ISO/IEC TR 24772–3 — Information Technology — Programming Languages — Guidance to avoiding vulnerabilities in programming languages — Vulnerability descriptions for the programming language C

3. Terms and definitions, symbols and conventions

3.1 Terms and definitions

For the purposes of this document, the terms and definitions given in ISO/IEC 2382, in TR 24772–1, in ISO/IEC 14882:2014 and the following apply. Other terms are defined where they appear in italic type.

The following terms are in alphabetical order, with general topics referencing the relevant specific terms.

3.1.1

TBD

3.1.2

access

execution-time action to read or modify the value of an object

Note 1: Where only one of these two actions is meant, “read” or “modify” is used. “Modify” includes the case where the new value being stored is the same as the previous value. Expressions that are not evaluated do not access objects.

3.1.3

access protection

ADL

argument dependent lookup

alignment
requirement that objects of a particular type be located on storage boundaries with addresses that are particular multiples of a byte address

3.1.3

argument
the expression in the comma-separated list bounded by the parentheses in a function call expression, or a sequence of preprocessing tokens in the comma-separated list bounded by the parentheses in a function-like macro invocation

Note 1: Also called actual argument

Note 2: An argument replaces a formal parameter as the call is realized.

3.??

argument dependent lookup

lookup that finds additional overloads from the namespaces of the types of the arguments used in unqualified function calls

3.1.4

behaviour
an external appearance or action [recommend removal - standard comp sci term]

3.1.5

bit
the unit of data storage in the execution environment large enough to hold an object that may have one of two values

Note: It need not be possible to express the address of each individual bit of an object [recommend removal - standard comp sci term]

3.1.6

byte
the addressable unit of data storage large enough to hold any member of the basic character set of the execution environment

Note: It is possible to express the address of each individual byte of an object uniquely. A byte is composed of a contiguous sequence of bits, the number of which is implementation-defined. The least significant bit is called the low-order bit; the most significant bit is called the high-order bit. [recommend removal - standard comp sci term]

3.1.7

character
abstract member of a set of elements used for the organization, control, or representation of data and ideally when treated sequentially represents text

correctly rounded result
representation in the result format that is nearest in value, subject to the current rounding mode, to what the result would be given unlimited range and precision

3.1.8

class
a user-defined type declared with the class-key ‘class’ or ‘struct’

3.1.9

concrete

Recommend deletion, only used in 6.40 Templates and Generics

3.1.10

diagnostic message

informational message that is either an error or warning about an issue detected by the implementation

3.1.11

dynamic dispatch

[recommend removal]

3.1.12

encapsulation

recommend removal - all usages in the document satisfy standard English use of encapsulate

3.1.13

formal parameter

object declared as part of a function declaration or definition that acquires a value on entry to the function, or an identifier from the comma-separated list bounded by the parentheses immediately following the macro name in a function-like macro definition.

3.1.xx

friend

function or class that can access the private and protected members of a specific class

3.1.xx

hidden friend

friend function that is only declared within a class or class template definition and hence is only found by ADL

3.1.xx

Implementation

toolchain that is used to build and support the execution of the C++ program

3.1.15

implementation-defined behaviour

behaviour, for a well-formed program construct and correct data, that depends on the implementation and that each implementation documents

3.1.16
implementation-defined value

unspecified value where each implementation documents how the choice for the value is selected. [recommend removal - term not used other than in note(s)]

3.1.17

implementation limit

restriction imposed upon programs by the implementation.

3.1.18

indeterminate value

unspecified value or a trap representation

3.1.19
indeterminately sequenced

sequenced in a way that one of two evaluations will be executed before the other but in an unspecified order

3.1.20

Inheritance

TBD [recommend removal - well understood in OO languages]

3.1.21

language type

see block-structured language, comb-structured language (Non-responsive) [recommended removal - inadequate relevancy]

3.1.21

locale-specific behaviour

behaviour that depends on local conventions of nationality, culture, and language that each implementation documents

3.1.22

memory location

an object of scalar type or a maximal sequence of adjacent bit-fields all having nonzero width [recommend removal and move to 6.3]

3.1.23
multibyte character

sequence of one or more bytes representing a member of the extended character set of either the source or the execution environment.

Note: The extended character set is a superset of the basic character set. [Recommend removal - not used]

3.1.24
namespace
optionally-named entity that can contain scoped declarations of any kind of entity [ensure that this is put into clause 4 on general concepts]

3.1.25

object

region of data storage in the execution environment, the contents of which can represent values and that can be interpreted as having a particular type

3.1.26

overload

the use of the same symbol name to denote different entities

3.1.27

override

replacing the implementation of an inheritable function in a derived class

3.1.28

parameter

(rewrite) See actual argument, argument, formal parameter (Non-responsive, needs definition)

[Remove - obvious]

3.1.29

protected

visible only to the class itself, its derived classes, and friends

3.1.30

private

visible only to the class itself and friends

3.1.31

public

visible without restriction

3.1.33

recommended practice

specification that is strongly recommended as being in keeping with the intent of the language standard, but that may be impractical for some implementations

[recommend removal - not used]

3.1.34

runtime-constraint

a constraint imposed on an executing program [TBD - needs refinement] [continue from here - 7 August 2023]

3.1.35

single-byte character

bit representation that fits in a byte (binary representation?)

3.1.36

static

TBD

3.1.37

STL

TBD

3.1.38

standard library

[TBD]

3.1.39

template

TBD

3.1.40

trap representation

object representation that need not represent a value of the object type

3.1.41

undefined behaviour

use of a non-portable or erroneous program construct or of erroneous data, for which the language standard imposes no requirements

Note: Undefined behaviour ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). An example of undefined behaviour is the behaviour on signed integer overflow.

3.1.42

unspecified behaviour

use of an unspecified value, or other behaviour where the language standard provides two or more possibilities and imposes no further requirements on which is chosen in any instance

Note: An example of unspecified behaviour is the order in which the arguments to a function are evaluated.

3.1.43

unspecified value

valid value of the relevant type where the language standard imposes no requirements on which value is chosen in any instance

Note: An unspecified value cannot be a trap representation.

3.1.44

value

precise meaning of the contents of an object when interpreted as having a specific type (specific type or specified type?)

Note: See implementation-defined value, indeterminate value, unspecified value, trap representation

3.1.45

virtual

TBD

3.1.45

wide character

bit representation capable of representing any character in the current locale

4. Language concepts

This clause requires a rewrite. See C++ Core Guidelines CPL for a good explanation of the differences.

4.1 Overview

THIS REQUIRES MORE WORK

Define unchecked (random) access in clause 3 or explain the C++ approach. Likely needs a new subclause. Indexing into raw memory is a random access with no checking. In the STL, the [] operator performs random access without checking. The member function at() was added to provide range checking; it throws an exception if the check fails.
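The difference between unchecked and checked access can be sketched as follows. This is a minimal illustration rather than recommended production code; the helper name try_get is hypothetical and does not appear elsewhere in this document.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: copies element i of v into out and returns true,
// or returns false when i is out of range.
// at() performs the range check and throws std::out_of_range on failure;
// v[i] would perform no check, and an out-of-range index would be
// undefined behaviour.
bool try_get(const std::vector<int>& v, std::size_t i, int& out)
{
    try {
        out = v.at(i);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

A caller can then test the returned flag instead of relying on an index that is assumed to be valid.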

C++ is a strongly- and statically-typed language: all variables and expressions must have a type. C++ also permits implicit and explicit conversions between types.

C++ has a rich type system with many nuances. In addition to the C base types (int, long, float, double, char, and arrays with their C-style vulnerabilities), C++ provides the following:

Many vulnerabilities can be mitigated more easily by using library facilities rather than the base language types (e.g. std::string rather than char*).

Narrowly tailored number-like class types, such as time_point and duration, improve safety by providing only safe and appropriate operations. User-defined types tailored to a particular use case can provide additional safety.
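As a sketch of how such tailored types restrict the available operations, the following uses std::chrono durations. The function names total_time and whole_minutes are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: adds two legs of a journey given in different units.
// The conversion from minutes to seconds is lossless, so it is implicit.
std::chrono::seconds total_time(std::chrono::minutes leg1,
                                std::chrono::seconds leg2)
{
    return leg1 + leg2;
}

// A lossy conversion (seconds to minutes) must be requested explicitly;
// duration_cast truncates towards zero.
std::chrono::minutes whole_minutes(std::chrono::seconds s)
{
    return std::chrono::duration_cast<std::chrono::minutes>(s);
}

// The following would be rejected at compile time:
//   std::chrono::minutes m = std::chrono::seconds{90};   // lossy, not implicit
//   auto t = std::chrono::steady_clock::now()
//          + std::chrono::steady_clock::now();           // time_point + time_point
```

Mixing units is handled by the type system, and meaningless operations such as adding two points in time do not compile.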

C++ was initially defined as a syntactic superset of the C programming language: adding object oriented features such as classes, encapsulation, dynamic dispatch, namespaces and templates. It was a “syntactic superset” because whilst there is a core of C++ that is syntactically identical to C, it has always been the case that there are subtle semantic differences between the two, for example:

struct S1 {
   struct S2 {...} m1;
   ...
};
struct S2 v1; /* legal in C, not C++ */
S1::S2 v2;    // legal in C++, not C

Subsequently, the two languages have diverged, both adding features not present in the other. Notwithstanding that, there is still a significant syntactic and semantic overlap between C and C++, so the starting point for this report has been the equivalent for C. However, in many cases, the additional features of C++ provide mechanisms for avoiding the vulnerabilities inherited from C, and these are reflected in the following sections.

Include discussions of Object orientation, static, and const, scoped enumerations

4.2 Type System

Mix-ins (6.20, 6.40, 6.41)

Inheritance

C++ supports user-defined class types with one or more base classes. Dynamic polymorphism requires public bases defining virtual member functions.

C++ provides access specifiers that can be applied to a base class to restrict the visibility of its inherited members in the derived class and its subclasses.

A member declared in a class hides all members of the same name in any of its base classes. Such names can be reintroduced by the using declaration. For example:

    struct Base {
       int f(int i);
    };

    struct Derived : public Base {
       // using Base::f;
       int f(char c);
       int g() {
          return f(123); // Surprise, f(char) is called
       }
    };

If the using declaration is uncommented above, then Base::f(int) is called.

For multiple inheritance users can request virtual inheritance, which causes data members inherited from the same ancestor class along multiple inheritance paths to be present only once; otherwise the data members are replicated and referring to them is ambiguous unless qualified by the name of the base class from which they are inherited. When members of equal name are inherited from multiple sources, C++ rejects an unqualified use of the common name as ambiguous as long as they are not hidden.
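The effect of virtual inheritance on replicated data members can be sketched as follows. All class and function names are hypothetical and serve only to illustrate the paragraph above.

```cpp
#include <cassert>

struct Device { int id = 0; };

// Non-virtual inheritance: each inheritance path carries its own Device.
struct Reader1 : Device {};
struct Writer1 : Device {};
struct ReadWriter1 : Reader1, Writer1 {};

// Virtual inheritance: the Device subobject is shared by both paths.
struct Reader2 : virtual Device {};
struct Writer2 : virtual Device {};
struct ReadWriter2 : Reader2, Writer2 {};

bool replicated_ids_differ()
{
    ReadWriter1 rw;
    // rw.id = 1;          // ill-formed: ambiguous, two Device subobjects
    rw.Reader1::id = 1;    // qualified by the base class: first copy
    rw.Writer1::id = 2;    // second, independent copy
    return rw.Reader1::id != rw.Writer1::id;
}

int shared_id_sum()
{
    ReadWriter2 rw;
    rw.id = 7;             // unambiguous: only one Device subobject exists
    return rw.Reader2::id + rw.Writer2::id;  // both name the same member
}
```

With non-virtual bases the two copies of id evolve independently, which is rarely what the author of a diamond-shaped hierarchy intends.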

THIS REQUIRES MORE WORK

A particular area that is often misunderstood is integral promotion. It can be confusing because promotion can change the type of an operand from an unsigned type to a signed type. For expressions formed with operands of unscoped enumeration type or integral types with a conversion rank smaller than int, integral promotion occurs before further implicit conversions happen. Integral promotion can convert operands of such unsigned types to the signed type int. Consequently, undefined behavior can occur due to signed integer arithmetic overflow even when the operands are of an unsigned type. Assume, for example, that:

- signed and unsigned short occupy 16 bits; and
- signed and unsigned int occupy 32 bits.

Then the following code causes undefined behavior:

   unsigned short const x = 0xfff0;
   x * x;                           // signed integer overflow, result will not fit in signed 32 bit int

Note: C++ also uses the term promotion for a subset of the conversions that apply to integral and floating-point types.
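One possible way to avoid the overflow shown above is to convert an operand to unsigned int before multiplying, so that the arithmetic is performed in an unsigned type. The sketch below assumes, as in the example, that int occupies 32 bits; the function name square_u16 is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Assumption: int is wider than 16 bits, so both operands are promoted to int.
static_assert(
    std::is_same<decltype(std::uint16_t{} * std::uint16_t{}), int>::value,
    "operands of type uint16_t are promoted to int before multiplication");

// Hypothetical helper: squares a 16-bit unsigned value.
// Converting one operand to unsigned int makes the multiplication unsigned,
// so an out-of-range result would wrap rather than be undefined.
unsigned int square_u16(std::uint16_t x)
{
    return static_cast<unsigned int>(x) * x;
}
```

With a 32-bit unsigned int, the square of every 16-bit value fits in the result type.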

Implicit, i.e., automatic, conversions to a type T can be performed, for example, in the following situations:

  1. If the declaration, T t=e;, is well-formed for some expression, e, and some invented variable, t [C++17, Clause 7 [conv], para 3];

  2. In expressions involving operands of operators (e.g., +, -, *, /, etc.) subject to the requirements of each operator's operands [C++20, Clause 7.3 [conv], para 2.1]. For example, the expression, 5 + 6.5, has operands of type int and double. Per language rules, the integer operand will be implicitly converted to double, i.e., the expression becomes double(5) + 6.5, i.e., 5.0 + 6.5;

  3. In boolean contexts, such as the condition of an if statement or an iteration statement: the implicit conversion will be to bool;

  4. In the expression of a switch statement: the implicit conversion will be to an integral type [C++17, Clause 7 [conv], para 2.3];

  5. In an expression that initializes an object (e.g., an argument to a function call, the expression in a return statement) [C++17, Clause 7 [conv], para 2.4];

  6. When a non-explicit class/struct/union constructor can be invoked on an object resulting in some desired type, T, from initial objects passed to the constructor; and

  7. When a conversion operator has not been declared explicit, it can be implicitly invoked on an object resulting in some desired type, T, from an initial type.
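The last two situations can be suppressed by declaring constructors and conversion operators explicit. The following sketch contrasts the two choices; the types Meters and Celsius and the function names are hypothetical.

```cpp
#include <cassert>

// Conversions in both directions must be requested explicitly.
struct Meters {
    explicit Meters(double v) : value(v) {}
    explicit operator double() const { return value; }
    double value;
};

// For contrast: conversions in both directions happen implicitly.
struct Celsius {
    Celsius(double v) : value(v) {}
    operator double() const { return value; }
    double value;
};

double double_celsius(Celsius c)
{
    return 2.0 * c;                        // c is implicitly converted to double
}

double double_meters(Meters m)
{
    return 2.0 * static_cast<double>(m);   // the conversion must be written out
}

// double_celsius(10.5) compiles: 10.5 is implicitly converted to Celsius.
// double_meters(10.5) would not compile: the constructor is explicit.
```

Requiring the conversion to be spelled out makes an unintended change of type visible at the call site.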

4.3 Symbol Lookup and Overload Resolution

THIS REQUIRES MORE WORK

scopes, names, ADL, using

Add to clause 4 “Language concepts” an issue on C++ symbol lookup issues considering the following:

  1. Minimize the set of names that are available to avoid referring to an item that was not the intended target

  2. Hidden friends

  3. ADL (argument dependent lookup)

  4. Lack of this-> in class templates

  5. Minimizing lexical scope and visibility

  6. Minimizing names in global scopes

  7. Minimizing use of “using” directives


namespace NS {
template <typename T>
struct A
{
   template <typename S, typename Q> friend void add (S, Q);
};
}

struct B {
    B(int);
    B (NS::A<int> const &);
};

B & add(B const & lhs, B const & rhs);

void bar (NS::A<int> & a, B const & b)
{
    add(a, b);
}

A C++ program may span multiple scopes, and when a function is called, the compiler may have many candidates to select from to resolve the actual function that will be called. The set of names found for a function depends upon whether the name is qualified or unqualified.

Unqualified lookup begins the search in the current scope and works its way out to the global scope and stops when it finds the first matching name. The search will include:

- Names directly visible;
- (Non-dependent) base classes if the scope is for a member;
- Names introduced by the using declaration;
- Names introduced by the using directive; and
- Members of inline namespaces.

After these steps, Argument-Dependent Lookup (ADL) occurs for unqualified function calls. In ADL, the namespaces associated with the types of the function’s arguments are also searched. This search also includes friends that would not normally be found. ADL never finds member functions.
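The behaviour of ADL described above, including the lookup of a hidden friend, can be sketched as follows. The namespace geometry and all names in it are hypothetical.

```cpp
#include <cassert>
#include <string>

namespace geometry {

struct Point { int x; int y; };

// Found by ADL for unqualified calls whose argument is a geometry::Point.
std::string describe(const Point&) { return "geometry::describe"; }

class Tag {
    // Hidden friend: declared only inside the class, so found only by ADL.
    friend std::string label(const Tag&) { return "hidden friend"; }
};

}  // namespace geometry

std::string call_describe()
{
    geometry::Point p{1, 2};
    // No qualification and no using declaration: namespace geometry is
    // searched because the argument type belongs to it.
    return describe(p);
}

std::string call_label()
{
    geometry::Tag t;
    return label(t);   // geometry::label(t) would not compile
}
```

Because ADL silently widens the set of candidate functions, a call can resolve to a function that the author of the call never had in view.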

Unqualified name lookup in templates has two phases:

- When the template is initially parsed, names are looked up as above, excluding dependent bases; and
- When the template is instantiated, ADL is performed for function names, but ADL does not find member functions.
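A consequence of excluding dependent bases in the first phase is that an unqualified name can bind to a namespace-scope entity instead of an inherited member. The sketch below illustrates this with hypothetical names and assumes an implementation that performs two-phase lookup as specified.

```cpp
#include <cassert>

int value() { return 1; }   // namespace-scope function

template <typename T>
struct Base {
    int value() const { return 2; }
};

template <typename T>
struct Derived : Base<T> {
    // Base<T> is a dependent base and is not searched when the template
    // is parsed, so the unqualified name binds to ::value.
    int unqualified() const { return value(); }

    // this-> makes the name dependent, so lookup is deferred until
    // instantiation and finds Base<T>::value.
    int qualified() const { return this->value(); }
};
```

Writing this-> (or qualifying the name with the base class) documents that the inherited member is the intended target.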

A qualified name lookup takes place when:

- the name includes its scope (e.g. a::foo); or
- the name is used in a class member access expression (this->foo).

Qualified lookup only searches:

- Names directly visible;
- (Non-dependent) base classes if the scope is for a member;
- Names introduced by the using declaration;
- Names introduced by the using directive;
- Members of inline namespaces; and
- Names that can be resolved when a template is instantiated.

Overload resolution

4.4 Object Lifetime

Currently appears in: 6.11.PointerTypeConversion-HFC.md 6.13.NULLPointerDereference-XYH.md 6.14.DanglingReferenceToHeap-XYK.md 6.22.InitializationOfVariables-LAV.md 6.33.DanglingReferencesToStackFrames-DCM.md 6.38.DeepVsShallowCopying-YAN.md 6.39.MemoryLeakAndHeapFragmentation-XYL.md 6.61.ConcurrentDataAccess-CGX.md 6.63.ProtocolLockErrors-CGM.md 6.65.ModifyingConstants-UJO.md

dangling.

4.5 Initialization

C++ provides a number of ways that an object can be initialized:

- Value initialization, e.g. std::string s{};
- Direct initialization, e.g. std::string s("hello");
- Copy initialization, e.g. std::string s = "hello";
- List initialization, e.g. std::string s{'a', 'b', 'c'};
- Aggregate initialization, e.g. char a[3] = {'a', 'b'};
- Reference initialization, e.g. char& c = a[0];
- Zero initialization
- Default initialization

copy, move, default constructor
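The forms listed above can be exercised in a short sketch. The function names are hypothetical; the last two functions illustrate that parentheses and braces can select different constructors for the same arguments.

```cpp
#include <cassert>
#include <string>
#include <vector>

bool initialization_forms_agree()
{
    std::string value_init{};                // value initialization: empty
    std::string direct_init("hello");        // direct initialization
    std::string copy_init = "hello";         // copy initialization
    std::string list_init{'a', 'b', 'c'};    // list initialization: "abc"
    char aggregate[3] = {'a', 'b'};          // remaining element is '\0'
    char& ref = aggregate[0];                // reference initialization

    return value_init.empty()
        && direct_init == copy_init
        && list_init == "abc"
        && aggregate[2] == '\0'
        && &ref == &aggregate[0];
}

// Parentheses select the (count, value) constructor: two elements, both 7.
std::vector<int> paren_init() { return std::vector<int>(2, 7); }

// Braces select the initializer-list constructor: the elements 2 and 7.
std::vector<int> brace_init() { return std::vector<int>{2, 7}; }
```

The difference between the last two forms is a common source of surprise when code is converted from one initialization syntax to the other.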

C++ has many forms of initialization, which generally guarantee that subsequent access to the declared object will be well-defined. However, C++ does not always guarantee that the object contains a legal value unless the programmer explicitly initializes it with a value. See 6.22 Initialization of Variables [LAV].

The kind of initialization that happens in C++ when an explicit initializer is not used depends on the context of the declaration.

    int i; // zero-initialized in the global namespace

    void foo()
    {
        int i;    // default-initialized, indeterminate value
        int j{};  // value-initialized to zero
    }
 

Non-local variables with static storage duration that are dynamically initialized can cause undefined behavior if their initialization depends on other such variables that have not yet been initialized; see 6.22 Initialization of Variables [LAV].
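One commonly used mitigation is to replace the non-local variable that others depend on with a function that returns a reference to a local static object, which is initialized on first use. The following is a minimal sketch under that assumption; app_name and banner are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Hypothetical value needed while other non-local variables are being
// dynamically initialized.
const std::string& app_name()
{
    static const std::string name = "demo";  // initialized on the first call
    return name;
}

// The initializer does not depend on the initialization order of non-local
// variables, because app_name() initializes its local static on demand.
const std::string banner = "[" + app_name() + "]";
```

This idiom addresses initialization order only; the order of destruction at program termination still needs separate consideration.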

The lifetime of most objects of a class type begins when the object’s constructor has completed. In some situations binding a reference to a temporary will extend the lifetime of the temporary. See 4.4 Object Lifetime and 6.33 Dangling References to Stack Frames [DCM].

[In any case, an object should not be accessed until its initialization is complete (including concurrency???). Richard C thinks that https://eel.is/c++draft/basic.start.dynamic#5 means that there is no danger of one thread accessing a dynamically initialized global variable before the initialization has taken place - is that the case?]
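The lifetime extension of a temporary bound to a reference, mentioned above, can be sketched as follows. The functions make_name and extended_length are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical factory returning a temporary by value.
std::string make_name() { return "temporary"; }

std::size_t extended_length()
{
    // The temporary returned by make_name() lives as long as the reference.
    const std::string& name = make_name();
    return name.size();
}

// No extension applies when a reference to a temporary is returned: the
// temporary is destroyed at the end of the return statement and the caller
// would receive a dangling reference.
//   const std::string& dangling() { return make_name(); }
```

The extension applies only to the reference that is bound directly to the temporary, not to references copied from it.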

When a function is called, each parameter is initialized with its corresponding argument. The initialization of a parameter, including every associated value computation and side effect, is indeterminately sequenced with respect to that of any other parameter; see 6.24 Side-effects and Order of Evaluation of Operands [SAM]. On the other hand, the value computations and side effects of the initializer-clauses in a braced initializer list are sequenced in the order in which they appear.
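The difference between the two orderings can be sketched as follows; the names next_id, Pair and make_in_order are hypothetical.

```cpp
#include <cassert>

int next_id(int& counter) { return ++counter; }

struct Pair { int first; int second; };

Pair make_in_order()
{
    int counter = 0;
    // Braced initializer list: the clauses are evaluated left to right,
    // so first is 1 and second is 2.
    return Pair{next_id(counter), next_id(counter)};
}

int subtract(int a, int b) { return a - b; }

// By contrast, in
//   subtract(next_id(counter), next_id(counter))
// the two arguments are indeterminately sequenced, so the result can be
// either -1 or 1 depending on the implementation.
```

Code whose result depends on the order in which function arguments are evaluated is therefore not portable.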

There are many ways for a user to construct, or initialize, an object.

4.6 Undefined Behavior

TBD

4.7 Error Handling

The C++ language and standard library provide several mechanisms for error handling:

Be aware that many parts of the standard library specify preconditions, and undefined behavior results if those are not met. For example, std::vector<int>::front() requires that the vector is non-empty when it is called.
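A caller can establish such a precondition before making the call. The following minimal sketch uses a hypothetical helper, first_or, that never calls front() on an empty vector.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: returns the first element of v, or fallback when v
// is empty, so that the precondition of front() always holds.
int first_or(const std::vector<int>& v, int fallback)
{
    return v.empty() ? fallback : v.front();
}
```

Whether a fallback value, an error return or an exception is the appropriate response to an empty vector depends on the application.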

Exceptions allow for errors to be propagated up the call chain. Even though the standard library provides a type hierarchy derived from std::exception, any copyable type can be thrown. Throwing an exception due to a detected error situation allows the error to be handled at an appropriate level in a corresponding catch block. As the exception propagates to its handler, local objects are destroyed appropriately in reverse order of their construction; this mechanism is known as stack unwinding. A search for a matching handler stops at

An exception propagated from constructors of non-local variables and destructors of variables with static storage duration can never have a matching handler.

Failing to provide a matching handler on the call chain for an exception thrown causes a call to std::terminate() and the program terminates.

When calling non-returning program termination functions like abort(), std::terminate(), or exit(), the program terminates without stack unwinding.
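The order in which local objects are destroyed during stack unwinding, as described above, can be observed with a small sketch. The type Tracer and the function names are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical tracer: appends its name to a log when it is destroyed.
struct Tracer {
    Tracer(std::string& log, char name) : log_(log), name_(name) {}
    ~Tracer() { log_ += name_; }
    std::string& log_;
    char name_;
};

void fail(std::string& log)
{
    Tracer a(log, 'a');
    Tracer b(log, 'b');
    throw std::runtime_error("detected error");
}

std::string unwind_order()
{
    std::string log;
    try {
        fail(log);
    } catch (const std::exception&) {
        log += '!';   // the handler runs after unwinding has completed
    }
    return log;       // b is destroyed before a, then the handler runs
}
```

If fail() called abort() or exit() instead of throwing, neither destructor would run.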

#TODO: shorten again by moving to 6.36? talk about coroutines… see what Paul finds out

An exception propagating out of a coroutine causes the coroutine to end in an unresumable state and the exception is not further propagated.

In addition to that list, an operating system might use the “signal” mechanism to notify a running C++ program. A signal-handler can be defined to act asynchronously upon a sent signal that isn’t ignored.

4.8 Concurrency

C++ includes concurrency within the language, expressed by threads and tasks. Threads are sequences of execution that can be executed concurrently with the entity (thread) that created them, and with each other. There are good reasons to use threads in a C++ program:

  1. The running program, or a running thread, blocks for real-world events, such as awaiting input, awaiting completion of a system-level event, or communicating with non-local systems.

  2. Threading permits other parts of the program to continue execution even while one or more parts are blocked, or lets a program await and respond to sets of events in the order in which they are received.

  3. Threading lets the program make effective use of multiple cores, providing significantly more computing power to a program.

Threads are initiated by calling the constructor of std::thread. TODO: more from 6.59

A thread in C++ runs until completion, either a normal completion or as the result of an unhandled exception. There is no mechanism in the language to terminate another thread.

C++ threads use a fork-join model. This means that the initiating thread waits for the completion of the initiated thread at the join point; without a join, the initiating thread has no indication of when the created thread completes.

The thread is then initialized and begins execution on its sequence of instructions. A thread can be joined, i.e. the joining thread awaits the completion of the joined thread, or a thread can be detached.

Threads share data and events via atomic variables, condition variables, futures, and mutexes.

Threads terminate when they complete the execution of the function that was named at thread initiation.

C++ also has the notion of lightweight concurrency in the form of tasks. These tasks are created by constructing a std::packaged_task from a function, lambda expression, bind expression or another function object. It is expected that the result of a task execution is collected at the end of that execution by obtaining a std::future through the task's get_future() member function and waiting on that future for completion.
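
The following sketch illustrates one way a task result might be collected, assuming the task is executed on a separate std::thread that is joined afterwards. The function name run_task_and_collect is hypothetical.

```cpp
#include <cassert>
#include <future>
#include <thread>
#include <utility>

// Runs a small computation as a packaged task on a separate thread and
// collects the result through the associated future.
int run_task_and_collect(int a, int b) {
    std::packaged_task<int(int, int)> task([](int x, int y) { return x + y; });
    std::future<int> result = task.get_future(); // obtain the future before moving the task
    std::thread worker(std::move(task), a, b);   // the task executes on the new thread
    int const value = result.get();              // blocks until the task has completed
    worker.join();                               // join, so the thread object is not destroyed while joinable
    return value;
}
```

Obtaining the future before the task is moved into the thread matters, because the moved-from task no longer owns the shared state.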

In addition, C++ programs can interact with other programs executing in a system using operating system-level calls to initiate, schedule, communicate and destroy/terminate itself or others.

There are a number of significant vulnerabilities associated with concurrency, which are described in clauses 6.59 through 6.63 of this document.

5. Avoiding programming language vulnerabilities in C++

In addition to the generic programming rules from ISO/IEC TR 24772-1 clause 5.4, additional rules from this section apply specifically to the C++ programming language. The recommendations of this section are restatements of recommendations from clause 6, but represent ones stated frequently, or that are considered particularly noteworthy by the authors. Clause 6 of this document contains the full set of recommendations, as well as explanations of the problems that led to the recommendations made.

Every guidance provided in this section, and in the corresponding Part section, is supported by material in Clause 6 of this document, as well as other important recommendations.

TBD

Need to consider C++-11, 14 and 17.

6. Specific Guidance for C++ Vulnerabilities

6.1 General

This clause contains specific advice for C++ about the possible presence of vulnerabilities as described in TR 24772-1, and provides specific guidance on how to avoid them in C++ code. This section mirrors TR 24772-1 clause 6 in that the vulnerability “Type System [IHN]” is found in 6.2 of TR 24772–1, and C++ specific guidance is found in clause 6.2 and subclauses in this TR.

As part of its design (and with few exceptions), C++ has a common subset with the complete C language. For code portions written in the common subset, the vulnerabilities described and the advice given in ISO/IEC TR 24772-3:2020, Part 3 – Vulnerability descriptions for the programming language C, apply, except when this document provides refined advice. The following subclauses usually do not further acknowledge the issues from the subset since those have been adequately addressed in the referenced document. However, C++ provides mechanisms to mitigate many of the problems that arise. Please refer to the respective clauses of this document for these mitigations and related guidelines.

6.2 Type System [IHN]

6.2.1 Applicability to language

C++ is a statically typed language. In some ways, C++ is both strongly and weakly typed, as it requires all objects/expressions to have a type, but allows for some implicit conversions of values from one type to another type. The following cases require special consideration:

double fluxcompensation(double flux, bool compensate){
  if (flux) { // double to bool conversion
    double delta = compute_delta();
    double const compensate_v = 1.4;
    return flux + delta * compensate; // bool to double conversion
  } 
  return 1.;
}

Note that type aliases (using, typedef) do not define a new type, only a different name for the aliased type, and thus do not incur any conversion between the alias and the aliased type.

Instead of using the built-in arithmetic types or generic library types such as std::string for domain values, C++ allows wrapping them in user-defined class types as so-called strong types. For integral values, enum class types can also be used. Strong types provide only those operator overloads and conversions for each such type that make sense in the application domain. User-defined-literal operators help with providing constants of appropriate strong types. Such strong types provide full control of conversions and operations available, avoiding semantically unsound operations that the built-in or other generic types might provide.

For example, a very simple strong type representation of temperature values can be implemented as follows:

struct Celsius {
    double value;
};
struct Fahrenheit {
    double value;
};
Fahrenheit convert_to_fahrenheit(Celsius c){
    return { 9*c.value/5+32};
}
//...
Celsius wrong = convert_to_fahrenheit({20.}); // doesn't compile

In a realistic scenario using a library for strong type support eases the definition and use of strong types.
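
As a sketch of how user-defined-literal operators and a restricted set of operators might be combined with such strong types, consider the following. The literal suffixes _degC and _degF and the function name to_fahrenheit are assumptions chosen for illustration only.

```cpp
#include <cassert>

struct Celsius    { double value; };
struct Fahrenheit { double value; };

// Hypothetical literal suffixes; user-defined suffixes must start with an underscore.
constexpr Celsius operator""_degC(long double v) { return Celsius{static_cast<double>(v)}; }
constexpr Fahrenheit operator""_degF(long double v) { return Fahrenheit{static_cast<double>(v)}; }

constexpr Fahrenheit to_fahrenheit(Celsius c) { return Fahrenheit{9 * c.value / 5 + 32}; }

// Only same-unit comparison is offered; comparing Celsius with Fahrenheit does not compile.
constexpr bool operator<(Celsius lhs, Celsius rhs) { return lhs.value < rhs.value; }
```

With these definitions, an expression such as 20.0_degC < 68.0_degF is rejected by the compiler, because no operator< accepting mixed units exists.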

6.2.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

- Use the avoidance mechanisms of ISO/IEC TR 24772-1:2019, 6.2.5, and the guidance provided in the related sections of this document.

6.3 Bit Representations [STR]

6.3.1 Applicability to language

The vulnerability described in TR 24772-1 clause 6.3 is applicable to C++. The “endianness” of integer types and the packing of bit fields are implementation-defined properties and are not portable.

The standard library type std::endian allows code to check the endianness of a platform portably and to use this information to operate on the individual bytes of a machine word in the correct order.

There is no portable mapping from bitfields in a struct to individual bits in a machine word. Therefore, C++ bitfields should not be used to directly map to bits in hardware, even though the compiler provides suitable mapping and manipulation operations. A further complication is that accessing a bitfield can often not easily be performed atomically, because the non-participating bits of a memory location need to be read before the relevant bits can be mutated through masking, and the whole memory location has to be written again. It is possible to simulate bitfields with a defined layout through library class types that implement the required masking operations.

For individual bits, std::bitset<N> and std::vector<bool> can provide suitable representations at run time, but they do not support a direct mapping to machine words. Be aware, however, that std::vector<bool> does not in general behave like other specializations of std::vector, which can cause generic code to misbehave.

C++ provides a rich set of bitwise operators that can be used to address the issues of bit manipulation in a portable way. However, the shift operation can result in undefined behavior when shifting by a negative or too large value, or when shifting a signed operand. It is advisable to use bit operations only on appropriate unsigned integral types with a known width, while being careful of integral promotion, which might promote a small unsigned operand type to a signed integer type. When bitwise memory operations are needed, it is good practice to encapsulate such operations in a class type’s member functions.
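
The effect of integral promotion on small unsigned types, and one possible guard against an out-of-range shift count, can be sketched as follows. The function names invert_bits and shift_left are hypothetical, and returning 0 for an oversized shift count is merely one conceivable policy.

```cpp
#include <cassert>
#include <cstdint>

// ~v is computed in int because std::uint8_t is promoted first; the cast
// brings the result back to the intended 8-bit unsigned width.
constexpr std::uint8_t invert_bits(std::uint8_t v) {
    return static_cast<std::uint8_t>(~v);
}

// Hypothetical checked shift: a count of 32 or more would be undefined
// behavior for a 32-bit operand, so it is mapped to 0 here instead.
constexpr std::uint32_t shift_left(std::uint32_t v, unsigned count) {
    return count < 32u ? v << count : 0u;
}
```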

For representing the individual bitmask values employed in bit operations, it is advisable to put the corresponding named constants in an enumeration type with the appropriate underlying type for easier recall.

While a bit shift of an integral value can be viewed as a multiplication or division by a power of two, it should not be used in arithmetic expressions to implement such an operation. Compilers will automatically implement such a multiplication in the most efficient way, so there is no need to obfuscate multiplication and division as shift operations.

Except for specific situations (trivial types), objects of class type cannot be assumed to have a layout appropriate for manipulation on a byte or bitwise level. Depending on the size and alignment of its data members, a class type might have padding bytes between members. The absence of padding in a trivially copyable type T can be checked with static_assert(std::has_unique_object_representations_v<T>).
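
A small sketch of such a check is shown below. The struct names Packed and Padded are hypothetical. Whether Padded actually contains padding depends on the platform ABI, so its trait value is only stored here rather than asserted.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

struct Packed {            // two members of equal size: no padding expected
    std::uint32_t a;
    std::uint32_t b;
};

struct Padded {            // typical ABIs insert padding after 'tag'
    std::uint8_t  tag;
    std::uint32_t value;
};

static_assert(std::is_trivially_copyable_v<Packed>);
static_assert(std::has_unique_object_representations_v<Packed>,
              "Packed can be compared or hashed bytewise");

// Stored as a value rather than asserted, because padding is ABI-specific.
constexpr bool padded_is_bytewise_safe =
    std::has_unique_object_representations_v<Padded>;
```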

std::bit_cast can be used to reinterpret suitably sized trivial types on a bitwise level, e.g., for accessing the binary representation of a floating point value. However, for types with padding bits that do not participate in an object’s value representation, the corresponding bits in a bit_cast result have indeterminate values. If those bits are used to compare for equality, with the function memcmp for instance, the padding bits may differ and cause false negatives.

Malicious code could use such padding bits as a secret channel which might be accessed through copying the underlying bytes.

See C++ Core Guidelines ES.101: Use unsigned types for bit manipulation.

6.3.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.4 Floating-point Arithmetic [PLF]

6.4.1 Applicability to language

The vulnerability as described in ISO/IEC TR 24772-1 clause 6.4 is applicable to C++.

The C++ standard assumes IEC 60559 if std::numeric_limits<T>::is_iec559 is true for the types in use. In the absence of this, C++ makes few guarantees about the behaviour of floating point numbers. In particular, std::less is not a total order, and std::equal_to is not equivalent to substitutability (NaNs compare unequal to themselves, but neither less nor greater, and negative zero compares equal to positive zero).

Sorting floating point numbers with the built-in operators violates the preconditions of sorting predicates in the presence of NaN values and may raise floating point errors. The default sorting predicate std::less is subject to this precondition violation, resulting in undefined behavior when sorting a range of floating point values that contains NaNs.
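
One conceivable way to avoid this precondition violation is to separate NaN values from the range before sorting. The following is a sketch under that assumption; the function name sort_nans_last is hypothetical, and placing NaNs at the end is an arbitrary choice.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Moves all NaN values to the end and sorts only the remaining values, so
// that std::sort never compares a NaN with the default predicate.
void sort_nans_last(std::vector<double>& values) {
    auto const first_nan = std::partition(values.begin(), values.end(),
                                          [](double v) { return !std::isnan(v); });
    std::sort(values.begin(), first_nan);
}
```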

6.4.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.5 Enumerator Issues [CCB]

6.5.1 Applicability to language

The vulnerability documented in ISO/IEC TR 24772-1 clause 6.5 applies to C++. C++ provides scoped (enum class) and unscoped (enum) enumeration types, where an underlying integral type can be specified. For enumeration types with a fixed underlying type, all values of the underlying integral type are valid. Unscoped enumeration types without an enum-base have a non-fixed underlying type that only guarantees values in the range of the provided enumerators are valid. The latter can cause non-representable values to be assigned to a variable of such an unscoped enumeration type.

Assignment to a variable of an enum type requires the assigned value to be of the same enum type. The value can either be an enumerator of that enum type or be produced by a cast (see 6.6 Conversion Errors [FLC]).

C++ allows implicit conversion of an unscoped enum by integral promotion.

TODO continue here

enum Color : short {red, green, blue}; // unscoped enum with a fixed underlying type
short i = red; // implicit conversion
Color g { green + blue }; // integer result fits into short
Color h {42}; // OK non-narrowing

List initialization of enum types with fixed underlying type implicitly does a static cast.

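C++ does not support implicit conversion of a scoped enum to an int; hence, operations such as ++ and +, and the use of scoped enums as array indices, require explicit definitions or casts. Comparison of two values of the same scoped enumeration type, for example with <, is available without further definitions.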

enum class Color : short {red, green, blue};
short i = Color::red; // error -- no implicit conversion
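
The explicit definitions that a scoped enumeration requires might look like the following sketch. The names to_underlying and index_of are hypothetical helpers, and the wrap-around behaviour of operator++ is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

enum class Color : short { red, green, blue };

// Hypothetical helper (C++23 offers std::to_underlying for the same purpose).
constexpr auto to_underlying(Color c) {
    return static_cast<std::underlying_type_t<Color>>(c);
}

// Explicitly defined increment that wraps around after the last enumerator,
// so that no value outside the enumerators is ever produced.
constexpr Color& operator++(Color& c) {
    c = (c == Color::blue) ? Color::red
                           : static_cast<Color>(to_underlying(c) + 1);
    return c;
}

// Explicit conversion for use as an array index.
constexpr std::size_t index_of(Color c) {
    return static_cast<std::size_t>(to_underlying(c));
}
```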

Where unscoped enums are used as array indexes and do not have a user-specified mapping to an underlying representation, there will be “holes” as documented in TR24772-1 clause 6.6.

Note that values of unscoped enumeration types implicitly convert to an integral type by integral promotion and can therefore be used as the index of an array without a cast, with all of the issues described in TR 24772-1 clause 6.5.

From C++ 2017 forward, casting a value to an enumeration type is undefined behavior unless the source value is within the range of values of the enumeration type. See CERT INT50-CPP.

6.5.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.6 Conversion Errors [FLC]

6.6.1 Applicability to language

The vulnerability as documented in ISO/IEC TR 24772-1 clause 6.6 applies to C++. C++ includes some of the conversion mechanisms of C documented in TR 24772-3 clause 6.6.1; however, C++ type conversion mechanisms differ from the mechanisms of C, as documented in ISO/IEC 14882 Annex C. This subclause highlights differences where C++ provides mitigations of potential vulnerabilities found in C.

In C++, some conversions are explicit while others are implicit. Conversions can change the size of a type, whether or not the type is signed, and possibly other properties of the type. A narrowing conversion is one where the target type cannot represent all the values of the original type. Many errors are associated with implicit conversions. For a comprehensive overview see clause 7.3 [conv] of [C++20].

Explicit conversions use one of the mechanisms provided by C++ through a

In C++, a C-style cast is defined in terms of the C++ cast operators const_cast, static_cast, and reinterpret_cast. In some cases, it is unspecified which cast is used, for example when a cast operation involves an incomplete type, a reinterpret_cast may be used for the conversion which can produce an incorrect result.

Unlike C++'s other cast notations, dynamic_cast relies on run-time type information generated by the compiler to ensure the requested conversion is valid. If it is not valid, nullptr is returned for pointer types, whereas for reference types an exception of type std::bad_cast is thrown [C++17, Clause 8.2.7 [expr.dynamic.cast]]. Thus, dynamic_cast is safer to use when converting down a hierarchy where the base class has virtual member functions (see Pointer Type Conversions [HFC] and Polymorphic Variables [BKK]).
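
The two failure modes of dynamic_cast can be sketched as follows. The class names Shape, Circle and Square and the functions is_circle and radius_of are hypothetical.

```cpp
#include <cassert>
#include <typeinfo>

struct Shape  { virtual ~Shape() = default; };
struct Circle : Shape { double radius = 1.0; };
struct Square : Shape { double side = 1.0; };

// Pointer form: a failed cast yields nullptr, which the caller must check.
bool is_circle(Shape* s) {
    return dynamic_cast<Circle*>(s) != nullptr;
}

// Reference form: a failed cast throws std::bad_cast.
double radius_of(Shape& s) {
    return dynamic_cast<Circle&>(s).radius;
}
```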

An implicit conversion to a class type can occur for a class with constructors that can be invoked with a single argument, as in the following example:

class C {
public:
    C(int x = 10, float y = 0) { /* ... */ }
};

void foo(C param) { /* ... */ }
void bar(bool b) { foo(b); } // b is converted to int and then implicitly to C

In the example above, it can be surprising that foo() is called with a boolean.

Note that this implicit conversion to a class object is the default behaviour of constructors that can be called with a single parameter. The explicit keyword can be used before the constructor to prevent this happening, as in:

explicit C(int x=10, float y=0){...}

The call foo(b) would now not be legal.

Implications of casting away const using const_cast are described in section Modifying Constants [UJO].

Other implicit conversions can sometimes result in data loss or erroneous values. This is an issue with implicit conversions since they are automatic: the programmer does not explicitly write code to do the conversion. For example, a common problem is mixing signed and unsigned integral types in arithmetic expressions. This can become a problem since the ranges of signed and unsigned integer types differ and the behaviour of signed integer arithmetic on overflow is undefined whereas unsigned integer arithmetic wraps on overflow. See subclause 4.2 for a discussion of integral promotions in C++.
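
The effect of mixing signed and unsigned operands in a comparison can be made visible with a small sketch. Both function names are hypothetical; naive_less spells out the conversion that the language would otherwise perform implicitly for an int compared with an unsigned.

```cpp
#include <cassert>

// What the implicit conversion does in 'a < b' when a is int and b is unsigned:
// a is converted to unsigned first, so a negative value becomes very large.
constexpr bool naive_less(int a, unsigned b) {
    return static_cast<unsigned>(a) < b;
}

// Hypothetical value-preserving comparison (C++20 provides std::cmp_less).
constexpr bool safe_less(int a, unsigned b) {
    return a < 0 || static_cast<unsigned>(a) < b;
}
```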

The issue is not restricted to narrowing conversions, as shown below:

  long l_64 = i_32 + i_32; // '+' operation performed in 32 bits and
                           // widened only after the operation completes (and potentially overflows).

This can be avoided by converting at least one operand to the wider type as part of the operation. Note that auto directs the compiler to use the appropriate type based on the initializer expression. Subsequent use of the auto object (such as in standard mathematical operations) can lead to implicit conversions that are not obvious in the context local to the expression. Additional problems arise as a result of implicit conversions between bool and other types, thus hiding the fact when a wrong operator is used accidentally:

auto f(unsigned i, unsigned j)
{
  return (i > 1) & (j = 1); // (>>, &&, ==) ?
}

In the example above, all combinations of the corresponding operators will compile with different resulting types and results.

Similar issues arise in conversions between character types (char, char8_t, …) and other types. Character types are provided to represent text in whatever character representation is needed.

void f(char c)
{
  if (c < 0) // may be always false on some platforms.
    {}
}

In addition to the use of strong types (see Type System [IHN]), the implicit conversions and multitude of possible operations of integral types can be mitigated by using scoped enumeration types with the corresponding integer type as their underlying type. For example, std::byte is defined to address individual unsigned char elements (bytes) in memory; it provides bitwise operations but does not participate in arithmetic operations.
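
A brief sketch of working with std::byte follows; the function name low_nibble is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Bitwise operators are defined for std::byte; arithmetic ones are not.
constexpr std::byte low_nibble(std::byte b) {
    return b & std::byte{0x0F};
}

// std::byte sum = std::byte{1} + std::byte{2};  // would not compile: no operator+
```

Conversion back to an integer has to be requested explicitly, for example with std::to_integer<int>(b).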

Because C++ allows function and operator overloading, the effect of implicit conversions provides an additional mechanism of failure, by selecting an unwanted overload during overload resolution due to implicit conversions. This can influence failure modes with lookup as described in section Namespace Issues [BJL]. // Add overload resolution reference!!

C++ also provides a library function std::bit_cast. This function provides the ability to preserve the bit representation when converting between unrelated types. If such is meaningful, then std::bit_cast reduces the risk of some undefined behaviours compared with other type punning approaches such as casts or unions.

6.6.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.7 String Termination [CJM]

6.7.1 Applicability to language

The vulnerability as documented in ISO/IEC TR 24772-1:2019 exists in C++ when C-style strings are used, e.g., with interfaces that require NUL-terminated strings. C++ provides alternative string processing capabilities that do not exhibit those vulnerabilities.

C++ provides a class template for string processing, std::basic_string, that manages the space for the string and the string length and always includes a string termination character. For example, when concatenating, the std::basic_string object will increase in size to contain the resulting string. Furthermore, as the string is guaranteed to have a string termination character, using its underlying raw pointer as a C-style string will mitigate this vulnerability because the string termination character is present.

C++ provides the library class templates std::basic_string_view and std::span that implement reference semantics to non-owned buffers. These types do not rely on a string termination character to determine the length of the string; thus, use of these types avoids those vulnerabilities. However, using the underlying raw pointer of such an object as a C-style string can result in these vulnerabilities because the string termination character is not guaranteed to be present.

void foo(std::string const& s, std::string_view const& sv)
{
  puts(s.data());  // okay string has termination character
  puts(sv.data()); // not okay; string termination not guaranteed
}
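
When a std::string_view has to be handed to an interface that expects a terminated C-style string, one possible approach is to copy it into an owning std::string first. The following sketch assumes that the cost of the copy is acceptable; the function name c_style_length is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Copies the viewed characters into an owning std::string, which guarantees
// a terminating character, before handing the pointer to a C-style function.
std::size_t c_style_length(std::string_view sv) {
    std::string const owned(sv);
    return std::strlen(owned.c_str());
}
```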

6.7.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.8 Buffer Boundary Violation [HCB]

6.8.1 Applicability to language

The vulnerability as described in ISO/IEC TR 24772-1 clause 6.8 exists in C++ when arrays are managed using raw pointers or indexing. The valid raw pointers to a plain array a range from the first element to one past the last element of the array, i.e., from std::begin(a) to std::end(a) inclusive, although std::end(a) itself must not be dereferenced. An object o can be treated as a single element array with respect to pointers referring to it.

C++ provides facilities to encapsulate code that is exposed to this vulnerability. The standard library defines features that mitigate or circumvent this vulnerability. For example, std::string, std::vector, std::deque, and iostreams manage buffers internally, and using “range-for”, such as for (auto& e : some_container), or the algorithm library gives access to the elements e of a container without the possibility of a buffer boundary violation.

However, the member function data() of the contiguous sequence containers returns a non-const pointer to the underlying elements of a non-const container. This allows manipulating the underlying memory directly, bypassing the safety features of the container and reintroducing this vulnerability. For example, std::string::data() returns a non-const char*.

When working directly with iterators referring to a container, one needs to ensure that those iterators are and remain valid. For example, for a container c, incrementing an iterator beyond the end(c) iterator or dereferencing the iterator denoted by end(c) is undefined behavior.

In general, validity of iterators requires programmer care to prevent out-of-bounds access of the underlying container:

For example, using algorithms and iterators correctly to convert an input string to lower case:

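std::string to_lowercase(std::string_view s){
    std::string result{};
    std::transform(
        begin(s), end(s), // input range #1
        std::back_inserter(result), // output iterator #2
        // unsigned char avoids undefined behavior in std::tolower for negative char values
        [](unsigned char c){ return static_cast<char>(std::tolower(c));});
    return result;
}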

The above example passes an input range of characters (#1) and an output iterator (#2) to the transform algorithm. Potential errors due to a boundary violation could be caused by the following changes:

The second problem occurs in the following code if the length of s is longer than 31:

std::string to_lowercase(std::string_view s){
    std::string result(31, '\0'); // 31 NUL characters; braces would create a 2-character string
    std::transform(
        begin(s), end(s),
        begin(result), // error, only space for 31 characters
        [](unsigned char c){ return static_cast<char>(std::tolower(c));});
    return result; // size(result) == 31
}

An additional problem occurs when performing an operation that invalidates an in-use iterator, such as the iterator internally used by the range-for statement below:

std::string to_lowercase(std::string s){
    for (auto &c:s){
       // error, push_back may reallocate and invalidates the in-use iterator
       s.push_back(static_cast<char>(std::tolower(static_cast<unsigned char>(c))));
    }
    return s;
}
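
For comparison, a variant that does not change the size of the string while iterating might look like the following sketch, which overwrites each character in place.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Transforms the string in place; input and output range are identical and
// the size of the string never changes, so no iterator is invalidated.
std::string to_lowercase(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}
```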

Another way that overflows can occur is through the use of C-style strings, which can be treated as arrays of characters; mishandling of the NUL termination can make overflows possible. See clause 6.7 String Termination [CJM].

Since plain (C-style) arrays when passed as function arguments decay to pointers the array dimension is lost. C++ provides several means of keeping the array dimension available to the called function:
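
As an illustration only, two commonly used means are passing the array by reference to a function template and using std::array. The function name sum is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// The array is passed by reference, so N is deduced instead of being lost.
template <std::size_t N>
int sum(int const (&values)[N]) {
    int total = 0;
    for (int v : values) total += v;
    return total;
}

// std::array carries its size as part of its type.
template <std::size_t N>
int sum(std::array<int, N> const& values) {
    int total = 0;
    for (int v : values) total += v;
    return total;
}
```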

For further explanation and examples, see

6.8.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.9 Unchecked Array Indexing [XYZ]

6.9.1 Applicability to language

The vulnerability as documented in ISO/IEC TR 24772-1:2022 6.9 exists in C++ when an access is performed using operator[].

C-style arrays, C-style pointers, random-access iterators, and some standard library containers allow element access via operator[] which is unchecked. However, those standard library containers also provide an access function at() that behaves like operator[], but performs a check that the access is within the bounds of the container and throws an exception otherwise.
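
The difference between the two access functions can be sketched as follows. The helper name element_or and its fallback policy are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns the element at 'index', or 'fallback' when the
// index is out of range, relying on the bounds check performed by at().
int element_or(std::vector<int> const& values, std::size_t index, int fallback) {
    try {
        return values.at(index);          // throws std::out_of_range if index >= size()
    } catch (std::out_of_range const&) {
        return fallback;                  // values[index] would have been undefined behavior
    }
}
```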

For issues associated with exception handling and error handling, see clause 6.36 Ignored error status and unhandled exceptions.

std::span, the parameter type for contiguous sequences, does not provide a checked version of indexing and should therefore only be used via its iterator/range API.

6.9.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.10 Unchecked Array Copying [XYW]

6.10.1 Applicability to language

The vulnerability as described in ISO/IEC 24772-1:2022 exists in C++, but can be mitigated using features provided by the language.

A buffer overflow occurs when some number of elements is copied from one buffer to another and the amount being copied is greater than is allocated for the destination buffer. This is a special case of 6.8 Buffer Boundary Violation [HCB]. The C library functions or hand-written loops for copying bytes or C-style strings are especially prone to this vulnerability.

As with clause 6.8 [HCB], in most cases the vulnerability can be avoided by using library classes, such as std::vector or std::string, which provide copy operations that adjust the size of the target to fit the object being copied.

The standard library algorithms that copy into a target range can suffer from this vulnerability. In the case of potential overflow, the programmer must either ensure automatic extension of the underlying container, such as by using std::back_inserter(container) as the output iterator, or ensure that the output range has sufficient space available. In the case of overlapping input and output ranges, the suitable copying algorithm must be selected, depending on the relative ordering of the ranges. In general, this situation can be avoided by using a more appropriate algorithm, for example, std::rotate.
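
A sketch of the first option, automatic extension of the target container through std::back_inserter, is shown below. The function name append_all is hypothetical, and the sketch assumes that source and target are distinct objects.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Appends all elements of 'source' to 'target'. The back_inserter grows the
// target as needed, so no element is written past the end of its storage.
// Assumption: 'source' and 'target' are not the same object.
void append_all(std::vector<int> const& source, std::vector<int>& target) {
    std::copy(source.begin(), source.end(), std::back_inserter(target));
}
```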

For arrays with fixed sizes the assignment operator or copy-constructor of std::array is the means of safe array copying.

If a system requires its own container types with dynamic size, a naïve implementation might attempt to keep the copying external, like with C-style arrays. Such external copying should be avoided.

6.10.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.11 Pointer Type Conversions [HFC]

6.11.1 Applicability to language

The vulnerabilities described in ISO/IEC TR 24772-1:2019 clause 6.11.1 apply to C++. In addition to pointers, C++ references are also vulnerable, and the issues below include references when pointers are mentioned. In places where references cannot be substituted, the corresponding code won’t compile.

In general, casting pointers breaks the type system and should be avoided.

In C++, a C-style cast is defined in terms of the C++ cast operators const_cast, static_cast, and reinterpret_cast. In some cases, it is unspecified which cast is used, for example, when a cast operation involves an incomplete type, a reinterpret_cast may be used for the conversion which can produce an incorrect result.

In particular, reinterpret_cast has the problem that it treats the original pointer value as a pointer to the target type rather than to the original type. The C++ standard defines most cases where that happens as undefined behavior. For example, the lifetime model of C++ might result in accessing the target type object outside of its lifetime. Other run-time issues can be caused by alignment violations. Using reinterpret_cast<std::byte*> to access the underlying memory of an object by casting its address permits access to the raw memory. However, casting the address of a piece of raw memory with the correct alignment and size to an object pointer and accessing that object is undefined behavior for most types, because doing so will not start the lifetime of the object.

static_cast only works where the source type and the target type are related. However, with pointer types the compiler cannot always check that the actual object type corresponds to the desired target type, which permits invalid casts. Naïvely assuming that the addresses of a derived object and its base object are identical is wrong in most cases. For example, with multiple inheritance, the address of an object may differ from that of one of its base class sub-objects. Using the generic pointer type void* (which is common in C APIs) allows converting between arbitrary pointer types using static_cast. Most conversions via void * where the original object type and the final target type are different are undefined behavior in C++. C++ allows reinterpret_cast to a pointer to an incomplete type or a static_cast from void * to a pointer to an incomplete type. Pointers to objects can implicitly convert to void * (cv-qualified accordingly).

It is only defined to reinterpret cast the obtained pointer back to the original type. It is implementation-defined whether that bidirectional void * conversion also works for function pointers. A reinterpret cast can be used to convert a value of the integral types std::uintptr_t/std::intptr_t to a pointer, but only if the integer value was previously obtained by converting a valid pointer to said integral type. Casting an arbitrary integral value to a pointer is undefined behavior.
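
The address adjustment that static_cast performs under multiple inheritance, mentioned earlier in this subclause, can be observed with a small sketch. The class names Engine, Radio and Car and the accessor functions are hypothetical.

```cpp
#include <cassert>

struct Engine { int power = 0; };
struct Radio  { int channel = 0; };
struct Car : Engine, Radio {};

// static_cast adjusts the address to the requested base class sub-object ...
inline Engine* engine_of(Car* car) { return static_cast<Engine*>(car); }
inline Radio*  radio_of(Car* car)  { return static_cast<Radio*>(car); }

// ... and adjusts it back when converting down to the derived class.
// Going through void* instead would skip this adjustment.
inline Car* car_of(Radio* radio) { return static_cast<Car*>(radio); }
```

Since the two base class sub-objects occupy different storage, at least one of them has an address that differs from the address of the complete Car object.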

Casting along inheritance relationships with dynamic_cast is safe, but it requires that the dynamic type be known, which is the case when the types declare virtual member functions. Within a constructor or destructor only the static type of the current class is relevant, because the lifetime of any derived class object hasn’t started or has already ended. See subclause Polymorphic Variables [BKK].

Conversions involving const and/or volatile properties of a type are permitted using const_cast (see Modifying constants [UJO]). Adding const with const_cast is safe.

6.11.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.12 Pointer Arithmetic [RVG]

6.12.1 Applicability to language

The vulnerabilities described in ISO/IEC 24772-1:2022 clause 6.12.1 also apply to C++. The vulnerabilities caused by out-of-bounds access are covered in clause 6.8.

Pointers to functions, pointers to members, and pointers to void do not allow pointer arithmetic.

The set of valid pointers referring to an array consists of the pointers to each array element plus the pointer just past the end of the array; however, dereferencing a pointer one past the end of the array is undefined behaviour. A pointer to a single object is considered to point to an array of size one with respect to pointer arithmetic.

Adding an integral value to a pointer value, or subtracting one from it, must yield a valid pointer referring to the same array; otherwise the behavior is undefined. Note: the built-in indexing operator is defined in terms of pointer arithmetic.

Subtraction of two pointers has undefined behaviour unless both pointers refer to the same array or are both null.

Comparison of two pointers with one of the operators < > <= >= <=> has unspecified behaviour unless both pointers refer to the same underlying array or object. The standard library function objects for comparison, like std::less<>, provide a strict total order of pointers of a given type.
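
A sketch of ordering pointers to unrelated objects through the standard library function object follows; the function name ordered_before is hypothetical.

```cpp
#include <cassert>
#include <functional>

// Orders two pointers even when they refer to unrelated objects; using the
// built-in operator< here would give an unspecified result.
inline bool ordered_before(int const* lhs, int const* rhs) {
    return std::less<int const*>{}(lhs, rhs);
}
```

Which of two unrelated objects is ordered first remains implementation-specific, but the answer is consistent, which is what ordered containers of pointers rely on.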

Iterators as defined by the standard library suffer from similar vulnerabilities as pointer arithmetic. Comparison and subtraction of two iterators, or computing std::distance cause undefined behavior for iterators that refer to different ranges. Forming an iterator that is outside of its underlying range similarly is undefined behavior.

6.12.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.13 NULL Pointer Dereference [XYH]

6.13.1 Applicability to language

The vulnerability as described in ISO/IEC TR 24772-1:2019 clause 6.13 exists in C++. Dereferencing a pointer with the value of nullptr is undefined behavior [EWF].

Using pointers is inherently problematic especially for function parameters and return values, due to the following issues:

Using values instead of pointers sidesteps all pointer vulnerabilities, especially when returned from a function. For example, standard library containers like std::vector have value semantics and do not suffer from this vulnerability.

C++ references cannot be null in a well-defined program and solve the null-dereferencing vulnerability. They are particularly useful as function parameters. Using a reference as function return type requires the caller to avoid accessing an object outside of its lifetime (see [XYK], [DCM]).

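If absence of a value is necessary, a class type for optional values such as std::optional provides single-object ownership and checked access. Attempting to access the value of an empty std::optional through its value() member function throws an exception (std::bad_optional_access); access through operator* or operator-> without a prior check is undefined behaviour.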

Note: Be aware that optional<T&> is not supported by the standard library. For representing optional references std::optional<std::reference_wrapper<T>> or a non-standard implementation of optional supporting references can be used.
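The following sketch illustrates how std::optional can express absence without a null pointer. The functions first_even, first_even_or and value_throws_when_empty are hypothetical and exist only for this illustration:

```cpp
#include <optional>
#include <vector>

// Hypothetical lookup: returns the first even element, or an empty optional.
std::optional<int> first_even(std::vector<int> const& values) {
    for (int v : values) {
        if (v % 2 == 0) return v;
    }
    return std::nullopt;
}

// Absence is handled explicitly via value_or() instead of a null pointer check.
int first_even_or(std::vector<int> const& values, int fallback) {
    return first_even(values).value_or(fallback);
}

// value() on an empty optional throws std::bad_optional_access.
bool value_throws_when_empty() {
    try {
        (void)first_even({1, 3, 5}).value();
    } catch (std::bad_optional_access const&) {
        return true;
    }
    return false;
}
```

Callers that use value_or() or test has_value() first never reach the throwing path.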

If dynamically allocated objects are required, std::unique_ptr<T> can be used for lifetime-management and for transferring ownership. When shared ownership of such objects is necessary, std::shared_ptr<T> is a solution. Using std::shared_ptr<T const> provides value semantics for immutable heap-allocated objects thus sidestepping most of the issues of pointers above. Constructing a smart pointer through the factories std::make_unique or std::make_shared will return a non-null smart pointer or throw an exception and thus prevent the vulnerability of null pointers, in contrast to legacy allocation mechanisms and some overloads of operator new. However, in general dereferencing a std::unique_ptr or std::shared_ptr equal to nullptr causes undefined behaviour, for example, when such a smart pointer is default constructed or a std::unique_ptr is in a moved-from state. For further information see also [XYL].
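A minimal sketch of the two smart pointer states discussed above: a pointer obtained from std::make_unique is non-null, while a moved-from std::unique_ptr is null and needs a check before it is dereferenced. The helper names deref_or and moved_from_is_null are hypothetical:

```cpp
#include <memory>
#include <utility>

// Dereferences only after checking for null; returns 'fallback' otherwise.
int deref_or(std::unique_ptr<int> const& p, int fallback) {
    return p ? *p : fallback;
}

// Demonstrates that a moved-from std::unique_ptr is null while the
// destination now owns the object.
bool moved_from_is_null() {
    auto source = std::make_unique<int>(42);  // never null: throws on failure
    auto target = std::move(source);          // ownership transferred
    return source == nullptr && deref_or(target, 0) == 42;
}
```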

6.13.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.14 Dangling Reference to Heap [XYK]

6.14.1 Applicability to language

The vulnerability as expressed in ISO/IEC TR 24772-1:2019 and ISO/IEC TR 24772-3:2020 C exists in C++. C++, however, provides mechanisms to mitigate the vulnerability. In contrast to C, where the mere existence of reachable memory for an object is sufficient to access it, the lifetime model of C++ makes it undefined behaviour (see subclause [EWF]) to access an object outside of its lifetime. This results in undefined behaviour when an object access is attempted before one of its constructors has finished or after its destruction. For example, container types like std::vector or wrapper types like std::optional might have memory available for an object that has not been constructed or whose lifetime has ended. For similar situations that result from accessing temporary objects or variables outside of their lifetime see subclause [DCM]. If such a temporary or local object manages heap memory (e.g., std::vector), referring to an element after the manager’s lifetime has ended technically falls into the category of this vulnerability, but is covered there.

C++ provides a rich set of pointer-like types (potentially referring to heap memory) whose values may dangle, e.g.,

In addition, a user-defined class type can be a pointer-like type, if a subobject is of pointer-like type and refers to an object (target) whose lifetime is different from and not managed by the current object. Sometimes, regular object types act as pointer-like types, e.g., indices into a container or operating system handles, and their validity can not be directly mapped to the C++ object lifetime model.

If the lifetime of a pointer-like value ends before the lifetime of its target, then the vulnerability does not apply to that pointer-like value. This is the primary C++ strategy for avoiding vulnerabilities of dangling pointer-like values. For example, an object argument passed as a function parameter of reference type persists throughout the function call. The lifetime guarantee of a function argument passed indirectly via a pointer-like type does not apply if the target is destroyed explicitly by the called function (taking ownership of the target) or by a concurrently executing operation, or if copies of the pointer-like parameter outlive the function call, for example, as the return value, or in a coroutine or thread frame.

For objects directly allocated on the heap C++ provides smart pointers and corresponding factory functions (e.g., std::make_unique()) that allow transferring ownership or shared ownership to reduce the risk for dangling. However, storing the raw pointers managed by smart pointers can lead to accidental dangling, for example:

int * f(){
    auto up = std::make_unique<int>(42);
    return up.get(); // returned pointer dangles
}
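One way to avoid the dangling pointer in the function above is to return the owning smart pointer itself. This is a sketch only; the name make_answer is hypothetical:

```cpp
#include <memory>

// Returning the owning smart pointer transfers ownership to the caller,
// so the int stays alive for as long as the caller keeps the result.
std::unique_ptr<int> make_answer() {
    auto up = std::make_unique<int>(42);
    return up;  // moved out; no raw pointer escapes
}
```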

The C++ library containers, such as std::vector, manage the required heap memory for their elements. Referring to an element in a container via a pointer-like type is safe, as long as the container remains unchanged while the element object is accessed. In general, accessing an element in a mutated container via a pointer-like value obtained before the mutation is undefined behavior. Different containers provide different validity guarantees of accessing an element via a pointer-like type that was obtained before a subsequent change in that container.

Hand-written loops are prone to attempt to access elements of a container that are non-existent, or have been relocated. Employing standard library algorithms to iterate over a range of elements from a container tends to be safer, as long as the underlying container is not accidentally changed. For example, the following code can cause a failure, due to the attempt to iterate over a changing std::vector:

std::vector v{1,2,3,4};
copy(begin(v),end(v),back_inserter(v)); // modifying v while iterating is undefined behaviour
copy(begin(v),end(v), std::ostream_iterator<int>(std::cout,", "));
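A possible way to append a container's contents to itself without iterating over the container while it changes is to read from a separate snapshot. This is a sketch under the assumption that the extra copy is acceptable; the name append_self is hypothetical:

```cpp
#include <vector>

// Appends a copy of the current contents of 'v' to 'v' itself.
// The source range is taken from a separate snapshot, so no iterator
// into 'v' is used while 'v' is being modified.
void append_self(std::vector<int>& v) {
    std::vector<int> const snapshot(v);
    v.insert(v.end(), snapshot.begin(), snapshot.end());
}
```

Applied to a vector holding 1, 2, 3, 4, the function leaves it holding 1, 2, 3, 4, 1, 2, 3, 4.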

6.14.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.15 Arithmetic Wrap-around Error [FIF]

6.15.1 Applicability to language

C++ has the vulnerability as documented in ISO/IEC 24772-1 clause 6.15, since overflow situations are undefined behaviour for signed integer arithmetic and wrap-around for unsigned integer arithmetic, which can lead to surprising results. C++ specifies that

For example, integral promotion happens when multiplying two unsigned short operands which can result in undefined behavior:

auto f(){
  std::uint16_t x{50'000},y{50'000};
  return x * y; // undefined behaviour due to overflow, returns int
}

In the above, for a 16-bit short and a 32-bit int, i.e., std::numeric_limits<int>::max()==0x7fff'ffff, x and y are promoted to int and the multiplication then overflows which is undefined behaviour.

Even when operands have the same unsigned type, wrap-around arithmetic can be confusing, for example, 4U - 5U yields a large positive value.
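The following sketch shows one way to sidestep the promotion to int in the multiplication example, and makes the unsigned wrap-around of 4U - 5U explicit. The function names are hypothetical; unsigned long is chosen because it is at least 32 bits wide and is not subject to integral promotion:

```cpp
#include <cstdint>
#include <limits>

// Widening one operand to unsigned long before multiplying means the other
// operand is converted to unsigned long as well, so no signed int arithmetic
// (and no signed overflow) takes place.
unsigned long widened_product(std::uint16_t x, std::uint16_t y) {
    return static_cast<unsigned long>(x) * y;
}

// Unsigned subtraction wraps modulo 2^N: 4U - 5U is the largest unsigned value.
bool four_minus_five_wraps() {
    return 4U - 5U == std::numeric_limits<unsigned>::max();
}
```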

Calling a function taking a parameter of integral type with an argument of a different integral type works due to implicit conversions. If a different overload with a better match becomes visible, the called function can change when the code is re-compiled (see 6.21 Namespace Issues [BJL]).

Using brace-initialization prevents implicit narrowing conversions in contrast to other forms of initialization. For example:

std::uint16_t x{500'000};  // won't compile due to narrowing
std::uint16_t y = 500'000; // compiles, but truncates value

The mitigations for wrap-around errors in C++ are different than for C. The type system of C++ allows user-defined class and enum types with corresponding overloaded operators. Such user-defined types can individually control which implicit conversions or mixed type arithmetic they support, if any. For example, one can force arithmetic to be done with unsigned types:

enum class uint16: std::uint16_t{};
uint16 operator*(uint16 a, uint16 b){
  return static_cast<uint16>(static_cast<unsigned>(a) * static_cast<unsigned>(b));
} // guarantee wrap-around

High-integrity software using the built-in integral types should

6.15.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.16 Using Shift Operations for Multiplication and Division [PIK]

6.16.1 Applicability to language

The vulnerability as described in ISO/IEC 24772-1 clause 6.16 exists in C++. C++ complicates the discussion in 24772-1 clause 6.16 as a result of the integral promotion (see clause 6.06 [FLC]). A left-shift on an operand that gets promoted can result in a value outside the operand’s unpromoted type’s range.

Not every use of a shift operator is a bit-shift due to operator overloading.
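The effect of integral promotion on a left shift can be sketched as follows. The function names are hypothetical, and the example assumes a platform that provides std::uint8_t:

```cpp
#include <cstdint>
#include <type_traits>

// The operand is promoted to int before the shift, so the result has type int
// and can exceed the range of std::uint8_t.
static_assert(std::is_same<decltype(std::uint8_t{1} << 1), int>::value,
              "shift result has the promoted type");

int shift_promoted(std::uint8_t x) {
    return x << 1;
}

// Converting back to the narrow type explicitly discards the high bits.
std::uint8_t shift_narrowed(std::uint8_t x) {
    return static_cast<std::uint8_t>(x << 1);
}
```

For the argument 0x80, shift_promoted yields 256, which does not fit into std::uint8_t, while shift_narrowed yields 0.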

6.16.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.17 Choice of Clear Names [NAI]

6.17.1 Applicability to language

The vulnerability as described in ISO/IEC TR 24772-1:2019 clause 6.17 is applicable to C++, as it is susceptible to errors resulting from the use of similarly appearing names. However, the language rules prevent using an identifier that has not been declared. There are two possible issues: the use of the identical name for different purposes (see clause 6.20 Identifier Name Reuse [YOW]) and the use of similar names for different purposes.

C++ permits the use of names such as x, xx, and xxx, possibly defined in non-obvious scopes, and a programmer can easily, by mistake, write xx where x or xxx was intended. Especially for overloaded functions, argument-dependent-lookup might find a function in a scope that the user did not consider. The use of the wrong name will typically result in a failure to compile so no vulnerability will arise. However, if the wrong name has a type compatible with the intended name’s type, then an incorrect executable program will be generated.

C++ reserves some names as context-specific keywords. While it is technically possible to use those names for other purposes, such use can be confusing.

In the global scope some namespaces (such as std, posix) are reserved and should not be used otherwise.

6.17.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.18 Dead Store [WXQ]

6.18.1 Applicability to language

The vulnerability as documented in ISO/IEC TR 24772-1:2019 clause 6.18 exists in C++.

The language definition permits the compiler to eliminate effects of the abstract machine that are not observable, in particular, dead stores. For example, the often-attempted write operations to non-volatile member variables in a destructor that are not subsequently used can be elided by the compiler.
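As a hedged sketch, stores that are intended to survive optimization can be performed through a volatile-qualified pointer. The names wipe and Secret are hypothetical, and whether this technique is sufficient for a given compiler and threat model is an assumption that needs to be verified separately:

```cpp
#include <cstddef>

// Overwrites 'size' bytes at 'data' with zero through a volatile-qualified
// pointer. Accesses through volatile glvalues are part of the observable
// behaviour, so these stores are not candidates for dead-store elimination.
void wipe(void* data, std::size_t size) {
    auto* bytes = static_cast<volatile unsigned char*>(data);
    for (std::size_t i = 0; i < size; ++i) {
        bytes[i] = 0;
    }
}

// Hypothetical holder of sensitive data that clears its buffer on destruction.
struct Secret {
    char key[16] = {'s', 'e', 'c', 'r', 'e', 't'};
    ~Secret() { wipe(key, sizeof key); }
};
```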

C++ compilers and static analysis tools do exist that detect and generate warnings for dead stores.

The error in ISO/IEC 24772-1:2019 subclause 6.18.3 that the planned reader misspells the name of the store is possible but unlikely in C++ since the language specifies that all objects shall be declared and typed, and the existence of two objects with almost identical names and compatible types (for assignment) in the same scope would be readily detectable. See 6.17 [NAI] Choice of clear names, 6.20 Identifier name reuse [YOW], and 6.21 Namespace issues [BJL].

6.18.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.19 Unused Variable [YZS]

6.19.1 Applicability to language

The vulnerability as documented in ISO/IEC 24772-1 clause 6.19 exists in C++.

A common practice for resource management in C++ relies on what is called “RAII” or “SBRM” (scope-based resource management): employing a class’ destructor to release resources managed by the object. This can lead to code without visible use of a variable being present in the source code, because all work is done by the variable’s constructor and/or destructor.
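A typical instance of this pattern is a lock guard, sketched below with a hypothetical Counter class. The variable guard is never referenced after its definition, yet removing it would change the behaviour of the program:

```cpp
#include <mutex>

// Hypothetical thread-safe counter.
class Counter {
public:
    void increment() {
        // 'guard' is never named again: its constructor locks the mutex and
        // its destructor unlocks it when the scope is left.
        std::lock_guard<std::mutex> guard(mutex_);
        ++value_;
    }
    int value() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return value_;
    }
private:
    mutable std::mutex mutex_;
    int value_ = 0;
};
```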

6.19.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.20 Identifier Name Reuse [YOW]

6.20.1 Applicability to language

The vulnerability as described in ISO/IEC TR 24772-1:2019 clause 6.20 exists in C++, except for the second issue of limited identifier length. In C++ all characters in an identifier are significant.

C++ provides the scope resolution operator :: to access identifiers from non-local scopes.

Overloading and specialization of functions is a cornerstone of C++ generic programming. In this context, the reuse of function names is essential. See clause 6.41 for inheritance issues associated with name reuse.

Overloaded function names and operators considered in an expression are not restricted to a simple scope hierarchy, because of argument-dependent lookup (ADL). In generic code the unqualified function or operator selected can come from a scope based on the type of the arguments and not from the current scope hierarchy. The rules for which namespaces are eligible for lookup of unqualified functions and operators are intricate, but required to make overloaded operators work.

In addition, if implicit conversions can happen on arguments, the overload selected by ADL can be different from programmer expectation even in non-generic code, especially when an argument is of a type that can be implicitly converted to another type where a corresponding overload is defined. Visibility on a namespace-level of such an operator overload may make it eligible, even if neither argument matches the parameter types directly. In the best case this leads to a compile error due to ambiguities, but it can also result in perfectly compiling code executing an unexpected overload.

The following example demonstrates part of the problem:

#include <iostream>
#include <typeinfo>

namespace Y {
template <typename T>
void print(T i){
    std::cout << typeid(T).name()<< ":" << i ;
}
template <typename T>
void println(T x){
    print(x); // expects to call Y::print
    std::cout<<'\n';
}
} 
namespace X {
   struct A{
        A(double){}
        friend // make this a hidden friend
        std::ostream & operator << (std::ostream & out, A const &a){
            return out << "An A as expected\n";
        } 
    };
    void print(A a){ // not expected to be called by println
        std::cout << "Surprise happens!";
    }
}
int main(){
    X::A a{3.14};
    Y::println(42); // i:42 - calls Y::print
    std::cout << a; // An A as expected - calls X::operator<<
    Y::println(a);  // Surprise happens! - calls X::print
    Y::println(42u);// u:42 - calls Y::print
}

The above code calls the overload print(A) from println since it is pulled in by ADL. On the other hand, ADL is required to work to allow the output operator for type X::A to work.

The consideration of implicit conversions together with ADL can be suppressed by defining operator overloads as class members or as hidden friends. The latter is achieved by declaring all corresponding overloads as friend functions in the class that take the class’ objects as arguments. Generic base classes can provide mix-in facilities for hidden friends by taking the argument type that is the derived class as template parameter.
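Where ADL is not wanted for an ordinary function call in generic code, qualifying the call suppresses it. The following sketch uses hypothetical namespaces lib and app to contrast an unqualified and a qualified call from within a template:

```cpp
#include <string>

namespace lib {
template <typename T>
std::string describe(T const&) { return "lib::describe"; }

// Unqualified call: ADL may select an overload from the argument's namespace.
template <typename T>
std::string unqualified(T const& x) { return describe(x); }

// Qualified call: ADL is suppressed, lib::describe is always used.
template <typename T>
std::string qualified(T const& x) { return lib::describe(x); }
}  // namespace lib

namespace app {
struct Widget {};
// Non-template overload that is found via ADL for arguments of type Widget.
std::string describe(Widget const&) { return "app::describe"; }
}  // namespace app
```

With these declarations, lib::unqualified(app::Widget{}) selects app::describe, whereas lib::qualified(app::Widget{}) selects lib::describe.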

6.20.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.21 Namespace Issues [BJL]

6.21.1 Applicability to language

The vulnerability described in ISO/IEC TR 24772-1:2019 clause 6.21 exists in C++. It can occur in particular when a used library changes its API. The situations where it exists are related to the following cases:

In the case of template specialization or non-identical definitions of the same entity in different translation units (ODR-violation), ill-formed code might be the result, however, a C++ compiler is not obliged to diagnose that situation, leading to undefined behaviour.

In the case of overloading and overriding, C++ compilers are required to diagnose an ambiguity if it exists.

However, overload resolution applies preference rules in order to select among multiple matching functions or function templates as a means to resolve the ambiguity among these functions. Hence, for calls that are not perfect matches, the user cannot guarantee in the presence of later changes which function is called, as another, better match can be introduced subsequently. The call in question then changes its binding without warning upon its next compilation. For cases, where the preference rules do not resolve the ambiguity, the resulting error message by the compiler avoids the vulnerability. Function template specializations are not considered during overload resolution, only the base template is considered.

void foo (long);

// void foo (int);
  
void bar ()
{
  foo (0);         // The call to 'foo(long)' requires an implicit conversion
                   // from 'int' to 'long'. The function 'foo(int)'
                   // would be a "better match" and so would silently
                   // be chosen if subsequently introduced.
}

A new declaration can impact existing code in a number of situations involving the addition of:

A using directive broadens the possible scopes that will be examined for names during lookup. Where lookup searches a namespace referred to by a using directive, all names in that namespace will be visible some of which may be unwanted. A using declaration, on the other hand, declares only the specified name into the scope of the using declaration.

namespace NS1
{
  void f1 (int);
  void f2 (int);        // Added later
}
namespace NS2
{
  using namespace NS1;  // 'f1' needed
  void f2 (long);
  void bar ()
  {
    f1(0);              // Calls 'NS1::f1'
    f2(0);              // Unintentionally calls 'NS1::f2'
  }
}
namespace NS3
{
  using NS1::f1;        // 'f1' needed
  void f2 (long);
  void bar ()
  {
    f1(0);              // Calls 'NS1::f1'
    f2(0);              // Calls 'NS3::f2' as expected
  }
}

Overload resolution only considers conversions for the explicitly specified arguments and does not take default parameters into account:

void f1 (short, int = 0);
void f1 (int, short = 0);

void f2 ()
{
  f1 (1);       // calls 'f1(1, 0)' as '1 -> int' is better match than '1 -> short'
  f1 (1, 0);    // ambiguous, ill-formed, won't compile
}

The following example demonstrates a situation where the late addition of a better matching overload causes a silent change in the semantics of an existing program.

namespace NS
{
  struct A
  {
  };

  template < typename T > T foo ( T t )
  {
    return t;
  }
}

namespace NS2 // separately developed and included from a header file
{
   struct B
   {
   };
// This code will be added later
//  template < typename T > T * foo ( T * t )
//  {
//    return t;
//  }
}

using namespace NS2;
using namespace NS;

void bar()
{
  A * a;
  B * b;
  foo (a); // After the commented-out code is added to NS2, the binding of foo changes silently from NS::foo to NS2::foo
}

This issue can be avoided by avoiding using namespace xxx and explicitly qualifying each call, such as NS::foo(a).

Analogously, when a more specialized template is added to an imported namespace where the more general template has already been provided in another namespace, preference rules will silently prefer the more specialized template.

A similar situation can occur when a conflict arises between compiler-synthesized or rewritten operators and explicitly created versions of those operators, as in the following example.

struct A
{
    bool operator==(A const &) const { return true; }   
};

// Evil hijacking of !=
// bool operator != (A const &, A const &) { return true; } // #1

void bar (A const & a) {
  a != a;                                                    // #2
}

In the above example, the declaration of operator== will have a corresponding synthesised operator!= generated by the compiler, since there is no suitable user-declared !=. If the operator!= becomes visible, then the code at #2 uses the user-declared operator!= instead of the synthesized one, which can lead to a silent and unexpected change of behaviour. This is particularly risky when the operator is declared outside of the immediate visibility of the original definition.

6.21.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.22 Missing Initialization of Variables [LAV]

6.22.1 Applicability to language

The vulnerability as described in ISO/IEC TR 24772-1:2019 exists in C++; however, defining variables only when they can be initialized properly with an initializer in their definition avoids reading uninitialized memory.

Defining/allocating objects of trivial type with automatic/dynamic storage duration without initialization leaves the object with an indeterminate value. A subsequent read of such a variable before it has been written is undefined behavior. In addition, sub-objects of trivial type that are omitted in a constructor’s member initializer list and not initialized by the constructor’s body or by a default member initializer will not be initialized by that constructor. For example, the following class definitions suffer from incomplete initialization of subobjects, even though class test defines a default constructor:

struct base { short num; };                                                     
struct test : base                                                              
{                                                                               
  enum E1 { e1a=100, e1b, e1c };                                                
  int one; 
  int two; 
  double ar1[2]{ 1.1, 2.2 }; 
  double ar2[2]; 
  E1 e1; 
  E1 e2;                 
  test() 
  : // base uninitialized 
  one{ 1 }
  // two uninitialized
  // ar1 initialized through default member initializer
  // ar2 uninitialized
  , e1{} // initializes to zero, not a named enumerator value 
  // e2 uninitialized
  { }                                  
};                                                                              
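A possible remedy, sketched below with the hypothetical names base_init and test_init, is to give every member a default member initializer so that no constructor can leave a sub-object uninitialized:

```cpp
// Variant of the classes above in which every member has a default member
// initializer.
struct base_init { short num{}; };
struct test_init : base_init {
    enum E1 { e1a = 100, e1b, e1c };
    int one{1};
    int two{};                 // value-initialized to 0
    double ar1[2]{1.1, 2.2};
    double ar2[2]{};           // both elements 0.0
    E1 e1{e1a};                // a named enumerator instead of 0
    E1 e2{e1a};
};
```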

Dynamically allocating memory for an object using malloc, or some other C-style equivalent, does not initialize the object. Interpreting such memory as an object with trivial type will result in it having an indeterminate value. Objects with non-trivial type require running a constructor for their lifetime to start correctly. In both cases, attempting to cast a pointer to the allocated memory and using the object is undefined behavior except for special sanctioned cases, see Conversion Errors [FLC] and Pointer Type Conversion [HFC].

Non-local variables with static storage duration that are dynamically initialized can cause undefined behavior if the initialization depends on other such variables. If the dependency is in the same translation unit, the sequencing is defined in definition order; however, there are no sequencing guarantees across translation unit boundaries, and thus undefined behaviour can occur by accessing an uninitialized variable. For example:

struct A {
  A (int i ) : i_ { i }  {  }
  int i_;
};
struct B {
  B (A const & a) : j_{a.i_} { }
  int j_;
};
extern A a;  // declare existence of variable 'a'
// defining variables with dynamic initialization:
B b { a };   //  #1
A a { 42 };  //  #2

If #1 and #2 are in the same translation unit, then a in #1 is incompletely initialized (zero initialized). If #1 and #2 are in different translation units, then the order of initialization of a(#2) relative to b(#1) is indeterminate.
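One commonly used way to remove the dependency on initialization order is to replace the non-local variables by function-local statics, which are initialized on first use. The following is a sketch only; the accessor names global_a and global_b are hypothetical:

```cpp
struct A {
    explicit A(int i) : i_{i} {}
    int i_;
};
struct B {
    explicit B(A const& a) : j_{a.i_} {}
    int j_;
};

// The function-local static is initialized the first time control passes
// through its definition, independent of translation unit order.
A& global_a() {
    static A a{42};
    return a;
}

// Initializing 'b' calls global_a(), which guarantees 'a' is initialized first.
B& global_b() {
    static B b{global_a()};
    return b;
}
```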

The constexpr-specifier for a variable ensures initialization at compile time. The constinit specifier ensures a variable is initialized at compile time, even if it is non-const.

Defining non-member variables as const or as constexpr enforces initialization by the compiler and makes reasoning about code easier.

If determining the initial value of a variable requires complex logic, putting that logic into an immediately-invoked lambda expression that computes the initial value permits the variable to be initialized when defined.
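A minimal sketch of an immediately-invoked lambda used as an initializer (the function classify is hypothetical):

```cpp
#include <string>

std::string classify(int n) {
    // The lambda is invoked immediately, so 'label' can be const and is
    // initialized in its definition even though the logic has several branches.
    std::string const label = [n] {
        if (n < 0) return std::string{"negative"};
        if (n == 0) return std::string{"zero"};
        return std::string{"positive"};
    }();
    return label;
}
```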

See C++ Core Guidelines ES.20 and CERT C++ Coding Guidelines EXP53-CPP. Note that ES.20 and EXP53 are complementary. Both point out that you should always initialize before reading, but ES.20 uses the narrow sense of initialize while EXP53 includes assignment.

6.22.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.23 Operator Precedence and Associativity [JCW]

6.23.1 Applicability to language

The vulnerability as described in ISO/IEC TR 24772-1:2019 clause 6.23 is applicable to C++.

Operator precedence and associativity in C++ are determined by the C++ grammar. There are four operators that cannot be overloaded (user-defined):

Due to the large number of operators, one is recommended to consult an operator precedence table when needed, e.g., [https://en.cppreference.com/w/cpp/language/operator_precedence]. For example, in C++, the bitwise logical and shift operators are sometimes incorrectly treated as having the same precedence as arithmetic operations even though the bitwise operators have lower precedence. For example, the following (correct) expression subtracts one from x and then checks if the result is zero:

x - 1 == 0

which is equivalent to (x - 1) == 0, i.e., x - 1 is done first, then that result is compared to 0. Programmers mistakenly thinking the bitwise operations have the same precedence as arithmetic ones might write:

x & 1 == 0

intending to perform (x & 1) == 0, but precedence rules result in this evaluating x & (1 == 0) instead. This would have been easily fixed by using parentheses to ensure the proper evaluation of the expression.
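The difference between the two groupings can be sketched as follows. The function names are hypothetical, and the first function spells out with parentheses how the unparenthesized expression is parsed:

```cpp
// How 'x & 1 == 0' is actually parsed: '==' binds tighter than '&',
// so this computes x & (1 == 0), i.e. x & 0, which is always zero.
bool is_even_as_parsed(unsigned x) {
    return x & (1 == 0);
}

// The intended test, with parentheses making the grouping explicit.
bool is_even(unsigned x) {
    return (x & 1) == 0;
}
```

The first function returns false for every argument, while the second returns true exactly for even values.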

In addition to the aforementioned, C++ also permits operators to be overloaded when used with user-defined types. While it is not possible to change the precedence, associativity, and number of operands of overloaded operators [C++17, Clause 16.5 [over.oper], para. 6], overloaded operators can be executed differently than built-in operators. For example, overloaded operators lose any built-in operator short-circuiting properties and sequence order guarantees. Similarly overloaded operators and their arguments' evaluations behave as normal function calls, differing from built-in operator evaluation.

struct A {  };
bool operator&&(A const &, int);
int foo ();

void bar (A const & a)
{
  if (a     && foo());  // 'foo()' always evaluated
  if (false && foo());  // 'foo()' never evaluated
  if (operator&&(a, foo()));  // explicit call of the overload: 'foo()' always evaluated
}

Note that overloaded assignment falls into this category.

For issues related to the declaration of equality and relational operators see Clause 6.25 [KOA].

6.23.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.24 Side-effects and Order of Evaluation of Operands [SAM]

6.24.1 Applicability to language

The vulnerability as described in ISO/IEC TR 24772-1:2019 clause 6.24 exists in C++.

The evaluation of an expression includes: (i) its value computation; and (ii) its side-effects. The value computation is the value returned by the expression, e.g., the value computation of 3 * 2 + 1 yields 7. The side-effects of an expression are

For example consider:

int i = 2;
int j = i++;

the value computation of i++ yields 2, and the side-effects are the writing of 3 to i and the initialization of j.

Within an expression, one must ensure an object is stored only once to avoid undefined behaviour, e.g.,

i = i++ + 5; // undefined behaviour (before C++17)

or

k = i++ + i--; // undefined behaviour in all versions of C++

and expressions modifying objects can only read the object to determine the value to be stored (e.g., ++i requires reading the value), i.e., other accesses are undefined behaviour, e.g.,

my_array[i] = i++; // undefined behaviour (before C++17)

Starting with C++17, the evaluation order of an expression involving
overloaded operators preserves the sequenced before behaviour of the
built-in operator:

```{.cpp}
my_array[i] = i++;
my_array[i++] = i++;
```

For the second statement, say i = 10 before the expression:

Evaluate the right-hand side i++: its value is 10, and i becomes 11.

Evaluate the left-hand side my_array[i++]: this designates my_array[11], and i becomes 12.

my_array[11] is assigned 10.

This occurs because the assignment is sequenced after the value computation of the right and left operands and before the value computation of the assignment expression, and the right operand is sequenced before the left operand [C++17, Clause 8.18 [expr.ass], para. 1]. Since this is the built-in operator, the statement my_array[i] = i++ can be thought of as:

Compute value of right-hand-side: i++ (e.g., integer value).

Apply side-effects of i++.

Compute value of left-hand-side: my_array[i] (e.g., memory address).

Apply side-effects of the assignment.

In general, one should follow commonly-stated C/C++ advice of never reading from and writing to the same object within an expression to avoid potential vulnerabilities. Often breaking the expression into separate statements achieves clear and clean semantics, e.g.,

++i;
my_array[i] = i;

or

my_array[i] = i;
++i;

makes it unambiguous what the value of i is during the array assignment and eliminates the possibility of vulnerabilities.

In addition, it is important to note that overloading an operator disables short-circuiting behaviours (e.g., built-in boolean operators): those operators' operands are all evaluated before the operator itself.

The C++ built-in (two-argument) Boolean operators (e.g., && and ||) as well as <type_traits>’s std::conjunction and std::disjunction operations are all short-circuiting, i.e., if the value of an earlier (from left-to-right) operand of an operation determines the result of the operation, then all remaining arguments are not evaluated.

<!--
Conjunction and disjunction operate at compile time and the short-circuiting is about template 
instantiations that might lead to compile errors otherwiese. This is not a runtime safety issue. I 
suggest dropping that (Peter)_
-->

Typically this allows one to write code like this, e.g.,

  int *p;
  // ...
  if (p != nullptr && *p != 0) {
    /* do something */
  }

i.e., if p is nullptr, then *p != 0 is never executed, thus avoiding undefined behaviour. Only when p is not nullptr is *p != 0 evaluated. It must be stressed that this only applies to the built-in && and || operators: user-defined operator overloads as functions always evaluate all operands first.

Consequently should one want to always evaluate all operands of a boolean expression, one should not write code like this:

bool x = foo() && bar();

where foo() and bar() are functions that return something convertible to bool. In this expression, if foo() returns false, then bar() will never be executed; only when foo() returns true will bar() be executed. Similarly for ||:

bool y = foo() || bar();

i.e., only when foo() returns false will bar() be executed; if foo() returns true, then bar() will never be executed. Thus, if both foo() and bar() are required to be executed, then execute them in separate statements first, e.g.,

bool foo_result = foo();
bool bar_result = bar();
bool x = foo_result && bar_result;
bool y = foo_result || bar_result;
<!--
Stephen: My write-up here is lengthy but should help get more terse
wording\... but I note this: C++ operator information is in C++17 Clause
8 and Clause 16.5, \... Also per 16.5.1 para 2. unary and binary forms
of the same operator are considered to have the same name so one can
hide another from an enclosing scope. Thus, this is also another
possible vulnerability.\]
-->

6.24.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.25 Likely Incorrect Expression [KOA]

6.25.1 Applicability to language

The vulnerability as described in ISO/IEC TR 24772-1:2019 clause 6.25 exists in C++.

C++ has several instances of operators which are similar in structure, but different in meaning. Examples of operators in C-based languages that can cause confusion are:

The typographical similarity can lead to code like the following, where it is unclear whether the expression as written is actually intended or whether the author mistyped a different operator:

auto f(unsigned i, unsigned j)
{
  return (i > 1) & (j = 1); // (>>, &&, ==)?
}

The following code in a production phone OS caused the “bricking” of many users’ phones:

if (key_data_.has_value() & !key_data_->label().empty())

instead of

if (key_data_.has_value() && !key_data_->label().empty())

or, even clearer, using the alternative token and in place of &&:

if (key_data_.has_value() and !key_data_->label().empty())

As a general rule, the use of =, +=, or -= in an expression when the operator is not the final assignment to a variable is unsafe, since the assignment operator creates side effects within the expression which are difficult for a human reader to analyze and can have different results depending upon the order of evaluation of terms within the expression.

But even in an assignment expression, swapping the assignment symbol and the operator can itself lead to valid code that was not intended:

int i{42};

i += 22; // i becomes 64
i =+ 22; // i becomes 22
i =- 22; // i becomes -22

C++ provides significant freedom in constructing statements. This freedom, if misused, can result in unexpected results and potential vulnerabilities.

Since the order of evaluation within expressions is only partially defined, sub-expressions with side effects on variables used within the overall expression can result in undefined behaviour.

The flexibility of C++ can obscure the intent of a programmer. Consider:

int x,y;
/* ... */
if (x = y){
  /* ... */
}

A fair amount of analysis may need to be done to determine whether the programmer intended to do an assignment as part of the if statement (valid in C++) or whether the programmer made the common mistake of using an = (assignment) instead of a == (equality).

This confusion can be corrected by moving assignments outside of Boolean contexts. This would change the example code to:

int x,y;
/* ... */
x = y;
if (x != 0) {  // same condition as the original if (x = y)
  /* ... */
}

This would clearly state what the programmer meant and that the assignment of y to x was intended.

Additional confusion occurs in the use of the logical && or || operators and the bitwise & or | operators. The compiler will implicitly convert arithmetic expressions to bool for operands of the logical operators. Similarly, operands of bool type will be promoted to int for operands of the bitwise operators (see Conversion Errors [FLC]).
It may not be clear whether the programmer intended to use the logical operator && or bitwise operator & instead:

unsigned f(unsigned i, unsigned j)
{
  return (i > 0) & j;
}
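To make the difference concrete, the following sketch compares the bitwise form of the function above with a logical form. The function names bitwise_and and logical_and are hypothetical; the first has the same body as f.

```cpp
#include <cassert>

// Bitwise version, as in the example above: the bool (i > 0) is promoted to
// the integer 0 or 1 and combined bit by bit with j.
unsigned bitwise_and(unsigned i, unsigned j) { return (i > 0) & j; }

// Logical version: j is first converted to bool (non-zero becomes true).
bool logical_and(unsigned i, unsigned j) { return (i > 0) && j; }

void demo_and_operators() {
    assert(bitwise_and(1, 2) == 0);     // 1 & 2: no bit in common
    assert(logical_and(1, 2) == true);  // true && true
    assert(bitwise_and(1, 3) == 1);     // 1 & 3: lowest bit set in both
}
```

For the arguments 1 and 2 the two forms disagree, so a reader cannot tell from the result alone which operator was intended.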

Using the alternative tokens and / or in lieu of && and || reduces the possibility of confusion. Similarly, a not_eq b is preferable to a != b since the latter is easily confused with the equally valid expression a |= b.

Programmers can easily get in the habit of inserting the ; statement terminator at the end of statements. However, inadvertently doing this can drastically alter the meaning of code, even though the code is valid as in the following example:

int a,b;
/* ... */
if (a == b);  // the semi-colon will make the following code always execute
{
  /* ... */
}

Because of the misplaced semi-colon, the code block following the if will always be executed. In this case, it is extremely likely that the programmer did not intend to put the semi-colon there.

Unary `+`{.cpp} on a variable is (almost) a no-op, and is possibly a mistype of `++`{.cpp}. A unary `-`{.cpp} on a variable will switch its sign, unless applied to a variable of an unsigned type, in which case the result is the value subtracted from 2^n (reduced modulo 2^n, so that negating 0 yields 0), where n is the number of bits in the unsigned type.
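The wrap-around of unary minus on an unsigned operand can be checked directly. This is a small sketch for the type unsigned; the helper name negated is hypothetical.

```cpp
#include <cassert>
#include <limits>

// Negating an unsigned value wraps modulo 2^n instead of producing a
// negative number.
constexpr unsigned negated(unsigned v) { return -v; }

static_assert(negated(1u) == std::numeric_limits<unsigned>::max());
static_assert(negated(0u) == 0u);

void demo_unary_minus() {
    unsigned u = 5;
    assert(-u + u == 0u);  // -u is 2^n - 5, so adding 5 wraps back to 0
    assert(+u == 5u);      // unary + leaves the value unchanged
}
```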

C++ overloading of operators can also cause confusion.

The language does not impose any restrictions on semantics of overloaded operators. This can cause (potentially generic) code to behave in completely unobvious ways, when such types with “unusual” operator semantics are used.

For example, the boost.spirit library allows code like the following to create parser rules:

r = real_p >> *(ch_p(',') >> real_p); // rule that accepts a comma-separated list of real numbers

This library uses C++ operator overloads to create an embedded domain-specific language for grammar rules, allowing the specification of parser rules as C++ expressions.

When overloaded, related operators, such as a compound assignment and its base operator, are no longer guaranteed to keep the behavioural relationship that they have for built-in types. For example, a += b is not guaranteed to behave like a = a + b, or to be defined at all.

Similarly for overloaded relational operators, for a == b, there is no guarantee that a != b is equivalent to !(a == b) if both are overloaded by the user.

Unless all relational operators for a type are defined either explicitly in a consistent way or implicitly, unexpected results can occur. A user-declared three-way comparison operator (<=>) is used by the compiler to synthesize the relational operators consistently. If operator<=> is defined as =default, the equality comparison operators will also be defined; and if operator== with return type bool is defined, a corresponding inequality operator!= is also defined implicitly.
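The following sketch shows how independently written == and != operators can disagree, and one way to keep them consistent by deriving != from ==. The types Inconsistent and Consistent are hypothetical and exist only for this illustration; where C++20 is available, the defaulted comparison operators described above achieve the same consistency with less code.

```cpp
#include <cassert>

// Hypothetical type whose == and != were written independently and disagree.
struct Inconsistent {
    int id;
    int revision;
};
bool operator==(Inconsistent a, Inconsistent b) { return a.id == b.id; }
bool operator!=(Inconsistent a, Inconsistent b) {
    return a.id != b.id || a.revision != b.revision;  // compares more fields
}

// Hypothetical type that derives != from ==, so the two cannot disagree.
struct Consistent {
    int id;
    int revision;
};
bool operator==(Consistent a, Consistent b) {
    return a.id == b.id && a.revision == b.revision;
}
bool operator!=(Consistent a, Consistent b) { return !(a == b); }

void demo_relational_consistency() {
    Inconsistent x{1, 1}, y{1, 2};
    assert(x == y);
    assert(x != y);  // both hold at once: surprising for callers

    Consistent p{1, 1}, q{1, 2};
    assert((p != q) == !(p == q));
}
```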

6.25.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

— Use the avoidance mechanisms of ISO/IEC 24772-1 clause 6.25.5.

6.26 Dead and Deactivated Code [XYQ]

6.26.1 Applicability to language

The vulnerability as documented in ISO/IEC 24772-1 clause 6.26 exists in C++.

The language mechanisms around templates and overload resolution can require definitions to exist that are not part of the executable program. The mechanisms at compile time guarantee that the corresponding code never becomes part of the executable program. However, a programmer might be unaware of all details of these language mechanisms and thus make subtle errors that lead to unintended code being selected for the executable program.

If there is code that was once needed or might be needed in the future, programmers might opt to comment it out or to use preprocessor conditional compilation to exclude such parts. The latter can be particularly confusing, because an intentionally undefined macro might be defined for a specific compilation outside of the program source text. Modern version control systems are better places to keep unused code, in a revision or branch, and to resurrect it if needed through a merge.

6.26.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.27 Switch Statements and Static Analysis [CLL]

6.27.1 Applicability to language

The vulnerability as described in ISO/IEC TR 24772-1:2019 clause 6.27 exists in C++.

Because of the way in which the switch-case statement in C++ is structured, it can be relatively easy to unintentionally omit the break statement between cases causing unintended execution of statements for some cases.

The switch statement has the form:

    int abc = someExpression();
    /* ... */
    switch (abc) {
       case 1:
           sval = "a";
           break;
       case 2:
           sval = "b";
           break;
       case 3:
           sval = "c";
           break;
       default:
           throw SomeException();
    }

If there isn’t a default case and the switched expression doesn’t match any of the cases, then control simply shifts to the next statement after the switch statement block. Unintentionally omitting a break statement between two cases will cause subsequent cases to be executed until a break or the end of the switch block is reached. This could cause unexpected results.

The attribute [[fallthrough]] expresses the programmer’s intent that the code where it is placed is intended to fall through. If this attribute is not used, compilers typically diagnose the absence of a break statement.
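A minimal sketch of an intended fall-through marked with the attribute is shown below. The function rights_for_level and its meaning (higher levels include the rights of lower levels) are hypothetical and chosen only to give the fall-through a purpose.

```cpp
#include <cassert>

// Hypothetical example: count how many access rights a level grants.
// Higher levels include all rights of the lower levels, so the
// fall-through is intended and marked as such.
int rights_for_level(int level) {
    int rights = 0;
    switch (level) {
        case 3:
            ++rights;         // administrative right
            [[fallthrough]];  // intentional: level 3 also gets level 2 rights
        case 2:
            ++rights;         // write right
            [[fallthrough]];
        case 1:
            ++rights;         // read right
            break;
        default:
            break;            // unknown level: no rights
    }
    return rights;
}

void demo_fallthrough() {
    assert(rights_for_level(3) == 3);
    assert(rights_for_level(2) == 2);
    assert(rights_for_level(1) == 1);
    assert(rights_for_level(7) == 0);
}
```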

6.27.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

See also the C++ Core Guidelines ES.78

6.28 Demarcation of Control Flow [EOJ]

6.28.1 Applicability to language

The vulnerability as described in ISO/IEC TR 24772-1:2019 clause 6.28 exists in C++.

C++ lacks a keyword to be used as an explicit terminator. Therefore, it may not be readily apparent which statements are part of a loop construct or an if statement.

Consider the following sections of code:

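    int foo(int a, const int *b) {
        int i=0;
        // . . .
        a = 0;
        for (i=0; i<10; i++); // notice the ';' !!
        {
            a = a + b[i];     // executed once, after the loop, with i == 10
        }
        int c = 0;
        int x = 0;
        for (int j=0; j<10; j++)
            c = c + b[j];
            x += c;           // misleading indentation: not part of the loop
        return a + x;
    }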

At first it may appear that, after the first loop, a will be a sum of the numbers b[0] to b[9]. However, even though the code is laid out so that the a = a + b[i] code appears to be within the for loop, the “;” at the end of the for statement causes the loop to be on a null statement (the “;”) and the

a = a + b[i];

statement to only be executed once, after the loop has finished and with i equal to 10. Similarly, the indentation leads us to believe that the assignment to x is part of the second loop, but it is not. These mistakes may be readily apparent during development or testing. More subtle cases may not be as readily apparent, leading to unexpected results.

if statements in C++ are also susceptible to control flow problems since there isn’t a requirement in C++ for there to be an else statement for every if statement. An else statement in C++ always belongs to the most recent if statement without an else. However, the situation could occur where it is not readily apparent to which if statement an else belongs due to the way the code is indented or aligned.

Similar issues arise for if-statements, particularly during maintenance, for example:

```{.cpp}
int a,b,i;
// . . .
if (i == 10){
    a = 5;
    b = 10; // added later, but correct since within the {...}
}
else
    a = 10;
    b = 5;  // added later, intended to be part
            // of the else clause
```

If the assignments to b were added later and were expected to be part of each if and else clause (they are indented as such), the above code is incorrect: the assignment to b that was intended to be in the else clause is unconditionally executed.
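Enclosing both branches in braces makes the added assignment part of the else clause, as intended. The sketch below wraps the logic in a hypothetical function, assign_with_braces, purely so that the outcome can be checked.

```cpp
#include <cassert>
#include <utility>

// Returns {a, b} following the intended logic of the example above.
std::pair<int, int> assign_with_braces(int i) {
    int a = 0, b = 0;
    if (i == 10) {
        a = 5;
        b = 10;
    } else {
        a = 10;
        b = 5;  // now really part of the else branch
    }
    return {a, b};
}

void demo_braces() {
    assert(assign_with_braces(10) == std::make_pair(5, 10));
    assert(assign_with_braces(0) == std::make_pair(10, 5));
}
```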

6.28.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

See also the C++ Core Guidelines ES.85, ES.71, ES.74, ES.1 and ES.2

6.29 Loop Control Variables [TEX]

6.29.1 Applicability to language

The vulnerability as described in ISO/IEC TR 24772-1:2019 clause 6.29 exists in C++.

C++ allows the modification of loop control variables within non range-based loops. This is usually not considered good programming practice as it can cause unexpected problems. The flexibility of C++ expects the programmer to use this capability responsibly.

Since the modification of a loop control variable within a loop is infrequently encountered, reviewers of C++ code may not expect it and hence miss noticing the modification. Modifying the loop control variable can cause unexpected results if not carefully done. In C++, the following is semantically correct, but is error-prone:

int a;
for (int i = 1; i < 10; i++){
    // ...
    if (a > 7)
        i = 10;
    // ...
}

which will cause the for loop to exit once a is greater than 7 regardless of the number of iterations that have occurred.

for (int i : std::ranges::iota_view{1,10})
{
    if (a > 7) {
        i = 10;   // This changes the local variable for this loop iteration's execution
                  // but subsequent iterations are not affected
    }
}

for (int const i : std::ranges::iota_view{1,10})
{
    if (a > 7) {
        i = 10;   // This is now ill-formed since the 'int const' prevents assignment
    }
}

The range-based for examples immediately above do not have the vulnerability of the C-like for loop above.

for (int i=1; i < 10; ++i) ...

In a range-based for loop, the underlying loop control variable (the iterator) is not accessible to the program.

std::array a {3, 1, 4, 1, 5};
for (auto const x : a) {
    std::cout << x << '\n';
}
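When a loop has to end early, a break statement states this directly and leaves the loop control mechanism untouched. The following is a minimal sketch; the function sum_until_exceeds and its parameters are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Sums values until the running total exceeds a limit, leaving the loop with
// break instead of assigning to a loop control variable.
int sum_until_exceeds(const std::vector<int>& values, int limit) {
    int total = 0;
    for (int const v : values) {  // v cannot be modified inside the loop
        total += v;
        if (total > limit) {
            break;                // the exit condition is explicit
        }
    }
    return total;
}

void demo_loop_exit() {
    assert(sum_until_exceeds({3, 1, 4, 1, 5}, 7) == 8);     // stops after 3+1+4
    assert(sum_until_exceeds({3, 1, 4, 1, 5}, 100) == 14);  // runs to the end
}
```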

6.29.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

Note: See also the C++ Core Guidelines ES.71, ES.86.

6.30 Off-by-one Error [XZH]

6.30.1 Applicability to language

The vulnerability as described in ISO/IEC TR 24772-1:2019 clause 6.30 exists in C++.

Arrays are a common place for off by one errors to manifest. In C++, arrays are indexed starting at 0, causing the common mistake of looping from 0 to the size of the array as in:

int foo() {
    int a[10];
    int i;
    for (i=0; i<=10; i++)   // off by one: i == 10 is past the last element a[9]
      ...
    return (0);
}

C++ mitigates the issue of sentinel values in strings documented in ISO/IEC 24772-1 clause 6.30 by providing the string class and the string_view class.

C++ does not flag accesses outside of array bounds, so an off by one error may not be as detectable in C++ as in some other languages. Several good and freely available tools can be used to help detect accesses beyond the bounds of arrays that are caused by an off by one error. However, such tools will not help in the case where only a portion of the array is used, and the access is still within the bounds of the array.
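Several library and language facilities make it harder to write a wrong loop bound in the first place. The sketch below shows three of them for a std::array: a range-based for loop, an index loop bounded by size(), and the checked accessor at(). The function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>

int sum_all(const std::array<int, 10>& a) {
    int sum = 0;
    for (int const v : a) {  // visits exactly a.size() elements
        sum += v;
    }
    return sum;
}

// Index-based variant: the bound comes from the container and the
// comparison is '<', not '<='.
int sum_indexed(const std::array<int, 10>& a) {
    int sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += a[i];
    }
    return sum;
}

// at() checks the index and throws std::out_of_range instead of reading
// past the end of the array.
bool access_is_rejected(const std::array<int, 10>& a, std::size_t index) {
    try {
        (void)a.at(index);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

void demo_bounds() {
    std::array<int, 10> a{};
    a.fill(1);
    assert(sum_all(a) == 10);
    assert(sum_indexed(a) == 10);
    assert(access_is_rejected(a, 10));  // one past the last valid index
    assert(!access_is_rejected(a, 9));
}
```

None of these helps when only part of an array is in use and the erroneous index still lies within the array, as noted above.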

C++ mitigates these issues by providing

6.30.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

See also the C++ Core guidelines ES.1, ES.42, ES.71, SL.con.3 (more to come)

6.31 Structured Programming [EWD]

6.31.1 Applicability to language

The vulnerability as described in ISO/IEC TR 24772-1:2019 clause 6.31 exists in C++.

It is as easy to write structured programs in C++ as it is not to. C++ contains the goto statement, which can create unstructured code. It also has continue, break, and return that can create a complicated control flow, when used in an undisciplined manner. Spaghetti code can be more difficult for static analyzers to analyze and is sometimes used on purpose to intentionally obfuscate the functionality of software. Code that has been modified multiple times by an assortment of programmers to add or remove functionality or to fix problems can be prone to become unstructured.

Because unstructured code can cause problems for both automated and human analyzers, problems with the code may not be detected as readily, or at all, as would be the case if the software were written in a structured manner.

In C++, the break and continue operations only act on the innermost loop. At times, escape from nested loops is required. In such cases, the use of goto may be simpler and easier to verify than a series of tests with break and/or continue operations.
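The two common ways of leaving a nested loop in one step can be compared side by side. This is a sketch under the assumption that the search can be placed in its own function; the alias Grid and both function names are hypothetical.

```cpp
#include <array>
#include <cassert>

using Grid = std::array<std::array<int, 3>, 3>;

// Variant with goto: a single forward jump leaves both loops.
bool contains_with_goto(const Grid& g, int wanted) {
    bool found = false;
    for (const auto& row : g) {
        for (int const v : row) {
            if (v == wanted) {
                found = true;
                goto done;
            }
        }
    }
done:
    return found;
}

// Variant with a dedicated function: return leaves both loops.
bool contains_with_return(const Grid& g, int wanted) {
    for (const auto& row : g) {
        for (int const v : row) {
            if (v == wanted) {
                return true;
            }
        }
    }
    return false;
}

void demo_nested_loop_exit() {
    Grid g{};  // all elements are zero
    g[1][2] = 42;
    assert(contains_with_goto(g, 42));
    assert(contains_with_return(g, 42));
    assert(!contains_with_goto(g, 7));
    assert(!contains_with_return(g, 7));
}
```

Both variants avoid the extra flag tests that a solution based only on break would need in the outer loop.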

The setjmp macro saves the current execution context into a variable, which can be used later to return to that context using a longjmp call. These calls originated from the C standard library to mimic goto across the call stack. They do not support the relevant additions to C++ such as destructors for automatic objects, exceptions, and concurrency, and hence are incompatible with modern C++ programming.

A coroutine is a function that can suspend execution for later resumption (optional).

6.31.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

See also the C++ Core guidelines ES.76, ES.77, SL.C.1

6.32 Passing Parameters and Return Values [CSJ]

6.32.1 Applicability to language

The vulnerability as described in ISO/IEC TR 24772-1:2019 clause 6.32 exists in C++. However, the language also provides appropriate mitigation.

C++ provides both call by copy (aka call by value) and call by reference parameter passing. The argument is evaluated to initialize the formal parameter (in the first case) or bound to the formal parameter (in the second case) of the function that is being called. A formal parameter behaves like a local variable; however, changes to a non-const reference parameter affect the bound object.

The rich type system of C++ allows types that when passed by value still have call by reference semantics, for example, pointer types, std::reference_wrapper, or class types with pointer or reference member variables.

C++ assumes that pointer or reference parameters of different types never alias, even if the underlying object representations are identical, i.e., for a function declared as void f(int *pi, long *pl) the compiler will assume that pi and pl always refer to different objects, even if sizeof(int) == sizeof(long). Two parameters may refer to the same object if they have pointer or reference type, and the target types are the same or related. This means, aliasing between reference parameters or with a reference result needs to be taken into account in user code. For example, in an assignment expression the left and right hand side can refer to the same object. This implies that user-defined assignment operators must take precautions against self-assignment or document that it is forbidden.
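A user-defined assignment operator that tolerates aliasing between its left and right operand could look like the following sketch. The class Buffer is hypothetical and deliberately minimal; it is not taken from any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical owning buffer, for illustration only.
class Buffer {
public:
    explicit Buffer(const char* text)
        : size_{std::strlen(text) + 1}, data_{new char[size_]} {
        std::memcpy(data_, text, size_);
    }
    Buffer(const Buffer& other)
        : size_{other.size_}, data_{new char[size_]} {
        std::memcpy(data_, other.data_, size_);
    }
    Buffer& operator=(const Buffer& other) {
        if (this != &other) {  // left and right operand may alias
            char* fresh = new char[other.size_];
            std::memcpy(fresh, other.data_, other.size_);
            delete[] data_;    // released only after the copy succeeded
            data_ = fresh;
            size_ = other.size_;
        }
        return *this;
    }
    ~Buffer() { delete[] data_; }
    const char* c_str() const { return data_; }

private:
    std::size_t size_;  // declared before data_, so it is initialized first
    char* data_;
};

void demo_self_assignment() {
    Buffer a{"abc"};
    Buffer& alias = a;
    a = alias;  // self-assignment through an alias
    assert(std::strcmp(a.c_str(), "abc") == 0);

    Buffer b{"xyz"};
    a = b;
    assert(std::strcmp(a.c_str(), "xyz") == 0);
}
```

Without the precaution, an implementation that deletes its own storage before copying from the right operand would read freed memory when both operands are the same object.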

Modern C++ ensures that in many cases the need for and overhead of copying value arguments or results is elided by the compiler, especially from temporary objects.

The use of const lvalue-reference parameters combines the efficiency of call by reference with the guarantee that the underlying input parameter is not changed (marking it as an in parameter). A non-const reference parameter must be considered an inout parameter. Rvalue-reference parameters are inout parameters that allow transfer-of-ownership semantics. At their call site it is best to assume that the argument object is in an indeterminate state and has to be reassigned before subsequent use. There is no language mechanism for marking out parameters; the return mechanism is used instead. Instead of multiple out parameters, a struct, std::pair, or std::tuple can be used as a return type and decomposed at the call site into its constituents via a structured binding.
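Returning several results through a struct and decomposing them with a structured binding can be sketched as follows. The type DivisionResult and the function divide are hypothetical; the sketch assumes a non-zero divisor.

```cpp
#include <cassert>

// Hypothetical result type replacing two out parameters.
struct DivisionResult {
    int quotient;
    int remainder;
};

// Inputs are passed by value; both results travel back through the return
// value. Precondition (assumed for this sketch): divisor != 0.
DivisionResult divide(int dividend, int divisor) {
    return {dividend / divisor, dividend % divisor};
}

void demo_structured_binding() {
    auto [q, r] = divide(17, 5);  // decomposed at the call site
    assert(q == 3);
    assert(r == 2);
}
```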

Member functions take the *this object as an implicit reference parameter. The kind of reference can be specified through qualification of the member function. However, in addition to lvalue-reference, const-lvalue-reference, and rvalue-reference qualification, there exists an oddity with respect to normal reference parameters:

This means, unqualified member functions are callable on temporaries (rvalues) and thus can have side effects, but also can return an lvalue-reference to said temporary by returning *this (or members of *this), which can lead to dangling if such a reference is used beyond the expression of the function call returning it. For example, the compiler-provided assignment operators of a class are unqualified member functions that return an lvalue-reference to *this.

Rvalue-reference parameters in a context where their actual type is deduced from the call site are called forwarding references. A forwarding reference will be deduced to either an lvalue-reference or an rvalue-reference, depending on the argument at the call site.
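The deduction can be observed with a type trait. In the sketch below, the function template bound_to_lvalue is a hypothetical helper that reports which kind of reference its parameter became.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// T&& with a deduced T is a forwarding reference.
// Returns true when the argument at the call site was an lvalue.
template <typename T>
constexpr bool bound_to_lvalue(T&&) {
    return std::is_lvalue_reference_v<T&&>;
}

void demo_forwarding_reference() {
    int x = 0;
    assert(bound_to_lvalue(x));              // T = int&, T&& collapses to int&
    assert(!bound_to_lvalue(42));            // T = int,  T&& is int&&
    assert(!bound_to_lvalue(std::move(x)));  // xvalue: also deduced as int&&
}
```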

Aliasing is expected and allowed in some cases, such as:

-   Assignment and compound assignment operators: the right parameter may alias the left parameter. The function result always refers to the left parameter unless overloaded differently. In the case of self-assignment the stored value should not change.

-   Functions that `swap`{.cpp} their parameters: The two parameters to be swapped may refer to the same object.

-   Shift operators used for input and output: the result always refers to the left parameter.

-   Prefix increment and decrement operators: the result always refers to the parameter.
    

The C++ preprocessor macros use a call by name parameter passing; a call to the macro replaces the macro by the body of the macro. This is called macro expansion. Macro expansion is applied to the program source text and amounts to the substitution of the formal parameters with the actual parameter expressions. Formal parameters are often parenthesized to avoid syntax issues after the expansion. Call by name parameter passing reevaluates the actual parameter expression each time the formal parameter is read.
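The re-evaluation of a macro argument becomes visible when the argument expression has a side effect. The following sketch uses a hypothetical macro SQUARE, a function square for comparison, and a counter named evaluations.

```cpp
#include <cassert>

// Hypothetical macro: the parameter is parenthesized, but x still appears twice.
#define SQUARE(x) ((x) * (x))

inline int square(int x) { return x * x; }  // function: argument evaluated once

int evaluations = 0;  // counts calls of next_value()

int next_value() {
    ++evaluations;
    return 3;
}

void demo_macro_expansion() {
    evaluations = 0;
    int by_macro = SQUARE(next_value());  // expands to ((next_value()) * (next_value()))
    assert(by_macro == 9);
    assert(evaluations == 2);             // argument expression evaluated twice

    evaluations = 0;
    int by_function = square(next_value());
    assert(by_function == 9);
    assert(evaluations == 1);             // argument expression evaluated once
}
```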

6.32.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.33 Dangling References to Stack Frames [DCM]

6.33.1 Applicability to language

The vulnerability as expressed in ISO/IEC TR 24772-1:2019 and ISO/IEC TR 24772-3:2020 C exists in C++ by indirect access to variables with automatic storage duration or to temporary objects.

The lifetime model of C++ makes it undefined behaviour (see subclause [EWF]) to access an object outside of its lifetime, for example after its destruction. C++ provides a rich set of pointer-like types whose values may refer to temporaries or variables with automatic storage duration and can dangle (see Subclause [XYK]).

A C++ class type with a pointer-like member will behave as a pointer-like type, unless the class itself manages the lifetime of the object referred to by its member.

In general, any caller storing the pointer-like object returned from a function call risks dangling; such situations require thorough lifetime analysis to ensure that access via the pointer-like object doesn’t dangle.

The efficiency of return-by-value, copy-elision, and move-semantics as specified by C++ reduces the incentive to return a pointer-like type from a function or bind a temporary to a local reference.

The lifetime of a temporary object usually ends at the end of a full expression where it was created. Dangling can occur, when an expression including the creation of a temporary object results in a pointer-like value referring to the temporary object. For example, std::max returns the const-reference given as parameter, which might be bound to a temporary argument:

int g(int i){
    int const &m = std::max(i,20);
    return m; // access dangling reference to temporary if i < 20
}
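One way to avoid the dangling reference in the function above is to copy the result of std::max into an object instead of binding a reference to it. This is a minimal sketch; the function name at_least_twenty is hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// The result is copied while the temporary holding 20 is still alive, so
// nothing refers to the temporary after the full expression ends.
int at_least_twenty(int i) {
    int const m = std::max(i, 20);
    return m;
}

void demo_max_by_value() {
    assert(at_least_twenty(5) == 20);
    assert(at_least_twenty(42) == 42);
}
```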

In some situations binding a reference to a temporary will extend the lifetime of the temporary.

This lifetime extension is not transitive across function calls, therefore, changes in the code, such as replacing a data member access with an accessor member function, can silently lead to dangling in such lifetime-extension situations.

struct A{
    int a;
    int const &getA(){return a;}
};
void h(){
    int && ra = A{42}.a; // lifetime extended
    int const & cra = A{42}.getA(); // dangling
}

The range-based for statement contains a subtle situation with lifetime extension.
A temporary in the range expression will have its lifetime extended, unless it is accessed indirectly. As a mitigation C++ permits the creation of a variable for such situations that has the scope of the range-for loop, as shown in the following example:

extern std::vector<std::string> make(); // creates a vector

for(char c : make().front()) { // attempt to iterate over first string in vector
   // vector and thus contained string is already destroyed before C++23
}

for(auto range = make().front(); char c : range){ // mitigation, create a variable for the range to be iterated over
  // string to be iterated over remains valid throughout
}

This issue is no longer present from C++23 onwards, as temporaries within the for-range-initializer are lifetime extended until the end of the statement.

Returning a pointer-like object from a function is problematic if the return value refers to a temporary or an object with automatic storage duration, either directly or indirectly. The following examples show different situations with this problem:

int *bad_pointer() {
  int a = 0;
  return &a;      // Returning the address of a local variable "a".
}

int& bad_reference(int b) {
  return b;      // Returning a reference to a local (parameter) variable "b".
}

std::array<int,3>::iterator bad_iterator() {
  std::array<int,3> c = { 1, 2, 3 };
  return c.begin();
  // Returning an iterator that refers to the first element of the local array "c".
}

auto bad_lambda() {
    int d = 0;
    return [&] { return d = 1; };
    // Returning a lambda that captures local variable "d" by reference
    // and thus indirectly returns a reference to the local variable
}

decltype(auto) bad_assign(){ // deduces: std::string &
    return std::string{} = "hello\n"s;
    // Returns reference to temporary object returned from its assignment operator
}

void erroneous_use() {
  std::cout << *bad_pointer();
  std::cout << bad_reference(42);
  std::cout << *bad_iterator();
  std::cout << bad_lambda()();
  std::cout << bad_assign();
}

In the examples above, the function bad_assign returns a std::string & that was itself returned from an assignment operator of std::string. Such an assignment operator (including the compiler-provided ones) can be called with a temporary as its left-hand operand, because it is an unqualified member function (for historical reasons).

Dangling may occur by calling a member function on a temporary that returns a pointer-like object referring to *this, a sub-object of *this, or an object managed by *this. This can be prevented by:

-   For a non-const member function: adding an lvalue ref-qualification (&).

-   For a const member function: adding an lvalue ref-qualification (const &) and declaring an rvalue ref-qualified overload (&&) either defined as =delete or declared to return by value.

In the following example, class nta declares its copy assignment with lvalue ref-qualification to avoid the situation created in the example function bad_assign:

struct ta{}; // default allows assignment to temporary
struct nta{
    nta & operator=(nta const &) & = default; // lvalue-ref qualified
};
ta & check_ta(){
    return ta{} = ta{}; // returns dangling reference to temporary
}
nta & check_nta(){
    return nta{} = nta{}; // won't compile
}
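The same ref-qualification technique applies to accessor functions. The sketch below uses a hypothetical class Person whose accessor returns a reference when called on an lvalue and a value when called on a temporary.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class, for illustration only.
class Person {
public:
    explicit Person(std::string name) : name_{std::move(name)} {}

    // Called on lvalues: returning a reference is safe while the object lives.
    std::string const& name() const& { return name_; }

    // Called on rvalues (temporaries): return by value, so nothing dangles.
    std::string name() && { return std::move(name_); }

private:
    std::string name_;
};

void demo_ref_qualified_accessor() {
    Person p{"Ada"};
    std::string const& from_lvalue = p.name();  // refers to p's member
    assert(from_lvalue == "Ada");

    // The && overload returns a std::string by value; binding it to a
    // reference extends the lifetime of that returned object.
    std::string const& from_temporary = Person{"Grace"}.name();
    assert(from_temporary == "Grace");
}
```

Had name() been a single unqualified const member function returning a reference, the second binding would refer to a member of an already destroyed temporary.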

Referring to a variable with automatic storage duration from a pointer-like variable with static or thread-local storage duration usually leads to dangling when the indirect access happens.

int const init{42};
std::reference_wrapper<int const> bad_ref = init; // static storage duration
void bad_global_assign(){
    if (bad_ref == 42){ // undefined behavior on 2nd call
       int local{44};
       bad_ref = local; // Any further access of bad_ref dangles
    }
}

A class type with pointer-like members can lead to dangling when those members refer to constructor arguments.

struct X{
    int const &rci;
    X(int i):rci{i}{} // No lifetime extension of parameter object by binding reference to it
};

Similarly, in the following example the vulnerability exists in the conversion operator string_view() of std::string, that returns a pointer-like type from a member function callable on a temporary object.

std::string_view bad_var("a string"s); // dangling view on temporary string object

6.33.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.34 Subprogram Signature Mismatch [OTR]

6.34.1 Applicability to language

In general, there must be a match between the number of arguments in a function call and the number of parameters in the function definition. For issues related to macro signatures, see subclause Pre-processor directives [NMP].

The number of arguments can be different to the number of parameters in a function where:

-   a function template includes a function parameter pack, or

-   a function parameter includes a default argument, or

-   a function parameter-declaration-clause ends with an ellipsis, f(\...).

Calling a function template with a function parameter pack results in a specialization of the function with the parameter types matching the corresponding argument types.

The compiler will ensure for variadic templates that the type and number of arguments is correct.

A call to a function with default arguments can provide fewer arguments than parameters as long as the parameters for which no explicit argument is provided include a default argument.

Where a function parameter-declaration-clause ends with an ellipsis, additional arguments can be accessed through the mechanisms provided by <cstdarg>. No information about the number or types of the parameters is supplied by the compiler. The use of this feature outside of special situations can be the basis for vulnerabilities.

Undefined behavior can arise, for example:

#include <cstdarg>

void f1 (int cnt, ...)
{
  va_list ap;
  va_start (ap, cnt);

  short i = va_arg (ap, short);  // Invalid type: a short argument is promoted to int

  va_end(ap);
}

#include <cstdarg>

void f1 (int cnt, ...)
{
  va_list ap;
  va_start (ap, cnt);

  int i = va_arg (ap, int);
  int j = va_arg (ap, int);

  va_end(ap);
}

void f2 ()
{
  f1 (1, 2, 3);  // OK
  f1 (1, 2);     // results in undefined behaviour
}

These issues cannot occur where default arguments or variadic function templates are used.
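A type-safe replacement for a C-style variadic function can be sketched with a variadic function template and a fold expression. The names sum_ints and scale are hypothetical, and the restriction to int arguments is an assumption made for this illustration.

```cpp
#include <cassert>
#include <type_traits>

// The compiler knows the number and the types of all arguments, so a call
// with a wrong argument type is rejected at compile time.
template <typename... Ints>
constexpr int sum_ints(Ints... values) {
    static_assert((std::is_same_v<Ints, int> && ...),
                  "all arguments must be int");
    return (0 + ... + values);
}

// A default argument lets callers omit a trailing argument safely.
constexpr int scale(int value, int factor = 2) { return value * factor; }

void demo_variadic_template() {
    assert(sum_ints() == 0);
    assert(sum_ints(1, 2, 3) == 6);
    assert(scale(5) == 10);
    assert(scale(5, 3) == 15);
}
```

A call such as sum_ints(1, 2.5) fails to compile, whereas the corresponding mismatch in a function using va_arg would go undiagnosed.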

The C++ name mangling ensures that function signatures match across translation units.

This does not apply to other mangling schemes. For example, parameters do not form part of the mangled name for functions declared with the extern "C" linkage specification. Thus such a function can be invoked with incorrect parameter types due to an incorrect redeclaration of the function:

// library.cc
extern "C" void foo (unsigned, unsigned)
{
  // ...
}
// main.cc

extern "C" void foo (unsigned);

int main ()
{
  foo (0xffffffff);  // Calling function that is
                     // defined to take 2 parameters
}

6.34.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

Note: See also C++ Core Guidelines F.55.

6.35 Recursion [GDL]

6.35.1 Applicability to language

C++ permits recursion, hence is subject to the problems described in ISO/IEC 24772-1 clause 6.35.

C++ allows recursive constexpr functions and consteval functions that are evaluated at compile time where such calls don’t contribute to the vulnerability.
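A recursive function evaluated at compile time can be sketched as follows; the function factorial is a hypothetical example.

```cpp
#include <cassert>

constexpr unsigned long long factorial(unsigned n) {
    return n <= 1 ? 1ULL : n * factorial(n - 1);
}

// Evaluated by the compiler: no run-time stack is consumed, and exceeding
// the implementation's evaluation limits is reported at compile time.
static_assert(factorial(10) == 3628800ULL);

void demo_constexpr_recursion() {
    constexpr auto f5 = factorial(5);  // forced compile-time evaluation
    assert(f5 == 120);
}
```

The same function called with an argument known only at run time is an ordinary recursive call and remains subject to the vulnerability.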

6.35.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can follow the avoidance mechanisms of ISO/IEC 24772-1 clause 6.35.5.

6.36 Ignored Error Status and Unhandled Exceptions [OYB]

6.36.1 Applicability to language

The vulnerabilities described in ISO/IEC 24772-1:2019 clause 6.36 exist in C++, however, C++ provides a mitigation.

C++ includes the C library, especially the header <cerrno> and thus shares C’s issues with the global error-reporting variable errno. See ISO/IEC TR 24772-3:2020, clause 6.36 for details and guidance.

In addition to errno, some C++ library features expose error conditions indirectly via a side effect on the object the operation failed with or via a side effect on a reference parameter. For example, input stream objects will go into a fail state when formatted input cannot be performed. Without resetting the fail state of a stream, further input will continue to fail, which can cause follow-on failures when the original failure is ignored. Streams provide a non-default mode to throw exceptions on failure. Another example of error reporting via a side effect is the filesystem library, which provides overloads that take a non-const reference to std::error_code.
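The sticky fail state of an input stream can be demonstrated with a string stream. The following is a minimal sketch; the input text "oops 42" is made up for the illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

void demo_stream_fail_state() {
    std::istringstream in{"oops 42"};
    int value = 0;

    in >> value;        // "oops" is not a number
    assert(in.fail());  // the stream records the failure ...

    in >> value;        // ... and every further read fails as well
    assert(in.fail());

    in.clear();         // reset the error state
    std::string skipped;
    in >> skipped;      // consume the offending token
    in >> value;
    assert(!in.fail());
    assert(skipped == "oops");
    assert(value == 42);
}
```

Checking the stream state after each extraction, or enabling exceptions on the stream, prevents the first failure from silently affecting all later reads.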

In general, errors reported as side effects, in the worst case via global state, are too easy for developers to ignore accidentally, leading to further consistency problems in the continued execution of the program.

By default, C++ has the C weakness of permitting the call to a function that returns an error code without capturing the return value in a variable.

errnum foo( int a, int b);
. . .
foo(x, y); // failure to capture the return error code.

C++ offers as a mitigating mechanism the [[nodiscard]] attribute. This attribute indicates that the function result must not be discarded. Ignoring the result of a function marked [[nodiscard]] causes a compiler warning.

[[nodiscard]] errnum foo(int a, int b);
. . .
foo(x, y);  // result discarded: the compiler is encouraged to issue a warning

if (auto e = foo(a, b); e == 0) {  // result used: no warning
  // success
}
else {
  // handle errors
}

In addition, the C++ library provides mechanisms to extend the return type of a function with extra values for denoting an error. The simplest case is std::optional<T>, which extends T with an “empty” state. Callers must check the result of functions returning an optional for the empty state before accessing its value. This increases the chances that a reported error is detected by the caller. If additional information about the error has to be returned to the caller, std::expected<T,AnErrorCode> (available from C++23) or alternatively std::variant<T,AnErrorCode> can be used.
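A minimal sketch of both styles follows, assuming C++17. The helper names parse_int and parse_int_or_error are hypothetical, and the choice of std::errc as the error type is an assumption made for illustration.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>
#include <variant>

// Hypothetical helper: an empty optional reports "not a valid integer".
inline std::optional<int> parse_int(std::string_view text) {
    int value = 0;
    char const* const end = text.data() + text.size();
    auto const [ptr, ec] = std::from_chars(text.data(), end, value);
    if (ec != std::errc{} || ptr != end) return std::nullopt;
    return value;
}

// Hypothetical helper: the variant carries either the value or an error code.
inline std::variant<int, std::errc> parse_int_or_error(std::string_view text) {
    int value = 0;
    char const* const end = text.data() + text.size();
    auto const [ptr, ec] = std::from_chars(text.data(), end, value);
    if (ec != std::errc{}) return ec;
    if (ptr != end) return std::errc::invalid_argument;  // trailing characters
    return value;
}

inline void usage_example() {
    if (auto const parsed = parse_int("42")) {  // empty state checked first
        assert(*parsed == 42);
    }
    auto const result = parse_int_or_error("4x2");
    assert(std::holds_alternative<std::errc>(result));
}
```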

C++ offers a set of library-defined exceptions for error conditions that may be detected by checks that are performed by the standard library. In addition, the programmer may define exceptions that are appropriate for their application. These exceptions are handled using an exception handler. Exceptions may be handled in the environment where the exception occurs or may be propagated out to an enclosing scope. Exceptions that are never handled in the program result in abnormal termination of the application. In this case, it is implementation-defined whether the destruction of local objects (stack unwinding) occurs. An unhandled exception that occurs in a thread also results in the abnormal termination of the application. See 6.62 Concurrency - Premature Termination [CGS] for issues related to thread or process termination. An exception propagating out of a coroutine causes the coroutine to end in an unresumable state and the exception is not further propagated.

6.36.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ developers can:

6.37 Type-breaking Reinterpretation of Data [AMV]

6.37.1 Applicability to language

The vulnerability as documented in ISO/IEC 24772-1 clause 6.37 applies to C++. The language mechanisms where type-breaking reinterpretation of data can happen involve unions and std::bit_cast. For type-breaking reinterpretation involving pointers or references see 6.11 Pointer Type Conversions [HFC].

In C++, casting vulnerabilities are mitigated by the fact that the C++ named casts, i.e., reinterpret_cast, const_cast, static_cast, dynamic_cast, and std::bit_cast, are easily identified, e.g., by text search, for more careful review. C-style casts do not share this property and therefore should be prohibited. In addition, C++ named casts include some compile-time checks, and in the case of dynamic_cast run-time checks, that help avoid some but not all vulnerabilities.

Of the available "_cast" operations, only std::bit_cast provides reinterpretation of data values, and only reinterpret_cast allows the reinterpretation of data as a different type within limits. For uses of reinterpret_cast see 6.11 Pointer Type Conversions [HFC]. The named casts static_cast, dynamic_cast, and const_cast perform type conversions and not reinterpretation of bits and thus do not have this vulnerability, but they are subject to potential conversion errors (see 6.6 Conversion Errors [FLC]).

Reading a union member that was not previously written is undefined behaviour except for a few cases described by ISO/IEC 14882:2020 clause [class.mem.general]. The reinterpretation of data values via different union members that is common practice in C is undefined behaviour in C++. The type std::variant provides a similar mechanism to union but prevents reading an inactive member.
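The following sketch illustrates how std::variant refuses access to an inactive member. The alias IntOrFloat and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <variant>

using IntOrFloat = std::variant<std::uint32_t, float>;

// Returns the integer alternative if it is active, otherwise a fallback.
inline std::uint32_t as_integer_or(IntOrFloat const& v, std::uint32_t fallback) {
    if (auto const* p = std::get_if<std::uint32_t>(&v)) return *p;
    return fallback;
}

// Unlike a union, a variant reports an attempt to read an inactive member.
inline bool reading_inactive_member_throws() {
    IntOrFloat v{1.0f};  // float is the active alternative
    assert(std::holds_alternative<float>(v));
    try {
        (void)std::get<std::uint32_t>(v);
    } catch (std::bad_variant_access const&) {
        return true;
    }
    return false;
}
```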

If there is no value of the target type corresponding to the bit representation of the source type’s value, using std::bit_cast is undefined behaviour, such as in the following example.

enum E { one, two, four=4 };
constexpr auto x = std::bit_cast<E>(42); // 42 is not representable by E

C++ also provides type traits in the header <type_traits>, such as std::is_layout_compatible, that can be used to help ensure the legality of a specific std::bit_cast.

A legacy means of “bit casting” is the use of memcpy to transfer the bytes of an object’s representation to another object of a different type. Except for cases that are well-defined with std::bit_cast, such use of memcpy to reinterpret data values is undefined behaviour.

6.37.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.38 Deep vs. Shallow Copying [YAN]

6.38.1 Applicability to language

The vulnerability described in ISO/IEC TR 24772-1:2019 clause 6.38 exists in C++. It arises only when there is a mismatch between the type’s copy semantics and the programmer’s intent.

On the language level, reference semantics, which can lead to shallow copies, usually require the use of pointer or reference types. However, an integral type can also have reference semantics, for example, when it is used as an index or as an operating system handle.

Such types with reference semantics are also called relationship types. They suffer from the aliasing problems of this vulnerability and additionally from potential dangling due to the expired lifetime of referred objects (see [XYK]).

In general, relationship types with an immutable referent, such as a const-reference, do not suffer the deep vs. shallow copying semantics problem, unless mixed with relationship types with a mutable reference to the same object. However, the lifetime of a const referent is still an issue to manage (see [XYK]).

The standard library type std::shared_ptr<T> has shallow copy semantics when the managed type is non-const, but in contrast to other relationship types guarantees the lifetime of the referent.

Class types that have relationship type members become relationship types themselves, unless the class provides deep copy semantics or disables copying and manages the lifetime of the referred object (manager type). Such relationship types and manager types refer to their referred or managed resources via a data member with reference semantics.

A manager type defines a non-empty, non-deleted destructor in addition to providing appropriate copy and move operations. Examples of manager types are the standard library container types, such as std::vector, which use pointers to the allocated space of their elements; copying a vector also copies all contained elements, not just the pointers. This management is achieved by replacing the compiler-provided copy constructor and copy-assignment operator with implementations providing value semantics that perform the deep copy (general manager). An alternative to potentially expensive deep copies for manager types is the prevention of copying, either by defining move operations that transfer the ownership of a managed resource, as std::unique_ptr does (unique manager), or by preventing both copy and move operations (scoped manager), for example, by defining the move-assignment operator as deleted.
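The difference between a relationship type, a general manager, and a unique manager can be sketched with standard library types only. The function names below are hypothetical and serve only to label the three cases.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Relationship type: copying a std::shared_ptr copies the reference,
// not the referent.
inline bool shared_ptr_copy_is_shallow() {
    auto original = std::make_shared<int>(1);
    auto copy = original;
    *copy = 2;  // visible through both pointers
    return *original == 2 && original.use_count() == 2;
}

// General manager: copying a std::vector copies the elements.
inline bool vector_copy_is_deep() {
    std::vector<int> original{1, 2, 3};
    auto copy = original;
    copy[0] = 99;  // the original is unaffected
    return original[0] == 1 && copy[0] == 99;
}

// Unique manager: std::unique_ptr cannot be copied, only moved.
inline bool unique_ptr_transfers_ownership() {
    auto first = std::make_unique<int>(5);
    auto second = std::move(first);
    assert(second != nullptr);
    return first == nullptr && *second == 5;
}
```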

Without such replacement of copy and move operations, a class type with relationship type members suffers from the potential confusion due to shallow copies. For example, the standard library types std::span, std::string_view, iterators, and the views of the standard ranges library are relationship types. Care must be taken to understand not only the implications of their shallow copy semantics, but also their validity, which depends on the lifetime of the referred ranges.

Using relationship types as function parameter types is usually safe, because language semantics guarantee the lifetime of parameter objects. Exceptions exist for thread functions and coroutines, where the initial calling context is not guaranteed to exist when parameters of relationship type are accessed.

Returning a relationship type from a function can be problematic, unless the lifetime of the referred object is clear. For example, returning a reference to a local variable will return a dangling reference (see [XYK] and [XYH]).

See also Core Guidelines C.20, C.22, C.32, C.67.

6.38.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.39 Memory Leak and Heap Fragmentation [XYL]

6.39.1 Applicability to language

The memory leak vulnerability documented in ISO/IEC TR 24772-1:2019 clause 6.39 exists in C++, unless the programmer takes steps to avoid it. Using standard library containers sidesteps most memory leak issues described in that document.

See ISO/IEC TR 24772-3 for issues associated with the C functions malloc(), calloc(), realloc() and free(). Because of the issues with these functions, C++ users should refrain from using these functions wherever possible.

C++ has an additional vulnerability in that it provides multiple alternatives for allocation and deallocation.

Failing to match the deallocation to the corresponding allocation causes undefined behaviour. For example, if an array new[] expression was used to allocate and create an array then array delete[] must be used for its destruction and release.

The C++ object lifetime model allows an object to be created in existing raw memory using non-allocating placement new. Such an object must be destroyed by calling its destructor explicitly; a call to delete causes undefined behaviour.
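A minimal sketch of this rule follows, assuming C++17. The type Tracked and its counter are hypothetical and exist only to make the object lifetime observable.

```cpp
#include <cassert>
#include <memory>  // std::destroy_at
#include <new>     // non-allocating placement new

// Hypothetical type that counts its live instances.
struct Tracked {
    static inline int live = 0;
    int value;
    explicit Tracked(int v) : value{v} { ++live; }
    ~Tracked() { --live; }
};

inline int live_objects_after_round_trip() {
    // Raw memory that was not obtained from a new expression.
    alignas(Tracked) unsigned char storage[sizeof(Tracked)];
    Tracked* p = ::new (static_cast<void*>(storage)) Tracked{7};
    assert(Tracked::live == 1 && p->value == 7);
    // Explicit destructor call; 'delete p' would be undefined behaviour.
    std::destroy_at(p);
    return Tracked::live;
}
```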

C++ destructors allow scope-based resource management that should be used to mitigate memory leaks. The standard library provides the class templates std::unique_ptr and std::shared_ptr for managing heap-allocated objects. To avoid issues with constructors throwing exceptions during heap allocation with a new expression and potentially causing leaks, these smart pointers should be obtained through the factory functions std::make_unique() or std::make_shared(). Using shared_ptr can cause memory leaks if it is used to create a cyclic data structure.
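The cyclic data structure problem can be sketched as follows. The type Node, its counter, and the function leaked_nodes are hypothetical; a real design would contain only one kind of back link. The sketch breaks the cycle manually at the end so that it does not itself leak.

```cpp
#include <cassert>
#include <memory>

// Hypothetical node type that counts its live instances.
struct Node {
    static inline int alive = 0;
    std::shared_ptr<Node> next;         // owning forward link
    std::shared_ptr<Node> owning_back;  // owning back link: forms a cycle
    std::weak_ptr<Node> weak_back;      // non-owning back link: no cycle
    Node() { ++alive; }
    ~Node() { --alive; }
};

// Returns how many nodes are still alive after all local owners are gone.
inline int leaked_nodes(bool cycle_of_owners) {
    int const before = Node::alive;
    std::weak_ptr<Node> observer;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        if (cycle_of_owners) b->owning_back = a;  // a owns b and b owns a
        else b->weak_back = a;
        observer = a;
    }
    int const leaked = Node::alive - before;
    if (auto a = observer.lock()) {
        a->next.reset();  // break the cycle so that this sketch does not leak
    }
    assert(Node::alive == before);
    return leaked;
}
```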

If using functions that manage memory using the C library mechanisms is unavoidable, wrapping such a pointer immediately into a specialization of std::unique_ptr<> that uses free() in its deleter object ensures that memory is correctly released when the unique_ptr is destroyed, for example:

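#include <cxxabi.h>     // abi::__cxa_demangle (Itanium C++ ABI, e.g., GCC and Clang)
#include <cstdlib>      // std::free
#include <memory>       // std::unique_ptr
#include <type_traits>  // std::remove_const_t

// Deleter that releases memory obtained from the C allocation functions.
struct free_deleter {
  template <typename T>
  void operator()(T *p) const {
    std::free(const_cast<std::remove_const_t<T>*>(p));
  }
};
template <typename T>
using unique_C_ptr = std::unique_ptr<T, free_deleter>;
//...
// abi::__cxa_demangle() returns a pointer to be released with free()
inline auto plain_demangle(char const *name) {
  unique_C_ptr<char const> result{abi::__cxa_demangle(name, nullptr, nullptr, nullptr)};
  return result;
}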

C++ allocators, e.g., as defined in header <scoped_allocator>, can be used to mitigate heap fragmentation and guarantee deterministic timing through specific allocation strategies, especially with standard library containers. The class hierarchy provided by the header <memory_resource> provides some advanced allocation strategies. Users of earlier C++ versions often overloaded operator new and operator delete to achieve similar results.
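As one possible illustration of such an allocation strategy, the sketch below serves all allocations of a container from a fixed buffer by means of std::pmr::monotonic_buffer_resource (C++17). The function name and the buffer size are assumptions chosen for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <numeric>
#include <vector>

inline int sum_with_fixed_buffer() {
    std::array<std::byte, 1024> buffer{};
    // All allocations are carved out of the buffer. The upstream resource
    // null_memory_resource() makes exhaustion throw std::bad_alloc instead
    // of silently falling back to the heap.
    std::pmr::monotonic_buffer_resource pool{
        buffer.data(), buffer.size(), std::pmr::null_memory_resource()};
    std::pmr::vector<int> values{&pool};
    values.reserve(16);  // a single allocation from the buffer
    for (int i = 1; i <= 16; ++i) values.push_back(i);
    assert(values.size() == 16);
    return std::accumulate(values.begin(), values.end(), 0);
}
```

A monotonic resource never reuses released memory before it is destroyed, so it suits bounded, short-lived work rather than long-running containers.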

The library functions std::construct_at() and std::destroy_at() are simpler than and preferable to using non-allocating placement new and manual destructor calls when those are needed.

6.39.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:


6.40 Templates and Generics [SYM]

6.40.1 Applicability to language

The vulnerability as described in ISO/IEC TR 24772-1 clause 6.40 exists in C++. C++ provides templates to support the generic programming methodology.

C++ provides templates for functions, classes (types), and variables (constants). In addition, one can form alias templates for class templates. Template parameters can be types, values (including addresses of global variables), and templates. C++ templates can have variadic template parameters, which means that any number of arguments of a given kind can be used. Concepts are templates that describe constraints on template arguments and can be used to define template parameters or other deduced contexts.

At compile-time, templates are instantiated with concrete template arguments. Function templates as well as class template constructors can deduce the concrete template argument from the types of the function arguments used in a call. For class templates in addition to the implicit deduction guides provided by its constructors, explicit deduction guides can be specified. This mechanism of template-argument deduction allows one to use templates without explicitly mentioning a template argument for each template parameter. For class templates, only those member functions get instantiated that are actually used. Each template instantiation is checked for syntax, concept and type errors.

Each instantiation of a template is compiled separately, which can cause different instantiations from the same original source code to call different actual functions because of overload resolution.

When a template is instantiated, several specializations can match the template arguments. The compiler selects among the different (partial) specializations according to a ranking. In case of ambiguity, compilation fails. This ranking can be influenced by concepts and SFINAE (Substitution Failure Is Not An Error). If the chosen specialization compiles but behaves differently than expected, this can be a source of programmer confusion.

Templates add another level of complexity to overload resolution.

In case of a function overload set that includes function templates, overload resolution happens before template specialization. This means, any desired behaviour through explicit function template specialization is not considered during overload resolution, only the primary template is used there.
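The effect can be sketched as follows. The function template name which is hypothetical. The explicit specialization belongs to the primary template (a), but overload resolution prefers the more specialized primary template (b), so the specialization is never called for an int* argument.

```cpp
#include <cassert>
#include <string>

template <typename T>
std::string which(T) { return "primary template"; }  // (a)

// Explicit specialization of (a) for T = int*.
template <>
std::string which<int*>(int*) { return "explicit specialization of (a)"; }

template <typename T>
std::string which(T*) { return "overload for pointers"; }  // (b)

inline std::string call_with_int_pointer() {
    int value = 0;
    // Overload resolution considers only the primary templates (a) and (b)
    // and selects (b); the specialization of (a) is not considered.
    return which(&value);
}

inline void usage_example() {
    assert(call_with_int_pointer() == "overload for pointers");
}
```

Providing an ordinary (non-template) overload for int* instead of an explicit specialization lets the intended function take part in overload resolution.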

Class template and variable template specializations can provide specific code for a given set of template arguments. Such specializations must be defined in the namespace of the primary template. To prevent confusion and different compilation of identical looking template instantiations, a specialization should either be defined in the same file as the generic template, or in case of a specialization for a specific template argument type, in the file of the definition of that type.

Functions and lambdas that define parameters with the use of auto are implicitly templates without using the template keyword.

Variables defined with the use of auto keyword get their concrete type deduced from their initializer, as if they were function template parameters.

A constructor template or assignment operator template is never a copy or move operation and hence does not prevent the implicit definition of copy or move operations, even if it looks similar.
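A minimal sketch of this rule, with a hypothetical type Tracker that counts how often its constructor template is used:

```cpp
#include <cassert>

struct Tracker {
    static inline int template_calls = 0;
    Tracker() = default;
    // Looks like a copy constructor for T = Tracker, but is never one.
    template <typename T>
    Tracker(T const&) { ++template_calls; }
};

inline int template_calls_for_copy() {
    Tracker::template_calls = 0;
    Tracker const original{};
    Tracker copy(original);  // uses the implicitly defined copy constructor
    (void)copy;
    return Tracker::template_calls;
}

inline int template_calls_for_conversion() {
    Tracker::template_calls = 0;
    Tracker from_int(42);    // uses the constructor template
    (void)from_int;
    return Tracker::template_calls;
}

inline void usage_example() {
    assert(template_calls_for_copy() == 0);
    assert(template_calls_for_conversion() == 1);
}
```

Note that a constructor template taking a forwarding reference (T&&) can be a better match than the implicit copy constructor for non-const lvalues, which is a further source of surprise.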

Due to the two-phase compilation model of templates, name lookup can be surprising in class templates with dependent base classes. A name used in the derived class that is declared in the base class is not found there by unqualified lookup and might instead be found in an enclosing namespace.

double foo{0};
template <typename T>
struct base {
  int foo;
};
template <typename T>
struct d : base<T> {
  auto bar() {
    return foo; // matches global foo not base<T>::foo [1]
  }
};

In the above example line [1], in place of foo, either this->foo or the fully qualified name d::foo would refer to the member of the base class.
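The difference can be observed in the following sketch, which assumes a compiler that implements two-phase lookup as specified. The member function names unqualified and qualified, and the initial values, are chosen for illustration only.

```cpp
#include <cassert>

double foo{1.5};  // namespace-scope name

template <typename T>
struct base {
    int foo{42};
};

template <typename T>
struct derived : base<T> {
    auto unqualified() { return foo; }      // finds ::foo (double)
    auto qualified() { return this->foo; }  // finds base<T>::foo (int)
};

inline void usage_example() {
    derived<int> d;
    assert(d.unqualified() == 1.5);
    assert(d.qualified() == 42);
}
```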

When used appropriately, templates are suitable for embedded and safety-critical systems.

While using templates greatly increases type safety, there can be requirements on template arguments that can neither be specified by concepts nor checked by a compiler. For example, sorting elements requires the comparison function to provide a strict weak ordering. This is a property of the values of the type to be sorted and is impossible to check at compile time for all possible value combinations.

C++ provides means to restrict template arguments. One is to use concepts, that can prevent instantiating a template, but allow for substituting it with an alternative. A second means is to use static_assert in a template’s definition to prevent certain instantiations.

template<typename T>
struct wrapper {
  T x;
  static_assert(not (std::is_pointer_v<T> || std::is_reference_v<T>));
};
template<typename T>
wrapper(T) -> wrapper<T>;

wrapper<int> w{42};
wrapper x{&w};    // compile error due to static_assert
wrapper<int&>{};  // compile error due to static_assert

The generic nature of templates requires a more elaborate approach to unit tests. Such tests should provide instantiations of the base template and all provided explicit template specializations to ensure that each code path is actually tested. Tests for non-compilability of suppressed instantiations, i.e., through concepts or static_assert, are also beneficial.

Templates reduce the amount of boilerplate code to be written, e.g., by providing consistent definitions of operators. However, defining operator function templates in namespace scope can greatly influence compile times due to potential participation in the overload set whenever the operator is used in code. In addition, such generic operator functions might be picked up in inappropriate places, causing programmer confusion. Implementing them as hidden friends in a CRTP base class instead makes using operator function templates feasible (see 6.20 Identifier Name Reuse [YOW]).

template <typename T>
struct Plus {
  friend constexpr auto operator+(T l, T const &r) {
    return l += r;
  }
};
struct Int: Plus<Int> {
  constexpr auto operator+=(Int const &r) {
    val += r.val;
    return *this;
  }
  constexpr Int(int v):val{v}{}
  int val;
};
struct Short: Plus<Short> {
  constexpr auto operator+=(Short const &r) {
    val += r.val;
    return *this;
  }
  constexpr Short(short v):val{v}{}
  short val;
};
auto x = Int{4} + Int{38};
auto y = Short{4} + Short{2};

6.40.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.41 Inheritance [RIP]

6.41.1 Applicability to language

The vulnerability as described in ISO/IEC TR 24772-1:2019 clause 6.41 is applicable to C++.

Inheritance as a mechanism in C++ serves multiple purposes and is defined differently than in most other languages supporting inheritance.

The compiler-provided default behaviour for copy and move operations as well as destruction favors value semantics which conflicts with object-oriented polymorphic behaviour.

If a base class overloads operator new and operator delete, derived classes inherit and use these operators. If the base class’s operator new or operator delete assumes that every allocated object has the size of the base class, but a derived class has a different size, the result is undefined behaviour, such as accesses to memory that was not allocated, overwriting of adjacent memory, or memory leaks. For example, consider:

#include <new>
#include <iostream>

class base
{
public:
  static void* operator new(std::size_t sz)
  {
    std::cerr << "DEBUG: Base::" << __func__ << "(" << sz << ")" << '\n';
    return ::operator new(sizeof(base));
  }

  static void operator delete(void *ptr, std::size_t sz)
  {
    std::cerr << "DEBUG: Base::" << __func__ << "(" << ptr << ',' << sz << ")" << '\n';
    ::operator delete(ptr);
  }

};

class derived : public base
{
  double d;
};


int main()
{
  std::cerr << "DEBUG: sizeof(base): " << sizeof(base) << '\n';
  std::cerr << "DEBUG: sizeof(derived): " << sizeof(derived) << '\n';

  // new derived invokes base::operator new
  derived *p = new derived;

  // delete p invokes base::operator delete
  delete p;
}

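If a class-specific operator new and operator delete can handle only fixed-size allocations, requests of any other size can be forwarded to the global allocation functions, as in the following fragments:

// In base::operator new: forward requests of any other size.
if (sz != sizeof(base))
  return ::operator new(sz);
// In base::operator delete: release such blocks globally and stop.
if (sz != sizeof(base)) {
  ::operator delete(ptr);
  return;
}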

C++ also requires operator new to return a non-null pointer when the requested size is zero.
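Put together, a size-checking variant of the earlier class could look like the sketch below. The counters pooled and forwarded are hypothetical additions that make the chosen path observable, and the global allocation functions stand in for a real fixed-size pool.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

class base {
public:
    static inline int pooled = 0;     // requests served by the fixed-size path
    static inline int forwarded = 0;  // requests forwarded to the global allocator

    static void* operator new(std::size_t sz) {
        if (sz != sizeof(base)) {     // e.g., a derived class of different size
            ++forwarded;
            return ::operator new(sz);
        }
        ++pooled;
        return ::operator new(sizeof(base));  // stands in for a fixed-size pool
    }

    static void operator delete(void* ptr, std::size_t sz) {
        if (sz != sizeof(base)) {
            ::operator delete(ptr);
            return;
        }
        ::operator delete(ptr);  // stands in for returning the block to the pool
    }
};

class derived : public base {
    double d{};
};

inline void exercise_allocators() {
    base* b = new base;        // fixed-size path
    derived* d = new derived;  // forwarded: sizeof(derived) != sizeof(base)
    assert(b != nullptr && d != nullptr);
    delete d;
    delete b;
}
```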

The mechanisms of failure from ISO/IEC TR 24772-1:2019 clause 6.41 manifest and can be mitigated in C++ as follows:

6.41.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.42 Violations of the Liskov Substitution Principle or the Contract Model [BLP]

6.42.1 Applicability to language

The vulnerability as documented in ISO/IEC 24772-1 clause 6.42 applies to C++. C++ leaves verification of the correctness of an overridden call to the programmer.

The vulnerability can be mitigated by a style of programming that uses wrapper functions to check preconditions, calls a virtual function to perform the required functionality and subsequently checks the postconditions before returning. An example is provided below.

class Base {
  private:
    virtual int function_to_override( int x ) = 0;
    // ...

  public:
    int interface_to_overridden_function( int x ) {
      check_preconditions( x );
      const auto saved = data_saved_for_postcondition( x );
      auto result = function_to_override( x );
      check_postconditions( x, saved, result );
      return result;
    }
    // ...
};
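A concrete, self-contained instance of this style is sketched below. The class names, the contract (a non-negative factor yields a non-negative area), and the use of exceptions to report violations are assumptions made for illustration.

```cpp
#include <cassert>
#include <stdexcept>

class Shape {
public:
    // Non-virtual interface: the contract is enforced in one place.
    double scaled_area(double factor) const {
        if (factor < 0.0) throw std::invalid_argument{"precondition: factor >= 0"};
        double const result = do_scaled_area(factor);
        if (result < 0.0) throw std::logic_error{"postcondition: area >= 0"};
        return result;
    }
    virtual ~Shape() = default;

private:
    virtual double do_scaled_area(double factor) const = 0;
};

class Square : public Shape {
    double do_scaled_area(double factor) const override { return factor * 4.0; }
};

class BrokenSquare : public Shape {  // violates the postcondition
    double do_scaled_area(double) const override { return -1.0; }
};

// Reports whether the call violated the pre- or postcondition.
inline bool contract_violated(Shape const& shape, double factor) {
    try {
        (void)shape.scaled_area(factor);
    } catch (std::logic_error const&) {  // also catches std::invalid_argument
        return true;
    }
    return false;
}

inline void usage_example() {
    Square const square;
    assert(square.scaled_area(2.0) == 8.0);
}
```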

6.42.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

See also C++ Core Guidelines C.120, C.121, C.122, C.126, C.127, and C.129 through C.133.

6.43 Redispatching [PPH]

6.43.1 Applicability to language

The vulnerability as described in ISO/IEC TR 24772-1:2019 clause 6.43 exists in C++ for virtual functions, except for constructors and destructors which are not dispatching. An example of the infinite recursion is:

#include <iostream>

class A {
public:
    virtual void f() { std::cout << "A::f()\n"; }
    virtual void g() { std::cout << "A::g()\n"; A::f(); }  //call to f() will not dispatch.
    virtual void h() { std::cout << "A::h()\n"; }
    virtual void i() { std::cout << "A::i()\n"; h(); } //call to h() will dispatch
                                                      //showing the vulnerability
};

class B : public A {
public:
    void f() override { std::cout << "B::f()\n"; g(); }
    void h() override { std::cout << "B::h()\n"; i(); }
};

int main() {
    B b;
    A * pA = &b;
    pA->f(); // no problem
    std::cout << "---\n";
    pA->h(); // infinite recursion
}

In C++, a call to a member function can be qualified, such as A::f() in the above example. A qualified call is not dispatched and thus avoids the vulnerability.

6.43.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.44 Polymorphic variables [BKK]

6.44.1 Applicability to language

This vulnerability as described in ISO/IEC TR 24772-1:2019 clause 6.44 applies to C++. In addition to the upcast and downcast issues addressed in that document, this clause also addresses cross-casting, which is unique to C++. For further type system related issues see subclause Type System [IHN].

C++ provides language mitigations to help avoid the problems as follows:

Since C++ supports multiple inheritance, up-casting, down-casting, and cross-casting operations can be used to switch to different (pointer/reference) types in the inheritance hierarchy of a specific object, i.e.,

Developers should be aware that virtual member functions can be overridden in derived classes, even if they are private.

Given the following:

struct Z { int z; virtual ~Z() { } };
struct Y { int y; virtual ~Y() { } };
struct A : Z { int a; };
struct B : virtual A { int b; };
struct C : virtual A, Y { int c; };
struct D : B, C { int d; };
D d_inst;

then these examples demonstrate upcasts, downcasts, and crosscasts:

Upcasts:

B* b_ptr = &d_inst; // implicit
C& c_ref = d_inst; // implicit
Z* z_ptr = static_cast<Z*>(&d_inst);
Y* y_ptr = dynamic_cast<Y*>(&d_inst);

Downcasts:

D& d_ref = dynamic_cast<D&>(*y_ptr);
D* d_ptr = static_cast<D*>(b_ptr);

Crosscasts:

C* c_ptr = dynamic_cast<C*>(b_ptr);
Y* y_ptr2 = dynamic_cast<Y*>(b_ptr);
C* c_ptr2 = static_cast<C*>(static_cast<D*>(b_ptr));

and notes the following about such:

Upcasts:

Downcasts:

Crosscasts:

Deleting derived objects via a base class pointer is undefined behavior, unless the base class declares a virtual destructor.
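A minimal sketch of the well-defined case follows. The type names and the counter are hypothetical; the counter only makes the destructor call observable.

```cpp
#include <cassert>
#include <memory>

struct Base {
    virtual ~Base() = default;  // makes deletion through a Base* well-defined
};

struct Derived : Base {
    static inline int destroyed = 0;
    ~Derived() override { ++destroyed; }
};

inline int destroy_through_base_pointer() {
    Derived::destroyed = 0;
    {
        std::unique_ptr<Base> p = std::make_unique<Derived>();
        assert(p != nullptr);
    }  // delete is applied to a Base* and dispatches to ~Derived()
    return Derived::destroyed;
}
```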

6.44.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

See also C++ Core Guidelines ES.48, ES.49, C.146, C.147, C.148 and C.153. source: OOP52-CPP?

6.45 Extra Intrinsics [LRM]

6.45.1 Applicability to language

The vulnerability as described in ISO/IEC 24772-1 clause 6.45 applies to C++ as explained below.

C++ implementations are allowed to provide built-in functionality but are restricted to a specific naming scheme reserved by the standard. For example, names containing a double underscore or that begin with an underscore and a capital letter are reserved for that purpose. See ISO/IEC 14882 clause [lex.name]. The use of such names by the programmer is forbidden by the language. Language processors are not required to prohibit such usage, hence the vulnerability exists.

The standard restricts definitions in reserved namespaces, such as std (see ISO/IEC 14882 clause [namespace.constraints]). In addition, specializing a template from namespace std is restricted (see ISO/IEC 14882 clause [namespace.std]) unless explicitly allowed, for example, see ISO/IEC 14882 clause [unord.hash].

6.45.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.46 Argument Passing to Library Functions [TRJ]

6.46.1 Applicability to language

The vulnerability as described in ISO/IEC TR 24772-1:2019 clause 6.46 is applicable to C++.

Libraries that supply objects or functions are in most cases not required to check the validity of parameters passed to them. In those cases where parameter validation is required there might not be adequate parameter validation.

When calling a library, either the calling function or the library may make assumptions about parameters. For example, it may be assumed by a library that a parameter is non-zero so division by that parameter is performed without checking the value. Sometimes some validation is performed by the calling function, but the library may use the parameters in ways that were unanticipated by the calling function resulting in a potential vulnerability. Even when libraries do validate parameters, their response to an invalid parameter is usually undefined and can cause unanticipated results.

This vulnerability applies in particular to C++ libraries that are designed for high efficiency; responsibility for satisfying the preconditions of most functions rests with the caller. When these preconditions are not met, the result is undefined behaviour. In addition, the standard specifies how particular functions report error conditions, for example by throwing an exception, returning an error code, or returning a known value such as NaN.
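The three reporting styles can be sketched with standard library functions. The helper names are hypothetical, and the NaN result of std::sqrt for a negative argument assumes an implementation that follows IEC 60559 floating-point arithmetic.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// operator[] has a precondition that the caller must check:
// an out-of-range index is undefined behaviour.
inline int element_or(std::vector<int> const& v, std::size_t index, int fallback) {
    return index < v.size() ? v[index] : fallback;
}

// at() checks the index itself and reports a violation by exception.
inline bool at_reports_by_exception(std::vector<int> const& v, std::size_t index) {
    try {
        (void)v.at(index);
    } catch (std::out_of_range const&) {
        return true;
    }
    return false;
}

// std::sqrt reports a domain error through a known value (NaN).
inline bool sqrt_reports_by_nan() {
    return std::isnan(std::sqrt(-1.0));
}

inline void usage_example() {
    std::vector<int> const values{1, 2, 3};
    assert(element_or(values, 1, -1) == 2);
}
```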

6.46.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.47 Inter-language Calling [DJS]

6.47.1 Applicability to language

The vulnerability as described in ISO/IEC TR 24772-1:2019 clause 6.47 is applicable to C++.

C++ is a multi-paradigm language with a number of features that do not interface simply with other language systems. The task of converting the results of these paradigms into constructs that can cross an interface for further processing in other languages is left to the implementation team.

C++ compilers provide an application binary interface (ABI) that delineates areas of interoperability with other languages or other C++ compiler/runtime systems. An ABI includes calling conventions, data layout, error and exception handling and return conventions, name mangling, data model, initialization of memory, and linkage to operating systems and libraries.

C++ compilers implement a C++ language linkage and a C language linkage. It is implementation-defined what other languages the implementation supports. Alternatively, other language systems provide linkages to C systems (for example, Ada has developed a standard for interfacing with C, and Fortran has included a Clause 15 that explains how to call C functions), leaving the developer the task of channeling everything through this common language system.
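One way to channel a C++ facility through the C language linkage is sketched below. The function name sum_positive, its parameters, and the numeric error codes are hypothetical. The point of the sketch is that only C-compatible types cross the interface and that no exception can propagate into the foreign caller.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical C-callable interface. Returns 0 on success,
// 1 for invalid arguments, 2 for memory exhaustion, 3 for other failures.
extern "C" int sum_positive(int const* values, std::size_t count,
                            long* result) noexcept {
    if (!values || !result) return 1;
    try {
        std::vector<int> copy(values, values + count);  // may throw std::bad_alloc
        long total = 0;
        for (int v : copy) {
            if (v > 0) total += v;
        }
        *result = total;
        return 0;
    } catch (std::bad_alloc const&) {
        return 2;
    } catch (...) {
        return 3;
    }
}

inline void usage_example() {
    int const data[]{1, -2, 3};
    long total = 0;
    assert(sum_positive(data, 3, &total) == 0);
    assert(total == 4);
}
```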

6.47.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

See also the C++ Core Guidelines CPL.3.

AI 63-6 – group – add the guidance from 6.47.2 Interoperability into the Core Guidelines.

6.48 Dynamically-linked Code and Self-modifying Code [NYY]

6.48.1 Applicability to language

Most loaders allow dynamically linked libraries also known as shared libraries. Code is designed and tested using a suite of shared libraries which are loaded at execution time. The process of linking and loading is outside the scope of the C++ standard.

C++ prevents data pointers from being reinterpreted as function pointers and vice versa. Reinterpreting a pointer via a void pointer or std::intptr_t as a pointer of a different type is undefined behaviour (with very few defined exceptions, such as converting a pointer to an object into a pointer to its raw bytes).

6.48.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.49 Library Signature [NSQ]

6.49.1 Applicability to language

The vulnerability as enumerated in ISO/IEC 24772-1 clause 6.49 applies to C++.

As a mitigation, the C++ extern "C" linkage specifier usually provides simple interoperability with libraries using the C application binary interface (ABI).

6.49.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.50 Unanticipated Exceptions from Library Routines [HJW]

6.50.1 Applicability to language

The vulnerability as documented in ISO/IEC TR 24772-1:2019 clause 6.50 exists for C++. In particular the issue of the failing dynamic initialization of namespace-scope objects exists in C++.

When dynamic initialization of a namespace-scope object fails with an exception, the exception cannot be caught and the program is terminated. Function-scope static objects, in contrast, are initialized the first time execution passes through the declaration. Using function-scope static objects in preference to dynamic initialization ensures that there is always an enclosing function that could catch the exception.

exception_prone_type troubling_object;
// An exception from the constructor could cause termination.

// The following function always returns a reference to the same object,
// which is initialized the first time this function is called.
// If initialization fails, it will be retried on the next call.
exception_prone_type& safer_object()
{
  static exception_prone_type the_safer_object;
  return the_safer_object;
}
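The retry behaviour can be observed with the following sketch, in which exception_prone_type is given a hypothetical constructor that fails on the first attempt only.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical type whose constructor fails on the first attempt.
struct exception_prone_type {
    static inline int attempts = 0;
    exception_prone_type() {
        if (++attempts == 1) throw std::runtime_error{"first attempt fails"};
    }
};

inline exception_prone_type& safer_object() {
    // Initialized on the first pass that completes without an exception.
    static exception_prone_type the_safer_object;
    return the_safer_object;
}

// The enclosing function can catch the exception and report the failure.
inline bool try_access() {
    try {
        (void)safer_object();
        return true;
    } catch (std::runtime_error const&) {
        return false;
    }
}

inline void usage_example() {
    bool const first = try_access();
    bool const second = try_access();
    assert(second || !first);
}
```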

6.50.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.51 Pre-processor Directives [NMP]

6.51.1 Applicability to language

The vulnerability as described in ISO/IEC TR 24772-1:2019 clause 6.51 applies to C++.

The C++ pre-processor allows the use of macros that are text-replaced before compilation.

Function-like macros look similar to functions but have different semantics. Because the arguments are text-replaced, expressions passed to a function-like macro may be evaluated multiple times. This can result in unintended or undefined behaviour if the arguments have side effects or contain pre-processor directives. Additionally, the parameters within the body of a function-like macro, and the body itself, should be fully parenthesized to avoid unintended behaviour.

The following code example demonstrates undefined behaviour when a function-like macro is called with arguments that have side effects (in this case, the increment operator).

#define CUBE(X) ((X) * (X) * (X))
// ...
  int i = 2;
  int a = 81 / CUBE(++i);

The above example expands to:

  int a = 81 / ((++i) * (++i) * (++i));

which has undefined behaviour because i is modified several times without intervening sequencing, so the result of this macro expansion cannot be predicted.

Another mechanism of failure can occur when the arguments within the body of a function-like macro are not fully parenthesized. The following example shows the CUBE macro without parenthesized arguments.

#define CUBE(X) (X * X * X)
// ...
int a = CUBE(2 + 1);

This example expands to:

int a = (2 + 1 * 2 + 1 * 2 + 1);

which evaluates to 7 instead of the intended 27.
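
Both failure mechanisms shown above can be avoided by using a function instead of a macro. The following sketch uses a constexpr function template; the names cube and divide_by_cube are hypothetical.

```cpp
#include <cassert>

// A function evaluates its argument exactly once and needs no
// defensive parentheses.
template <typename T>
constexpr T cube(T x)
{
  return x * x * x;
}

// The argument expression is evaluated as a whole before the call.
static_assert(cube(2 + 1) == 27, "2 + 1 is evaluated before cubing");

int divide_by_cube()
{
  int i = 2;
  int const a = 81 / cube(++i); // ++i is evaluated exactly once
  assert(i == 3);
  return a;                     // 81 / 27
}
```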

6.51.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.52 Suppression of Language-defined Run-time Checking [MXB]

6.52.1 Applicability to language

With the exception of the macro assert, the vulnerability as described in ISO/IEC TR 24772-1:2019 clause 6.52 does not apply to C++, because there is no language-defined runtime checking. Macro assert is defined by the standard but is invoked by the programmer, hence is not a language-defined check.

C++ libraries, however, often provide run-time checks which meet the criteria of this vulnerability. Also, compilers and other tools commonly provide means to perform such runtime checks.
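
An example of a library-provided run-time check is the member function at() of std::vector, which throws std::out_of_range for an invalid index, whereas operator[] performs no check by default. The following sketch illustrates this; the function names element_or and checked_access_demo are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns the element at 'index', or 'fallback' if the index is out of
// range. values[index] would be unchecked by default, and an
// out-of-range index would have undefined behaviour.
int element_or(std::vector<int> const& values, std::size_t index, int fallback)
{
  try {
    return values.at(index); // checked access
  } catch (std::out_of_range const&) {
    return fallback;
  }
}

void checked_access_demo()
{
  std::vector<int> const values{10, 20, 30};
  assert(element_or(values, 1, -1) == 20);
  assert(element_or(values, 3, -1) == -1); // index 3 is out of range
}
```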

6.52.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can use the avoidance mechanisms of ISO/IEC 24772-1 clause 6.52.5 with respect to library and compiler-provided checks, which will almost always require the explicit enabling of the checks.

6.53 Provision of Inherently Unsafe Operations [SKL]

6.53.1 Applicability to language

The vulnerability as described in ISO/IEC 24772-1:2019 clause 6.53 applies to C++. In particular, anything described by ISO/IEC 14882:2017 as “undefined behaviour” is unsafe.

6.53.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.54 Obscure Language Features [BRS]

6.54.1 Applicability to language

The vulnerability as described in ISO/IEC 24772-1 clause 6.54 applies to C++.

C++ is a rich language that provides facilities for a wide range of application areas and has a long history of evolution. As the language has evolved, so have the best and safest practices for using it. Consequently, code can look obscure because it uses either obsolete or very modern language idioms.

6.54.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.55 Unspecified Behaviour [BQF]

6.55.1 Applicability to language

The vulnerability as described in ISO/IEC 24772-1:2019 clause 6.55 applies to C++.
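
One instance of unspecified behaviour in C++ is the order in which the arguments of a function call are evaluated. The following sketch shows how separate statements fix the order; all names in it are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical trace that records the order of calls.
std::string& call_trace()
{
  static std::string trace;
  return trace;
}

int first()  { call_trace() += "first ";  return 1; }
int second() { call_trace() += "second "; return 2; }
int combine(int a, int b) { return 10 * a + b; }

// The order in which first() and second() are called is unspecified,
// so the content of the trace can differ between implementations.
int unspecified_order() { return combine(first(), second()); }

// Separate statements fix the order of the calls.
int fixed_order()
{
  int const a = first();
  int const b = second();
  return combine(a, b);
}

void fixed_order_demo()
{
  call_trace().clear();
  assert(fixed_order() == 12);
  assert(call_trace() == "first second ");
}
```

The returned value is the same in both variants; only the order of the side effects is left open in the first one.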

6.55.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.56 Undefined Behaviour [EWF]

6.56.1 Applicability to language

The vulnerability as described in ISO/IEC 24772-1:2019 clause 6.56 applies to C++. In ISO/IEC 14882:2017, the terms “undefined behaviour” and “ill-formed, no diagnostic required” expose situations to be avoided.

6.56.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.57 Implementation–defined Behaviour [FAB]

6.57.1 Applicability to language

The vulnerability as described in ISO/IEC 24772-1:2019 clause 6.57 applies to C++. In ISO/IEC 14882, the term “implementation-defined” is used to describe implementation-defined behaviour. In addition, the C++ standard provides a dedicated index titled “Index of implementation-defined behavior”.
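
Assumptions about implementation-defined properties can be made explicit so that a port to a platform that violates them fails to compile. The following is a minimal sketch; the function names to_microseconds and conversion_demo are hypothetical, and the particular assumptions are chosen for illustration only.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>
#include <limits>

// The number of bits in a byte and the width of int are
// implementation-defined; the assumptions are stated explicitly.
static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");
static_assert(std::numeric_limits<int>::digits >= 31,
              "this sketch assumes int has at least 32 bits");

// Fixed-width types avoid a dependency on the implementation-defined
// width of long.
std::int64_t to_microseconds(std::int32_t seconds)
{
  return static_cast<std::int64_t>(seconds) * 1000000;
}

void conversion_demo()
{
  assert(to_microseconds(3000) == 3000000000LL);
}
```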

6.57.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.58 Deprecated Language Features [MEM]

6.58.1 Applicability to language

The vulnerability as described in ISO/IEC 24772-1 clause 6.58 applies to C++. Annex D “Compatibility features” of ISO/IEC 14882:2020 enumerates the deprecated features. The C++ attribute [[deprecated]] allows library writers and users to mark deprecated declarations.

Although compilers sometimes offer backward-compatibility options so that code need not be changed to comply with the current language specification, updating legacy software to the current standard is the better option.
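
The use of the [[deprecated]] attribute can be sketched as follows. The interface shown is hypothetical: parse_number stands for an old function that is being phased out, and parse_number_checked for its replacement. Compilers typically issue a diagnostic wherever the deprecated function is used.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical old interface, marked so that its use is diagnosed.
[[deprecated("use parse_number_checked instead")]]
int parse_number(char const* text);

// Hypothetical replacement that reports failure explicitly.
bool parse_number_checked(std::string const& text, int& result)
{
  try {
    std::size_t used = 0;
    int const value = std::stoi(text, &used);
    if (used != text.size())
      return false; // trailing characters are rejected
    result = value;
    return true;
  } catch (std::exception const&) {
    return false;   // not a number, or out of range for int
  }
}

void parse_demo()
{
  int value = 0;
  assert(parse_number_checked("42", value));
  assert(value == 42);
  assert(!parse_number_checked("42x", value));
}
```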

6.58.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.59 Concurrency – Activation [CGA]

6.59.1 Applicability to language

This vulnerability as specified in ISO/IEC 24772-1 clause 6.59 does not apply to C++, as long as the standard library facilities for creating threads are used.

The creation of a thread by the std::async function or by the std::thread or std::jthread constructors synchronizes with the start of the new thread's function.

Failure to create or start a thread due to lack of system resources causes an exception to be thrown in the creating thread, so the new thread never exists. For the vulnerabilities associated with unhandled exceptions, see clause 6.36 Ignored error status and unhandled exceptions [OYB].

Any exception thrown within a thread’s function needs to be handled by that thread; otherwise, such an exception will cause program termination. For handling such termination, see clause 6.62 Concurrency - Premature termination [CGS].
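
One way to avoid termination caused by an exception escaping a thread function is to start the work with std::async, which stores the exception in the returned future and rethrows it when the result is retrieved. The following sketch assumes this approach; the function names failing_work, run_and_report and report_demo are hypothetical.

```cpp
#include <cassert>
#include <future>
#include <stdexcept>
#include <string>
#include <system_error>

// Hypothetical worker that fails.
int failing_work()
{
  throw std::runtime_error("work failed");
}

// Runs the worker on another thread and reports the outcome.
std::string run_and_report()
{
  try {
    std::future<int> result = std::async(std::launch::async, failing_work);
    return std::to_string(result.get()); // rethrows the worker's exception
  } catch (std::system_error const&) {
    return "could not start thread";     // thread creation failed
  } catch (std::runtime_error const& e) {
    return e.what();                     // failure inside the worker
  }
}

void report_demo()
{
  assert(run_and_report() == "work failed");
}
```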

TODO: talk about detach() and forgetting to join.

6.59.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.60 Concurrency – Directed termination [CGT]

6.60.1 Applicability to language

This vulnerability as specified in ISO/IEC 24772-1 clause 6.60 is mitigated in C++, as long as the standard library facilities for threads are used. C++ does not provide the means to terminate a thread asynchronously. Instead, C++ allows cooperative termination through the use of std::stop_token. However, a thread instructed by a stop request to cease execution can ignore such a request. For example, when std::jthread::request_stop() is used to send a stop request to a started thread, the thread function of that thread may never handle the request.

#include <stop_token>
#include <thread>

void some_function(int some_arg);
void other_function(std::stop_token tok, int some_arg);

int main(){
    std::jthread t(some_function,42);   // function takes no stop_token
    std::jthread t2(other_function,42); // stop_token passed as first argument
    t.request_stop();  // stop state is set, but some_function cannot observe it
    t2.request_stop(); // stop_token tok signalled
}

In the above example, at the end of the main function the destructors of the thread objects t and t2 will call this->request_stop() and this->join(). If one of the thread functions never returns, then the corresponding join() call will block.

Other programmed mechanisms can be constructed to cause another thread to complete, such as setting a shared variable to a known value that the target thread reads and then terminates itself.
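
The shared-variable mechanism mentioned above can be sketched with std::atomic<bool> and std::thread. The class counting_worker and the function run_briefly are hypothetical, and the sketch relies on the worker polling the flag regularly.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical worker that counts until it is asked to stop.
class counting_worker
{
public:
  counting_worker() : thread_(&counting_worker::run, this) {}
  ~counting_worker() { stop(); }

  // Requests termination and waits until the worker has finished.
  void stop()
  {
    stop_requested_ = true;
    if (thread_.joinable()) {
      assert(thread_.get_id() != std::this_thread::get_id());
      thread_.join();
    }
  }

  unsigned long iterations() const { return iterations_; }

private:
  void run()
  {
    while (!stop_requested_) { // the worker terminates itself
      ++iterations_;
      std::this_thread::yield();
    }
  }

  std::atomic<bool> stop_requested_{false};
  std::atomic<unsigned long> iterations_{0};
  std::thread thread_; // declared last: the flags exist before the thread starts
};

unsigned long run_briefly()
{
  counting_worker worker;
  while (worker.iterations() == 0)
    std::this_thread::yield(); // wait until the worker has started
  worker.stop();
  return worker.iterations();
}
```

As with std::stop_token, the mechanism is cooperative: a worker that never reads the flag never terminates, and the join in stop() then blocks.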

6.60.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.61 Concurrent Data Access [CGX]

6.61.1 Applicability to language

C++ has threading and shared access to variables, which have the vulnerabilities described in ISO/IEC TR 24772-1:2019 clause 6.61.1. C++ provides features such as the std::atomic class template to guarantee the internal consistency of data and to prevent corruption of data due to potentially interleaved updates to data elements.

What about concurrent data access between tasks?

Programmers should be aware that conversions or manipulations of data items are not always atomic, such as the conversion of an object as part of a computation.

Need the C++ definition of atomic (indivisible access and memory ordering) and volatile.

The std::atomic class template can be instantiated for any trivially copyable type, and the standard library provides specializations for the integral types such as char, short, int, long, and long long. When the C++ std::atomic facilities are used, the language guarantees that simultaneous updates and reads of an atomic object are well-behaved. Atomicity does not by itself determine the order in which competing reads and updates occur. To manage the order of access, synchronization mechanisms such as locks are required. To use the atomic capabilities, each variable must be declared to be of one of the std::atomic types, and the member functions must be used to compare, load, store or exchange values in the atomic variable.
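
The use of the load and compare-and-exchange member functions can be sketched as follows. The function update_maximum is hypothetical; it raises an atomic maximum to a candidate value and retries if another thread has changed the variable in the meantime.

```cpp
#include <atomic>
#include <cassert>

// Raises 'maximum' to 'candidate' if 'candidate' is larger.
void update_maximum(std::atomic<int>& maximum, int candidate)
{
  int current = maximum.load();
  // On failure, compare_exchange_weak stores the value it found in
  // 'current', so the loop condition is re-evaluated with fresh data.
  while (current < candidate &&
         !maximum.compare_exchange_weak(current, candidate)) {
  }
}

void maximum_demo()
{
  std::atomic<int> maximum{5};
  update_maximum(maximum, 3); // smaller: no change
  assert(maximum.load() == 5);
  update_maximum(maximum, 9); // larger: the maximum is raised
  assert(maximum.load() == 9);
}
```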

We also need to move the notion of creating SHARED POINTERS FROM 6.13 TO HERE.

A volatile qualifier on a variable is used to indicate that updates to the variable may happen at any time and outside of program control, hence two subsequent reads on such a variable may return different results.

Programmers should be aware that even simple data accesses on modern architectures can involve instruction reordering, cache issues, and data alignment issues, hence the acquisition time and order are highly nondeterministic, especially when being accessed by concurrent threads. Any data structure that can be shared between threads should be shown to be accessed by at most one thread at a time or should be protected by synchronization mechanisms such as locks (see Lock Protocol Errors [CGM]) or atomicity.

Most concurrent programming algorithms require some level of synchronization between threads or tasks when exchanging information, synchronization that “atomic” does not provide. Mechanisms such as monitors, mailboxes, or mutexes (lock with a queue), futures, condition variables, and locks control scheduling of threads or tasks to control order-of-access and to enforce higher levels of cooperation between schedulable entities.
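
A common way to apply a mutex is to encapsulate it together with the data it protects, so that the data can only be reached while the lock is held. The following is a minimal sketch of this idea; the class guarded_counter and the function count_from_two_threads are hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical counter whose data is reachable only through member
// functions that hold the mutex.
class guarded_counter
{
public:
  void add(int amount)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    value_ += amount;
  }

  int value() const
  {
    std::lock_guard<std::mutex> lock(mutex_);
    return value_;
  }

private:
  mutable std::mutex mutex_; // mutable: locked in const member functions
  int value_ = 0;
};

int count_from_two_threads()
{
  guarded_counter counter;
  auto work = [&counter] {
    for (int i = 0; i < 1000; ++i)
      counter.add(1);
  };
  std::thread first(work);
  std::thread second(work);
  first.join();
  second.join();
  assert(counter.value() == 2000); // no update was lost
  return counter.value();
}
```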

Atomic tied to memory orders.

Mutexes provide mutual exclusion and guaranteed visibility (consistency) of the shared data.

Mutex is a lock-and-release that is usually hidden.

Encapsulate mutexes and data

Thread-local storage (keyword thread_local) has the lifetime of the thread. Such variables can be declared at block scope or at namespace scope.

For massively parallel concurrency – concurrent access mechanisms not applicable.

No resource management

Exception and exception handling (has some impact on threading)

Memory management issues more complex under concurrency

Volatile should be used for objects accessed by signal handlers, to prevent the optimization of repeated accesses to volatile memory. Volatile does not guarantee that the object value will be available to other threads.

Controlling access to shared data (protected or including

Use of volatile (keyword type qualifier) for signal handlers (communicating with hardware?). Prefer volatile for communicating with hardware?

For signal handling, volatile sig_atomic_t or atomic variables can be used to prevent this vulnerability.
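
The use of volatile sig_atomic_t in a signal handler can be sketched as follows. The names interrupted, on_interrupt and raise_and_check are hypothetical, and SIGINT is chosen only for illustration.

```cpp
#include <cassert>
#include <csignal>

namespace {
// Flag shared between the handler and the rest of the program.
volatile std::sig_atomic_t interrupted = 0;
}

// The handler only stores to a volatile sig_atomic_t object.
extern "C" void on_interrupt(int)
{
  interrupted = 1;
}

// Installs the handler, raises the signal and reports whether the
// handler has run.
bool raise_and_check()
{
  interrupted = 0;
  auto previous = std::signal(SIGINT, on_interrupt);
  assert(previous != SIG_ERR);
  std::raise(SIGINT);            // returns after the handler has run
  bool const seen = (interrupted != 0);
  std::signal(SIGINT, previous); // restore the previous handling
  return seen;
}
```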

6.61.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.62 Concurrency – Premature Termination [CGS]

6.62.1 Applicability to language

Need a statement of applicability.

A thread will terminate when it completes its assigned method, or when it raises an exception, or when it has been explicitly terminated (how is this done)

Joining a thread causes the joining thread to await the joined thread’s termination before continuing. This is useful for executing work in parallel and then proceeding after the dispatched work is complete, but it does not notify the joining thread if the termination was premature.

In C++ 2020, methods are provided to instruct one or more threads to terminate. This is not premature termination since the requested thread terminates itself.

C++ 2020 provides callbacks in the form of stop_callback to notify the setting thread when a thread of interest has been terminated. It also provides stop_token for a thread to query whether it is being instructed to terminate.

Any thread can re-throw an exception to be caught by the creator of the terminating thread, (but the parent may have terminated first).

The semantics of C++ is that all children of the main program will terminate if the main program terminates. It is necessary to join the main program to all its children to ensure that children are not silently terminated prematurely.

6.62.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.63 Lock Protocol Errors [CGM]

6.63.1 Applicability to language

The vulnerability as described in ISO/IEC 24772-1 clause 6.63 is applicable to C++.

This subclause requires a complete rewrite to have it reflect C++ issues.

Difference between threads and tasks. Can threads and tasks coexist?

Deadlock with single mutex,

The C++ standard does not provide hidden protocols. Nevertheless, an application vulnerability can exist if a program uses synchronization mechanisms incorrectly. For example:

std::atomic<int> a;
int b;
// ...
a += b;    // This operation is an atomic read-modify-write of the variable 'a'.
a = a + b; // This statement contains two accesses to 'a' and is not atomic as a whole.

6.63.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.64 Uncontrolled Format String [SHL]

6.64.1 Applicability to language

The vulnerability as described in ISO/IEC 24772-1 clause 6.64 is applicable to C++.

C++ inherits the C libraries, which provide a large family of input and output functions that use a control string to interpret the data read or to format the output. These strings include all the features described in ISO/IEC TR 24772-1:2019 clause 6.64.1.

C++ provides type-safe alternatives for input/output, which do not use format strings and which should be used in preference, such as

int aNumber{};
while(std::cin){ // is input still available
  std::cout << "Enter a whole number, please:";
  if (std::cin >> aNumber) { // no format string needed
    std::cout << "Thank you, the number can be represented as ";
    std::cout << std::format("0b{0:b} {0:d} 0{0:o} 0x{0:x}", aNumber);
  } else { // input failed
    std::cin.clear();  // re-enable input
    std::string line;
    getline(std::cin,line); // skip to eol
  }
}

In addition, overloading the output operator allows formatting to be extended to user-defined types.
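
Output for a user-defined type can be sketched as follows. The type point and the functions describe and describe_demo are hypothetical.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical user-defined type.
struct point
{
  int x;
  int y;
};

// Output operator for point; no format string is involved.
std::ostream& operator<<(std::ostream& out, point const& p)
{
  return out << '(' << p.x << ", " << p.y << ')';
}

// Formats a point into a string by means of the output operator.
std::string describe(point const& p)
{
  std::ostringstream text;
  text << "position " << p;
  return text.str();
}

void describe_demo()
{
  assert(describe(point{3, -4}) == "position (3, -4)");
}
```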

6.64.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

6.65 Modifying constants [UJO]

6.65.1 Applicability to language

The vulnerability as documented in ISO/IEC TR 24772-1:2019 clause 6.65 exists in C++.

An object can be declared as const, denoting that its value will not change during its lifetime without invoking mechanisms that have undefined behaviour. An access path to an object can also be declared as const, denoting that the value of the object will not change via this access path without invoking such mechanisms, e.g.,

// Example showing 
int const i = 0;              // the simplest access path
int& j = const_cast<int&>(i); // modifying i through j is undefined behaviour
void foo(int* p) { *p += 43; }
//...
foo(const_cast<int*>(&i)); // undefined behaviour
foo(&i); // ill-formed, compiler error

An attempt to change a const object, such as i above, is either ill-formed or has undefined behaviour.

An object that is not const-qualified can be accessed through a path that is const-qualified:

  int k = 0;
  int const & j = k;                // 'j' is a const reference to 'k'
  int const * p = &k;               // 'p' is a pointer to const 'k'
  const_cast<int const &>(k);       // The type of the expression is const

The checking for the correctness of const is enforced based on the access-path and not the type of the target object. For example, the following are ill-formed as the access path of the left-hand expression is const-qualified:

  i = 0; // int const i;
  j = 0; // int const &j
  *p = 0; // int const *p
  const_cast<int const &>(k) = 0; // int k's declaration was not const

Note that the object k, referred to by j, *p and the const_cast, is not constant. In each of these cases the access path could be changed to remove const, making the program well-formed:

  const_cast<int&>(j) = 0; // well-formed

While it is possible to remove the const-qualification from an access path, attempting to modify a const object this way is undefined behavior (see Undefined Behaviour [EWF]):

  const_cast<int&>(i) = 0; // undefined behavior

A constant can also be legitimately modified via a secondary access path. For example: !!! Needs review re: implied aliasing. Is it undefined behaviour?!!!

#include <cassert>
int volatile k = 0;
void break_it() 
{
  k = 42; // legal
}

void test(int const volatile& j)
{
  assert(j == 0); // will pass since k == 0
  break_it();
  assert(j == 0); // will fail since k != 0
}

  test(k);

We distinguish between qualifications on the pointer’s type (pointer type) and qualifications on the type being referenced (pointee type).

A pointer type can be qualified as const; however, the qualification applies only to the pointer type and not to the pointee’s type. A reference type is implicitly immutable; only the referred-to type can be const-qualified.

  using T = int;
  using T1 = T &;
  using T2 = T *;
  using S1 = T1 const;  // The const is ignored, S1 has type 'T &'
  using S2 = T2 const;  // The const applies to the pointer type,
                        // S2 has type 'T * const'

  void foo (S1 s1, S2 s2)
  {
    s1 = 0;            // well-formed
    *s2 = 0;           // well-formed
  }

A common misconception is that a member function qualified with const cannot modify any of its members. The following badly defined class introduces a non-const access path to a potentially const object:

  struct A
  {
    A * pA;                // Pointer to non-const A
    int array[2];            // Array of type int

    A () : pA{this}{}      // pA provides access path to non-const
                           // this.

    void f () const
    {
      // pA = nullptr;     // ill-formed
      // array[0] = 0;     // ill-formed

      pA->array[0] = 0;    // compiles, but undefined behavior
                           // if executed on a const object
    }
  };

In the const member function f, naming array directly results in a const-qualified access path and so an attempt to modify it is ill-formed. However, the type of pA is A * const, that is a const pointer to a non-const A. An attempt to modify pA is ill-formed, however, modification of the value pointed to by pA is not a const-qualified access path and so is not ill-formed.

The programmer can incorrectly assume that a call to a const member function will not modify the object. However, as has been shown above, there is no guarantee that this is the case. The following example, which follows from the example above, will compile but has undefined behavior as a result of the modification of the const object:

  void foo ()
  { 
     A a1 {} ;
     A const a2 {} ;
     a1.f();           // OK - 'a1' is not const
     a2.f();           // compiles but has undefined behavior
  }

C++ classes wrapping pointer or reference members can be used to provide transitivity of const within const member functions. This is shown by the MyRef type in the following example:

template <typename T> 
struct MyRef
{
  // ...
  operator T&() &;
  operator T const &() const &;
  MyRef & operator=(T const &) &;

private:
  T & m_t;
};
 
struct A {
  A();

  void f1() {
    m_i = 0;
    m_j = 0;
    m_j ++;
    ++m_j;
  }

  void f2() const {
    m_i = 0;     // compiles, but undefined behavior
                 // if 'm_i' refers to a const object

    m_j = 0;     // ill-formed
    ++ m_j;      // ill-formed
  }

  int & m_i;
  MyRef<int> m_j;
};

Attempts to modify the object referenced by m_j are ill-formed when they occur in the const member function f2.

C++ container iterator types, iterator and const_iterator, are examples of use of this pattern.

If a member variable is declared with the mutable keyword, then it can still be modified, even if the containing object is const. It is preferable to use mutable rather than removing the constness of the containing object (see Conversion Errors [FLC]).
Members declared mutable typically should not contribute to the value of the object. The following is a common example where a mutex member is declared mutable to allow locking in a const member function:

#include <mutex>

class MyQueue
{
public:
  bool empty () const 
  {
      // Locking modifies m_mutex, which is possible because it is mutable.
      std::lock_guard sg (m_mutex);
      return m_head == nullptr;
  }

  // ...

private:
  mutable std::mutex m_mutex;
  int * m_head { nullptr };
};

6.65.2 Avoidance mechanisms for language users

To avoid the vulnerability or mitigate its ill effects, C++ software developers can:

7. Language specific vulnerabilities for C++

7.2 Copy/move semantics from Classes. (Peter Sommerlad’s paper at http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1412r0.pdf)

8. Implications for standardization

Future standardization efforts should consider:

Bibliography

[1] ISO/IEC Directives, Part 2, Rules for the structure and drafting of International Standards, 2004

[2] ISO/IEC TR 10000‑1, Information technology — Framework and taxonomy of International Standardized Profiles — Part 1: General principles and documentation framework

[3] ISO 10241 (all parts), International terminology standards

[4] ISO/IEC 9899:2011, Information technology — Programming languages — C

[5] ISO/IEC 9899:2011/Cor.1:2012, Technical Corrigendum 1

[6] ISO/IEC 30170:2012, Information technology — Programming languages — Ruby

[7] ISO/IEC/IEEE 60559:2011, Information technology – Microprocessor Systems – Floating-Point arithmetic

[8] ISO/IEC 1539-1:2010, Information technology — Programming languages — Fortran — Part 1: Base language

[9] ISO/IEC 8652:1995, Information technology — Programming languages — Ada

[10] ISO/IEC 14882:2011, Information technology — Programming languages — C++

[11] R. Seacord, The CERT C Secure Coding Standard. Boston, MA: Addison-Wesley, 2008.

[12] Motor Industry Software Reliability Association. Guidelines for the Use of the C Language in Vehicle Based Software, 2012 (third edition).

[13] ISO/IEC TR 24731–1, Information technology — Programming languages, their environments and system software interfaces — Extensions to the C library — Part 1: Bounds-checking interfaces

[14] ISO/IEC TR 15942:2000, Information technology — Programming languages — Guide for the use of the Ada programming language in high integrity systems

[15] Joint Strike Fighter Air Vehicle: C++ Coding Standards for the System Development and Demonstration Program. Lockheed Martin Corporation. December 2005.

[16] Motor Industry Software Reliability Association. Guidelines for the Use of the C++ Language in critical systems, June 2008

[17] ISO/IEC TR 24718: 2005, Information technology — Programming languages — Guide for the use of the Ada Ravenscar Profile in high integrity systems

[18] L. Hatton, Safer C: developing software for high-integrity and safety-critical systems. McGraw-Hill 1995

[19] ISO/IEC 15291:1999, Information technology — Programming languages — Ada Semantic Interface Specification (ASIS)

[20] Software Considerations in Airborne Systems and Equipment Certification. Issued in the USA by the Requirements and Technical Concepts for Aviation (document RTCA SC167/DO-178B) and in Europe by the European Organization for Civil Aviation Electronics (EUROCAE document ED-12B). December 1992.

[21] IEC 61508: Parts 1-7, Functional safety: safety-related systems. 1998. (Part 3 is concerned with software).

[22] ISO/IEC 15408: 1999 Information technology. Security techniques. Evaluation criteria for IT security.

[23] J Barnes, High Integrity Software - the SPARK Approach to Safety and Security. Addison-Wesley. 2002.

[25] Steve Christy, Vulnerability Type Distributions in CVE, V1.0, 2006/10/04

[26] ARIANE 5: Flight 501 Failure, Report by the Inquiry Board, July 19, 1996 http://esamultimedia.esa.int/docs/esa-x-1819eng.pdf

[27] Hogaboom, Richard, A Generic API Bit Manipulation in C, Embedded Systems Programming, Vol 12, No 7, July 1999 http://www.embedded.com/1999/9907/9907feat2.htm

[28] Carlo Ghezzi and Mehdi Jazayeri, Programming Language Concepts, 3rd edition, ISBN-0-471-10426-4, John Wiley & Sons, 1998

[29] Lions, J. L. ARIANE 5 Flight 501 Failure Report. Paris, France: European Space Agency (ESA) & National Center for Space Study (CNES) Inquiry Board, July 1996.

[30] Seacord, R. Secure Coding in C and C++. Boston, MA: Addison-Wesley, 2005. See http://www.cert.org/books/secure-coding for news and errata.

[31] John David N. Dionisio. Type Checking. http://myweb.lmu.edu/dondi/share/pl/type-checking-v02.pdf

[32] MISRA Limited. "MISRA C: 2012 Guidelines for the Use of the C Language in Critical Systems." Warwickshire, UK: MIRA Limited, March 2013 (ISBN 978-1-906400-10-1 and 978-1-906400-11-8).

[33] The Common Weakness Enumeration (CWE) Initiative, MITRE Corporation, (http://cwe.mitre.org/)

[34] Goldberg, David, What Every Computer Scientist Should Know About Floating-Point Arithmetic, ACM Computing Surveys, vol 23, issue 1 (March 1991), ISSN 0360-0300, pp 5-48.

[35] IEEE Standards Committee 754. IEEE Standard for Binary Floating-Point Arithmetic, ANSI/IEEE Standard 754-2008. Institute of Electrical and Electronics Engineers, New York, 2008.

[36] Robert W. Sebesta, Concepts of Programming Languages, 8th edition, ISBN-13: 978-0-321-49362-0, ISBN-10: 0-321-49362-1, Pearson Education, Boston, MA, 2008

[37] Bo Einarsson, ed. Accuracy and Reliability in Scientific Computing, SIAM, July 2005 http://www.nsc.liu.se/wg25/book

[38] GAO Report, Patriot Missile Defense: Software Problem Led to System Failure at Dhahran, Saudi Arabia, B-247094, Feb. 4, 1992, http://archive.gao.gov/t2pbat6/145960.pdf

[39] Robert Skeel, Roundoff Error Cripples Patriot Missile, SIAM News, Volume 25, Number 4, July 1992, page 11, http://www.siam.org/siamnews/general/patriot.htm

[40] CERT. CERT C++ Secure Coding Standard https://www.securecoding.cert.org/confluence/pages/viewpage.action?pageId=637 (2009).

[41] Holzmann, Gerard J., Computer, vol. 39, no. 6, pp 95-97, Jun., 2006, The Power of 10: Rules for Developing Safety-Critical Code

[42] P. V. Bhansali, A systematic approach to identifying a safe subset for safety-critical software, ACM SIGSOFT Software Engineering Notes, v.28 n.4, July 2003

[43] Ada 95 Quality and Style Guide, SPC-91061-CMC, version 02.01.01. Herndon, Virginia: Software Productivity Consortium, 1992. Available from: http://www.adaic.org/docs/95style/95style.pdf

[44] Ghassan, A., & Alkadi, I. (2003). Application of a Revised DIT Metric to Redesign an OO Design. Journal of Object Technology , 127-134.

[45] Subramanian, S., Tsai, W.-T., & Rayadurgam, S. (1998). Design Constraint Violation Detection in Safety-Critical Systems. The 3rd IEEE International Symposium on High-Assurance Systems Engineering , 109 - 116.

[46] Lundqvist, K and Asplund, L., “A Formal Model of a Run-Time Kernel for Ravenscar”, The 6th International Conference on Real-Time Computing Systems and Applications – RTCSA 1999

[47] ISO/IEC TS 17961, Information technology – Programming languages, their environments and system software interfaces – C secure coding rules

[48] GNU Project. GCC Bugs “Non-bugs” http://gcc.gnu.org/bugs.html#nonbugs_c (2009).

Index

LHS (left-hand side)


  1. This has been addressed by WG 14 in an optionally normative annex in the current working paper.

  2. The first edition should not be used or quoted in this work.