The vulnerability as described in ISO/IEC TR 24772-1 clause 6.8 exists in C++ when arrays are managed using raw pointers or indexing. The range of valid raw pointers to a plain array a extends from the first element to one past the last element of the array, i.e., the half-open range [std::begin(a), std::end(a)). An object o can be treated as a single-element array with respect to pointers referring to it.
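The pointer range described above can be sketched as follows. This is a minimal illustration, not normative text; the helper names sum_array and sum_single are hypothetical and chosen only for this example.

```cpp
#include <cstddef>
#include <iterator>

// Hypothetical helper: sums a plain array using only pointers in the
// valid range [std::begin(a), std::end(a)).
template <std::size_t N>
int sum_array(const int (&a)[N]) {
    int total = 0;
    // std::end(a) points one past the last element: it may be compared
    // against, but it must never be dereferenced.
    for (const int* p = std::begin(a); p != std::end(a); ++p) {
        total += *p;
    }
    return total;
}

// Hypothetical helper: treats a single object as a one-element array.
// &o is the first element and &o + 1 is the one-past-the-end pointer.
inline int sum_single(const int& o) {
    int total = 0;
    for (const int* p = &o; p != &o + 1; ++p) {
        total += *p;
    }
    return total;
}
```

Because the array is passed by reference, its dimension N stays part of the type and the loop bounds cannot drift from the actual array size.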
C++ provides facilities to encapsulate code that is exposed to this vulnerability. The standard library defines features that mitigate or circumvent this vulnerability. For example, std::string, std::vector, std::deque, and iostreams manage buffers internally. Using “range-for”, such as for (auto &e : some_container), or the algorithm library, the elements e of a container can be accessed without the possibility of a buffer boundary violation, provided that the container is not modified in a way that invalidates the iteration.
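As a brief illustration of these facilities, the following sketch accesses the elements of a std::vector only through range-for and a standard algorithm, so no index or pointer arithmetic is written by hand. The helper names double_all and count_even are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: doubles every element in place using range-for.
// The loop bounds are taken from the container itself.
inline void double_all(std::vector<int>& values) {
    for (auto& e : values) {
        e *= 2;
    }
}

// Hypothetical helper: counts the even elements with std::count_if,
// which visits exactly the range [begin, end).
inline auto count_even(const std::vector<int>& values) {
    return std::count_if(values.begin(), values.end(),
                         [](int v) { return v % 2 == 0; });
}
```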
However, the member function data() of the contiguous sequence containers returns a non-const pointer to the underlying elements when called on a non-const container. This allows manipulating the underlying memory directly, bypassing the safety features of the container and leading to this vulnerability. For example, std::string::data() returns a non-const char*.
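The following sketch contrasts a write through data() with the equivalent operation through the container interface. It assumes C++17, and the helper name copy_into is hypothetical; the risky form is shown only in a comment.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: replaces the contents of dst with src.
inline void copy_into(std::string& dst, std::string_view src) {
    // Risky alternative (deliberately not used here):
    //   std::memcpy(dst.data(), src.data(), src.size());
    // writes past the end of dst whenever src.size() > dst.size().
    dst.assign(src);  // the container resizes itself to fit src
}
```

With assign(), the destination grows or shrinks as needed, so the caller does not have to keep a separately tracked buffer size consistent with the pointer.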
When working directly with iterators referring to a container, one needs to ensure that those iterators are and remain valid. For example, for a container c, incrementing an iterator beyond the end(c) iterator or dereferencing the iterator denoted by end(c) is undefined behavior [EWF].
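One possible way to respect these rules is to compare an iterator against end(c) before every dereference and after every increment. The following sketch assumes C++17 for std::optional; the helper name value_after is hypothetical.

```cpp
#include <algorithm>
#include <optional>
#include <vector>

// Hypothetical helper: returns the element that follows the first
// occurrence of key, or an empty optional if there is none.
inline std::optional<int> value_after(const std::vector<int>& c, int key) {
    auto it = std::find(c.begin(), c.end(), key);
    if (it == c.end()) {
        return std::nullopt;  // key not found: end(c) is never dereferenced
    }
    ++it;  // valid, because it was not end(c); it may now equal end(c)
    if (it == c.end()) {
        return std::nullopt;  // key was the last element
    }
    return *it;
}
```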
In general, validity of iterators requires programmer care to prevent out-of-bounds access of the underlying container.
For example, the following code uses algorithms and iterators correctly to convert an input string to lower case:
std::string to_lowercase(std::string_view s){
    std::string result{};
    transform(
        begin(s), end(s),            // input range #1
        std::back_inserter(result),  // output iterator #2
        [](char c){ return std::tolower(c);});
    return result;
}
The above example passes an input range (#1) and an output iterator denoting the output range (#2) to the transform algorithm. Potential errors due to a boundary violation could be caused by changes such as using begin(result) instead of back_inserter(result). This problem occurs in the following code if the length of s is longer than 31:
std::string to_lowercase(std::string_view s){
    std::string result(31, '\0');  // 31 characters; braces would select the initializer-list constructor
    transform(
        begin(s), end(s),
        begin(result),  // error, only space for 31 characters
        [](char c){ return std::tolower(c);});
    return result;  // size(result) == 31
}
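One way to remove the boundary violation from a variant that writes through begin(result) is to size the output to the length of the input before calling the algorithm. The sketch below is an illustration under that assumption; the name to_lowercase_presized is hypothetical, and the lambda takes unsigned char because passing a negative char value to std::tolower is undefined.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical corrected variant: the output range is as long as the input.
inline std::string to_lowercase_presized(std::string_view s) {
    std::string result(s.size(), '\0');  // exactly s.size() writable elements
    std::transform(s.begin(), s.end(), result.begin(),
                   [](unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                   });
    return result;
}
```

The back_inserter form avoids the need to keep the two sizes consistent by hand, which is why it is less error-prone when the code is later modified.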
An additional problem occurs when performing an operation that invalidates an in-use iterator, such as the iterator internally used by the range-for statement below:
std::string to_lowercase(std::string s){
    for (auto &c:s){
        s.append(1, static_cast<char>(std::tolower(c))); // error, invalidates in-use iterator
    }
    return s;
}
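Two possible ways to avoid invalidating the iterator used by range-for are sketched below: either modify the elements without changing the size of the container, or, if elements have to be appended, iterate by index over a size fixed before the loop. Both function names are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical variant 1: assigning to existing elements changes neither
// size() nor capacity(), so the range-for iterators remain valid.
inline std::string to_lowercase_in_place(std::string s) {
    for (auto& c : s) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return s;
}

// Hypothetical variant 2: appends a lower-case copy of the original
// characters. No iterator or reference is held across push_back; s[i]
// is re-evaluated on every iteration.
inline std::string append_lowercase_copy(std::string s) {
    const std::size_t original_size = s.size();
    for (std::size_t i = 0; i < original_size; ++i) {
        const auto lower = static_cast<char>(
            std::tolower(static_cast<unsigned char>(s[i])));
        s.push_back(lower);
    }
    return s;
}
```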
Overflows can also occur through the use of C-style strings, which can be treated as arrays of characters; mishandling of the nul termination can make overflows possible. See clause 6.7 String Termination [CJM].
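As an illustration of how a missing nul terminator can be tolerated, the following sketch searches for the terminator only within the known bounds of a character array and returns a std::string_view of the result. It assumes C++17, and the helper name bounded_view is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>

// Hypothetical helper: builds a view over a fixed-size character buffer
// without reading past its end, even if no nul terminator is present.
template <std::size_t N>
std::string_view bounded_view(const char (&buffer)[N]) {
    // The search stops at buffer + N at the latest.
    const char* end = std::find(buffer, buffer + N, '\0');
    return std::string_view(buffer, static_cast<std::size_t>(end - buffer));
}
```

The returned view refers to the buffer and is valid only as long as the buffer itself.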
Since plain (C-style) arrays decay to pointers when passed as function arguments, the array dimension is lost. C++ provides several means of keeping the array dimension available to the called function:
std::array as parameter type,
std::views::counted or another view as parameter type,
std::span as parameter type for plain arrays,
std::string_view as parameter type in favour of char const*, or
For further explanation and examples, see
To avoid the vulnerability or mitigate its ill effects, C++ software developers can:
Use the avoidance mechanisms of ISO/IEC 24772-1 clause 6.8.5.
Avoid C-style arrays. If unavoidable, guidance for the use of C-style arrays is provided in TR 24772-3 clause 6.8.2.
Avoid container functions, such as data(), that bypass the safety features of the respective containers.
To model a fixed-size array, use a library class such as std::array.
To model arrays with dynamically changing size, use containers of the standard library, such as std::vector or std::deque.
Avoid using a pointer parameter or a pointer-and-size parameter pair for representing a contiguous buffer; instead use a range parameter, for example, std::views::counted, std::span, or std::string_view.
Prefer using range-based or iterator-based algorithms, such as those of the standard library, over pointer-manipulating or indexing loops.
Use the range-based for loop construct to iterate within the defined bounds of a range.
Ensure that ranges and iterators used by range-for or passed to algorithms are and remain valid.
When performing random access by indexing, use the avoidance mechanisms of clause 6.9.2.
Use static analysis tools to detect buffer boundary violations.