1 
(8) Basics of the C++ Programming Language 
Nico Ludwig (@ersatzteilchen)
2 
TOC 
● (8) C++ Basics 
– Introducing CPU Registers 
– Function Stack Frames and the Decrementing Stack 
– Function Call Stacks, the Stack Pointer and the Base Pointer 
– C/C++ Calling Conventions 
– Stack Overflow, Underflow and Channelling incl. Examples 
– How variable Argument Lists work with the Stack 
– Static versus automatic Storage Classes 
– The static Storage Class and the Data Segment 
● Sources: 
– Bjarne Stroustrup, The C++ Programming Language 
– Charles Petzold, Code 
– Oliver Müller, Assembler 
– Rob Williams, Computer System Architecture 
– Jerry Cain, Stanford Course CS 107
3 
A little Introduction to CPU Registers 
CPU RAM 
ALU 
● RAM is relatively slow, but big. 
Rn 
... 
R2 
R1 
R0 
Registers 
● The Central Processing Unit's (CPU) registers are tiny compared to the RAM, but very fast. 
– There is a set of 4B or 8B general purpose registers and some dedicated registers. 
– The registers have electronic connections to the whole RAM. 
– Registers can read from RAM (update) and write to RAM (flush). 
● The Arithmetic Logical Unit (ALU) handles int arithmetics and logical operations. 
– The ALU has electronic connections to the registers. 
● As we're going to discuss the stack, which is managed solely by the hardware, we need to know a little more about the hardware in this respect.
● The shape of the ALU (like a Y) underscores the idea of having two or more 
operands and one result. 
● The basic parts of a CPU are the registers, the ALU and the control unit (CU), 
which controls program execution (the fetch-execute cycle). 
● The dimensions of the RAM and the registers in the graphic are not realistic. The registers are very much smaller than the RAM. The graphic shows less than a minimal microprocessor system; we concentrate only on the required details.
● Indeed there is a kind of memory hierarchy: The registers and the CPU caches are very small, but very fast (register: 0.x ns, cache: some ns) and made of very expensive static memory; RAM is of moderate size, speed and price (dynamic memory); and finally the memory of solid state drives (SSDs), magnetic and optical devices is huge, relatively slow and cheap.
● An upcoming computer-memory architecture is non-uniform memory access (NUMA), in which each CPU has local memory while all CPUs also share a common memory.
● The shown connections are part of the CPU-internal bus system. 
● The CPU-internal bus between the registers and the ALU defines the architecture-classification 
of the CPU (a 32b-bus makes a CPU a 32b-CPU as 32b can be 
processed in one CPU cycle). - Also the width of the external and internal data bus 
and of the registers plays a role in the classification. 
● Why are registers either 4B or 8B big? 
● It depends on the CPU, a whole data word should be storable by a register: 4B 
for a 32b machine, 8B for a 64b machine. 
● Normally arithmetic operations are not directly performed in the RAM. 
● Connecting the ALU directly to the memory would be slow and/or expensive.
● Actually registers are filled with data from the RAM (update), then taken to the 
ALU where the operation takes place, then the result is sent back to the 
registers and finally copied to the RAM (flush).
4 
Important CPU Registers (x86) 
● Generally there exist four general purpose data registers: 
– They can be freely used by the executing program. 
– EAX (AX, RAX), EBX (BX, RBX), ECX (CX, RCX) and EDX (DX, RDX). 
– Trivial names: accumulator, base register, counter register and data register. 
● Segment registers: 
– They store the "coordinates" or "bounds" of the segmented memory. 
– Code segment (CS), data segment (DS), stack segment (SS) and extra segment (ES). 
● We'll primarily deal with stack navigation and pointer registers in this lecture:
– Stack pointer (SP), base pointer (BP) and instruction pointer (IP). 
● Flags register: 
– The flags register signals carry-overs, overflows etc.
● There are more registers on a 64b CPU. Fewer 64b data items fit into the caches, so the caches fill up quite soon (therefore 64b CPUs have bigger caches). The acceleration is esp. noticeable on Intel Macs, where we get more registers; not so much on PowerPC Macs, where we already had the full set of registers - there the increased addressable memory is the primary benefit.
● Normally the types "pointer" and long are influenced by 64-bitness (LP64): some type field characters for string formatting must be modified (%d (32b only) -> %ld and %x -> %p), and sizeof should be used instead of constants such as 4 (e.g. when calling memory functions).
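● For illustration, a minimal 64b-clean sketch (my example, not from the original slides): portable format characters and sizeof instead of hard-coded 4s:
#include <cstdio>
#include <cstring>
int main() {
  long length = 42L;                      // long is 8B under LP64, 4B on a 32b platform.
  int values[4];
  std::memset(values, 0, sizeof(values)); // sizeof instead of the constant 16 (or 4 * 4).
  std::printf("length: %ld, values at: %p\n", length, (void*)values);
  return 0;
}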
5 
The Stack Frame of a Function 
● When A() is called, space for its (auto) locals will be allocated. 
– This memory block is called stack frame (sf) or activation record. 
– The sf is usually aggressively packed. 
● The stack segment is nearly empty at the start of the program, because only a few functions and (auto) locals exist then.
● Let's represent A()'s sf with the symbol on upcoming slides. 
void A() { 
int a; 
short b[4]; 
double c; 
B(); 
C(); 
} 
[Stack diagram: A()'s 20B sf in the stack segment, drawn as 4B-wide cells holding a, b and c; higher addresses at the top.]
● Why do we discuss the stack after we have 
discussed the heap? 
● Because the handling of the stack is more difficult to understand: the hardware algorithms that control the stack need to be understood. - Esp. the order of elements on the stack and the order of actions on the stack are very relevant! The heap is rather simple: the programmers are responsible for handling it, they have to define the conventions!
● The sf also contains memory for variables defined in nested blocks, but this can be optimized by some compilers. Before C99, variable definitions were only allowed at the beginning of a block, not in between the statements.
● The discussed stack is aligned, which means that some bytes are sacrificed in order to get simple access to stack elements whose addresses are a multiple of the word size. - This is compiler- and settings-specific, but the simplest model to explain the stack.
● This is a simple view of the sf, we'll refine it during 
the next slides.
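● To make the sf tangible, here is a small sketch (my addition, not from the slides) that prints the addresses of A()'s (auto) locals; the concrete order, packing and alignment are compiler- and settings-specific:
#include <iostream>
void A() {
  int a;
  short b[4];
  double c;
  // Inspect where the locals ended up within the sf.
  std::cout << "&a: " << &a << ", b: " << static_cast<void*>(b) << ", &c: " << &c << std::endl;
}
int main() {
  A();
  return 0;
}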
6 
The Call Stack of Functions on a decrementing Stack 
● The stack pointer (SP) points to the stack address of the currently active sf. 
– Calling A() decrements the SP by at least the size of A()'s sf.
● This depends on the platform, but decrementing is usual. 
● All sfs before A() was called are still existing! 
● "In" A() the SP is the offset for the (auto) local variables. 
– The addresses of local variables in a sf do usually shrink (e.g. for A(): (int)&a > (int)&b). 
● This is also platform dependent. 
● Call stack management: 
– Calling and returning from functions pushes/pops the stack.
– This leads to inc/dec of the SP. 
– The SP resides in a dedicated CPU register. => The stack is managed by hardware. 
void B() { 
int x; 
char* y; 
char* z[4]; 
C(); 
} 
void C() { 
double m[3]; 
int n; 
} 
void A() { 
int a; 
short b[4]; 
double c; 
B(); 
C(); 
} 
● Because the SP needs to be decremented for each stack 
variable, for each function call and for the sf construction 
this stack is called "decrementing stack". 
● As the stack grows to lower addresses (i.e. against free 
memory) it grows "against" the heap. The heap may grow 
to higher addresses (i.e. also "against" free memory). 
When the stack and heap meet each other in memory, 
memory is exhausted. Usually the stack is exhausted first. 
● In this example we can also inspect the nature of the stack 
as "last in first out" (LIFO) container. - The last item that 
was put into the stack will be the next item that will be 
taken from the stack. 
● Similar to the heap, the values being left by already 
popped sfs do stay in the memory as long as they have 
not been overwritten by the next function call, they are just 
no longer legally accessible. - This can also be a source of 
bugs. 
● Notice how the picture showing the call stack also makes clear how recursive functions can quickly consume much stack space and overflow.
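● A tiny sketch (my example) to observe the decrementing stack: with every recursive call a new sf is pushed, so the address of the local typically gets lower with the call depth (platform-dependent, as stated above):
#include <iostream>
void Recurse(int depth) {
  int local = depth;                 // One (auto) local per sf.
  std::cout << "depth " << depth << ": &local = " << &local << std::endl;
  if (depth < 3) {
    Recurse(depth + 1);              // Pushes another sf, the SP gets decremented.
  }
}
int main() {
  Recurse(0);
  return 0;
}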
7 
Function Arguments and the Stack 
● The argument values are stored on the stack from right to left. 
– And they are stored in the stack from higher to lower addresses. 
● The (other) local variables follow the arguments on the stack to lower addresses. 
– A function-call's first "activity" is to create space for arguments and locals on the stack. 
● A function stores from where it was called in the "saved program counter" (SPC). 
– "Between" arguments and local variables on the stack, the SPC (4B) will be stored. 
● Arguments, local variables and the SPC make up the full sf of a function. 
void A(int foo, int* bar) { 
char c[4]; 
short* s; 
//... 
} 
[Stack diagram: A()'s full sf (20B) from higher to lower addresses: bar, foo, SPC, then the locals c and s; higher addresses at the top.]
● On Reduced Instruction Set Computing (RISC) 
CPUs there exist so called "Register Windows" to 
project different stacks into the current stack frame 
with a single operation, so it's a fast way to pass 
arguments to functions. The general idea with 
RISC CPUs is to reduce memory access and 
stack operations. 
● There exist architectures that have no stack at all 
(we discuss only the ones having a stack).
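● A small sketch (my addition) to observe this layout: it prints the addresses of the arguments and of the locals; on the platforms discussed here foo and bar sit at higher addresses than c and s, with the SPC in between (the exact layout is platform- and convention-specific, e.g. arguments may be passed in registers):
#include <iostream>
void A(int foo, int* bar) {
  char c[4];
  short* s = 0;
  std::cout << "&foo: " << &foo << ", &bar: " << &bar << std::endl;
  std::cout << "c:    " << static_cast<void*>(c) << ", &s: " << &s << std::endl;
}
int main() {
  int i = 42;
  A(78, &i);
  return 0;
}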
8 
The Function Call – partial sf and Arguments 
int i = 42; 
A(78, &i); 
void A(int foo, int* bar) { 
char c[4]; 
short* s; //... 
} 
● When A() is called, a partial sf is created that contains just all the arguments. 
– (All actions under this bullet are done on the caller side.) 
– Arguments are stored on the stack from right to left and from higher to lower addresses. 
● The SP gets decremented for the size of all of the arguments. 
– When A()'s content is executed the SP contains the lowest relevant address. 
– The content of IP (the address after A()'s call or return address) is stored in the SPC. 
● On the callee side (in A()) the sf needs to be completed with the local variables: 
– A()'s (auto) locals are stored on the stack afterwards. 
● This decrements the SP for (4 * sizeof(char)) + sizeof(short*), i.e. for the size of both locals. 
– Then the function runs and "does its job". 
– (We ignore here: the registers that are used by A() will also be pushed on the stack.) 
● The caller needs to fill the "argument part" of the sf, because only the caller knows all the arguments. The callee needs to fill the "local auto part", because only the callee knows all the local auto variables.
● Normally the content of the SP register is stored 
in the base pointer (BP) register (also called 
environment pointer) in the function. From the BP 
then the offsets to the local variables are 
calculated. The SP contains the offset address 
(within the stack segment) to the next item in the 
stack during execution.
9 
The Function Call – Returning and Cleaning up 
● Before A() returns it increments the SP by 4 * sizeof(char) + sizeof(short*). 
– This clears the stack from the locals. 
● (The registers that have been used by A() will be popped from the stack.) 
● Then a potential return value is copied into the RV (EAX) register. 
● The function will return to the address stored in the SPC. 
– Also the IP and the SP will now "get back" their contents from before A() was called.
● Cleaning the stack from the arguments depends on the calling convention: 
– With __cdecl: the caller needs to pop them from the stack and to reset the SP. 
– With __stdcall: the callee needs to pop them from the stack and to reset the SP. 
– (We can use compiler specific keywords or settings to declare calling conventions.) 
● The calling convention __cdecl is a C/C++ compiler's default; __stdcall is the calling convention of the Win32 API, because it works better with non-C/C++ languages. __cdecl requires prefixing a function's name with an underscore (this is the exported name, on which the linker operates). A function compiled with __stdcall carries the size of its parameters in its name (this is also the exported name). - The size of the parameters needs to be encoded, because the caller must know how many bytes the callee removes: if a __cdecl caller treated a __stdcall function as __cdecl, the __stdcall function would clean the stack and, after it returns, the caller would clean the stack again. The naming of the exported symbol of __stdcall functions allows the caller to know how many bytes have already been removed by the __stdcall function. Carrying the size in a function name is not required with __cdecl, because the caller cleans the stack itself. - This feature allowed C to handle variadic functions with __cdecl (nowadays the platform independent variadic macros can be used in C and C++).
● Other calling conventions: 
● pascal: This calling convention copies the arguments to the stack from left 
to right, the callee needs to clean the stack. 
● fastcall: This calling convention combines __cdecl with the usage of 
registers to pass parameters to get better performance. It is often used 
for inline functions. The callee needs to clean the stack. The register 
calling convention is often the default for 64b CPUs. 
● thiscall: This calling convention is used for member functions. It combines 
__cdecl with passing a pointer to the member's instance as if it was the 
leftmost parameter. 
● In this example the RV (EAX on x86) register can only store values of 4B. In 
reality the operation can be more difficult. 
● For floating-point results the FPU's stack (ST0) is used.
● User defined types (e.g. structs) are stored to an address that is passed 
to the function silently. 
● It is usually completely different on micro controllers.
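● As a sketch of such compiler-specific keywords (assumption: an MSVC-style compiler; GCC/Clang use attributes like __attribute__((stdcall)) instead, and most 64b targets ignore these keywords):
// Declarations only, to show the syntax; not portable.
int __cdecl AddCdecl(int a, int b);     // Caller cleans the stack; variadic functions possible.
int __stdcall AddStdcall(int a, int b); // Callee cleans the stack; used by the Win32 API.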
10 
Stack Overflows – Simple Example 
void Foo() { 
int i; 
int array[4]; 
for (i = 0; i <= 4; ++i) { 
array[i] = 0; 
} 
} 
● Because we run over the boundaries of array we modify other parts of the stack. 
– So array[4] is *(array + 4) and i's content resides there and i will be set to 0 again. 
– When i is 0 the for loop starts again... 
● This particular buffer overflow is rather harmless, it just ends in an infinite loop.
– But it does damage the stack! 
[Stack diagram: from higher to lower addresses: SPC, i, array[3], array[2], array[1], array[0]; array[4] overlaps i.]
● Can anybody spot the error in Foo()?
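● For completeness, a corrected sketch (my addition): stop at the last valid index, or better derive the bound from the array itself:
void Foo() {
  int i;
  int array[4];
  for (i = 0; i < 4; ++i) {          // i <= 4 wrote array[4], i.e. *(array + 4).
    array[i] = 0;
  }
  // Alternatively: for (i = 0; i < sizeof(array) / sizeof(array[0]); ++i) ...
}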
11 
Points to keep in Mind about Functions 
● Generally functions accept and return values from and to the stack. 
● The required memory for calling a function is called stack frame (sf). 
– The stack frame is created when a function is called. 
● By default the values of the arguments and the return value are copied. 
– The default in C/C++ is call by value. 
● The function calling details depend on the calling convention: 
– It defines how arguments are being copied (order) to the stack or to registers. 
– It defines who's responsible to pop arguments from the stack. 
– It defines who's responsible to reset the SP. 
● Recursive functions can consume many sfs (call stacks) and can quickly overflow. 
● Some compilers (and languages like F#) are able to optimize tail recursion. Tail recursion means that if the last statement of a function is the recursive call, the call can be done w/o growing the stack to store auto variables (incl. parameters).
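● A small sketch (my example) of a tail-recursive function: the recursive call is the very last action, so a compiler that performs this optimization can reuse the current sf instead of pushing a new one (C/C++ compilers may do this, but it is not guaranteed):
// SumTail(4, 0) yields 10; the accumulator carries the intermediate result.
int SumTail(int n, int accumulator) {
  if (n == 0) {
    return accumulator;
  }
  return SumTail(n - 1, accumulator + n); // Nothing left to do after this call.
}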
12 
Stack Overflows/Overrun and Underflows/Underrun 
[Stack diagram: the arguments bar and foo and the SPC above the SP, the locals c and s at/below it; writing above the frame is marked as overflow, accessing below the SP as underflow; higher addresses at the top.]
● The SP can be used as offset to access the (auto) locals and function arguments. 
– In "negative" below-the-SP-direction we can access (auto) locals. 
– In "positive" above-the-SP-direction we can access the SPC and arguments. 
● Stack overflow and underflow mean that stack pushes and pops are unbalanced. 
– Writing the stack above the SP (too many pushes) is called stack overflow.
– Writing the stack below the SP (too many pops, SP - sizeof(locals)) is called stack underflow.
● Both effects are downright errors that are meanwhile prevented at run time.
– But... in the past (and until today!) these have been exploited for... exploits.
● What is an exploit? 
● A stack overflow leads to overwriting already used 
stack memory, a stack underflow means that stack 
content that is not used by "us" is read. 
● It should be said that for the following examples to 
compile and run many stack protections needed to 
be deactivated on the compiler level. If the 
protections remained activated, the compiler 
would add stack guard elements into the code and 
we would get run time errors, before the stack 
violation could get effective and dangerous.
13 
Stack Overflows – Effects with different Byte Orders 
void Foo() {
int i;
short array[4];
for (i = 0; i <= 4; ++i) {
array[i] = 0;
}
}
● Because we run over the boundaries of array we modify other parts of the stack.
– Now we have a short array with a different stack layout than in the last example.
– So array[4] is *(array + 4); i resides at that location, and i's lower-addressed 2B are set to 0.
– On a big endian system nothing happens; those 2B (the most significant ones) are already 0.
– On a little endian system those lower 2B hold the 4, and this 4 will be set to 0.
– => An infinite loop will only happen on a little endian system.
● This is of course a nasty problem, as we have to deal with different effects on different machines with the same source code.
[Stack diagram: from higher to lower addresses: SPC, i, array[3..0]; array[4] overlaps i's lower-addressed 2B.]
● This is an example of how problems are silently 
emerging.
14 
Stack Overflows – Leading to a never ending Recursion 
void Foo() { 
int array[4]; 
int i; 
for (i = 0; i <= 4; ++i) { 
array[i] -= 4; 
} 
} 
● Same error, but array is now on a higher address than i, and the elements are decremented by 4. 
– When i reaches the value 4, erroneously the SPC is addressed! 
– Then the content of the SPC (i.e. Foo()'s return address) is decremented by 4. 
– The SPC – 4 is exactly the address from where Foo() was called! 
– The new return address in the SPC will now return to the call address of Foo()! 
– Finally Foo() will be called again. (The -4 is a "negative one instruction" in our case.) 
– => It will end (or never end) in an infinite call chain. 
Foo();
[Stack diagram: from higher to lower addresses: SPC, array[3], array[2], array[1], array[0], i; array[4] overlaps the SPC.]
● This effect is present in our memory model. 
Whether this effect emerges is highly dependent 
on our platform (e.g. calling convention). Some 
runtimes can spot the error on the stack (e.g. 
gcc/OS X). - Nevertheless it is an error!
15 
Stack Overflows – Stack Channelling 
● After we have called DeclareAndInitArray() a part of the sf still has the old values!
– Keep in mind that only the SP is moved on stack pops, the stack is never "cleared". 
● The function PrintArray() has exactly the same stack layout. 
– So the locals (also i) have the same values that DeclareAndInitArray() has left! 
● (It has nothing to do with the locals having the same names each!) 
● This effect is called channelling. 
void DeclareAndInitArray() 
{ 
int a[100]; 
int i; 
for (i = 0; i < 100; ++i) 
{ 
a[i] = i; 
} 
} 
DeclareAndInitArray(); 
PrintArray(); 
// >0 
// >1 
// >... 
// >99 
void PrintArray() 
{ 
int a[100]; 
int i; 
for (i = 0; i < 100; ++i) 
{ 
std::cout<<a[i]<<std::endl; 
} 
} 
● Stack channelling is interesting for hardware-near code such as we find in drivers.
● It should be said that all these manipulations on 
the stack can still lead to undefined behavior. This 
is because we are often about to write memory 
that is not owned by us, and also mind that the 
stack could be differently organized on different 
platforms (e.g. no decrementing stack).
16 
Variable Argument Lists 
char buffer [10]; 
std::sprintf(buffer, "%d %d", 4, 4); // Four arguments. 
std::sprintf(buffer, "%d + %d = %d", 4, 4, 8); // Five arguments. 
● How can we call std::sprintf() with different argument lists?
– Actually we could pass any number of arguments on the right side, as long as they match the format string.
– The function std::sprintf() does not use overloads, but it has a variable argument list. 
● How does it work? 
int sprintf(char* buffer, const char* format, ...); 
– The compiler calculates the required stack depending on the arguments and decrements the SP by the required offset. 
– As arguments are laid down on the stack from right to left, the buffer is on offset 0. 
– And the format is always on offset 1. 
– Then the format is analyzed and the awaited offsets are read from the stack. 
● In this case an offset of 4B for each int passed in the variable argument list. 
● All standard C/C++ functions have the calling 
convention __cdecl. Only __cdecl allows variable 
argument lists, because only the caller knows the 
argument list and only the caller can then pop the 
arguments. __stdcall functions execute a little bit faster than __cdecl functions, because the stack need not be cleaned on the caller's side (i.e. at every call site of a __stdcall function).
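● A minimal sketch (my example, not std::sprintf() itself) of how a variadic function walks its argument list with the va_* macros from <cstdarg>:
#include <cstdarg>
#include <iostream>
int SumOfInts(int count, ...) {      // count tells us how many ints follow.
  va_list args;
  va_start(args, count);             // Start reading after the last named parameter.
  int sum = 0;
  for (int i = 0; i < count; ++i) {
    sum += va_arg(args, int);        // Fetch the next int-sized argument.
  }
  va_end(args);
  return sum;
}
int main() {
  std::cout << SumOfInts(3, 4, 4, 8) << std::endl; // >16
  return 0;
}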
17 
The Mystery of returning C-String Literals 
● We know that we can't return pointers to stack elements from a function. 
– The pointers are meaningless to the caller, as the memory is already stack-popped: 
int* GetValues() { // Defining a function that returns a pointer to 
int values[] = {1, 2, 3}; // the locally defined array (created on stack). 
return values; // This pointer points to the 1st item of values. 
} 
//------------------------------------------------------------------------------------------------------ 
int* vals = GetValues(); // Seman. wrong! vals points to a
std::cout<<"2. val is: "<<vals[1]<<std::endl; // discarded memory location. 
// The array "values" is gone away, vals points to its scraps, probably rubbish! 
● But c-string literals can be legally returned! - How can that work? 
const char* GetString() { // Defining a function that returns a c-string literal. 
return "Hello there!"; 
} 
//------------------------------------------------------------------------------------------------------ 
const char* s = GetString(); 
std::cout<<"The returned c-string is: "<<s<<std::endl; // Ok! 
// >"The returned c-string is: Hello there!".
18 
The static Storage Class 
● We discussed the automatic storage class. 
– It makes up the stack of functions and stores (auto) local variables. 
– It allows passing arguments to functions and returning results from functions. 
● We discussed dynamic memory. 
– It allows us to deal with memory manually and gives us full control. 
● Is this all? No! We forgot an important aspect, an important memory portion! 
– Where are global and free objects stored? 
– Where are literals of primitive types, esp. c-string literals stored? 
● => These are stored in the static memory, defined by the static storage class. 
● Dynamic memory is not an explicit storage class 
in C/C++.
19 
Static Objects, local static Objects and the C/C++ Linker 
● Local statics are global variables with a local scope. (Sounds weird, but it's true.) 
● Local static objects are used rarely: Their usage leads to "magic" code. 
● The C/C++ linker is responsible for static objects. 
– It'll initialize all uninitialized statics to 0. Always! 
– Maybe it'll optimize equal c-strings literals together with the compiler (string pooling). 
– It'll prepare to store readonly statics (literals) in the data segment. 
– So: Many static objects may prolong the link process. 
● The runtime will init statics at startup time; all statics are destroyed on shutdown. So: Many static objects may prolong the startup and shutdown time.
void Foo() { 
// A static local int. (Not an auto local int!) 
static int i; 
} 
● Why string pooling? 
● Because it can reduce the size of the resulting 
executable! 
● The initialization/destruction strategy of non-local 
statics should be clear. Why? 
● Well, "globals" need to be initialized before the 
program runs and destroyed when the program 
ends. 
● So: all statics have the lifetime of the program! 
● The initialization order of non-local statics is 
undefined (it often depends on the link procedure), 
but some standard C++ objects like std::cout and 
std::cin are guaranteed to be initialized before any 
user defined non-local is initialized.
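● A tiny sketch (my example) to observe string pooling: with pooling the compiler/linker may map identical c-string literals to one object in static memory, so the pointers can compare equal - this is implementation-specific, not guaranteed:
#include <iostream>
int main() {
  const char* a = "Hello there!";
  const char* b = "Hello there!";
  std::cout << "pooled: " << std::boolalpha << (a == b) << std::endl;
  return 0;
}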
20 
Memory Segmentation – The Data Segment 
● C/C++' static memory resides in the data/BSS segment during run time. 
– To make this work the C/C++ linker will reserve space in an o-file's data/BSS section. 
void Foo() { 
static int i; 
} 
[Diagram: the C/C++ compiler and linker reserve space in Main.exe's (Win32 PE) .data/.BSS section; at run time this becomes the data/BSS segment, living beside the code segment and the heap and stack segments.]
const char* Boo() { 
return "Hello there!"; 
} 
namespace Nico { 
const int MAGIC_NUMBER = 42; 
} 
"Hello there!" .data + 4 42 Nico::MAGIC_NUMBER 
0 i "Hello there!" .data + 4 42 Nico::MAGIC_NUMBER 
● Keeping data and code in the same memory is an important 
aspect of the "von Neumann architecture". 
● The .BSS section/segment (historical abbreviation for Block 
Started by Symbol) is a part of the .data section/segment that 
is dedicated to static/global objects that are not explicitly 
initialized by the programmer (like i). 
● The presentation of this memory is a simplified version of real 
mode memory, where the memory separation into data and 
code segment was introduced. Based on the real mode, the
protected mode was developed: If code tried executing data in 
the data segment, the CPU would issue a hardware interrupt 
that would immediately stop program execution. 
● Modern OS' also use the protected mode, but with a flat 
memory model, where all segments reside in the same linear 
address range. So, the above mentioned segment based 
protection doesn't work. Instead of segments, OS' rely on 
pages. As pages can only be marked as being readonly or 
read/write, additional information was needed to mark code as 
being not executable. - The No eXecute (NX) bit was 
introduced by AMD (at AMD it is also called Enhanced Virus 
Protection (EVP), Intel calls it eXecute Disable (XD) bit and 
Microsoft calls it Data Execution Prevention (DEP)). - Trying to 
execute code in "NX-memory", will again issue a hardware 
interrupt. Other CPU manufactures (e.g. IBM/PowerPC) had 
similar technologies much earlier.
21 
Practical Example: automatic versus static Storage Class 
void Boo() { 
auto int i; // Using the (in this case) superfluous keyword "auto". 
static int s; 
++s; 
std::cout<<"s: "<<s<<", i: "<<i<<std::endl; 
} 
Boo(); // statics are 0-initialized, autos are uninitialised: 
// >s: 1, i: -87667 
Boo(); // statics survive a stack frame, autos get popped from the stack: 
// >s: 2, i: 13765 
● Summary: an automatic versus a static storage class object: 
– We can define static objects in our functions and those will "survive the stack". 
● I.e. they survive a function's stack frame. Global, local and constant statics live in the data segment. 
● In contrast to auto variables, which live on the stack!
– The C/C++ linker initializes static objects and their members with 0.
● Automatic variables are not getting initialized automatically!
– Therefore we'll often hear about automatic and static storage duration.
● In C/C++ there exist the following storage classes: auto, static, register, extern and mutable. Esp. the storage classes extern and mutable need more discussion in future lectures.
22 
Thank you!
