
Memory layout

Memory management is one of the most important topics for a programmer, so understanding the memory layout of a C program, and of the process it becomes, is essential.

High-level languages such as Java, Python, and C# manage memory largely on the programmer’s behalf: a garbage collector reclaims allocated memory once it is no longer in use. C and C++ have no garbage collector, so the programmer must release dynamically allocated memory manually.

A C program is first compiled and linked into an executable object file. When the executable is run, the operating system loads it into main memory (RAM), and the CPU executes its instructions.

If you are not aware of the processes involved in compiling the C program from source to binary, read C Program Compilation Process.

The Typical Memory Layout of a C Program consists of the following segments:

  1. Command Line Arguments
  2. Stack
  3. Heap
  4. Uninitialized Data Segment (BSS)
  5. Initialized Data Segment
  6. Text/Code Segment

[Figure: memory layout of a C program]

The segments above can be broadly classified into two groups:

  1. Static Memory Layout – Text/Code, Data Segments
  2. Dynamic Memory Layout – Stack & Heap

The executable file already contains some of these segments; the others are built dynamically at runtime.

First, let’s discuss each segment of the memory layout in detail:

Static Memory Layout

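The static memory layout consists of three segments: the text/code segment, the initialized data segment, and the uninitialized (BSS) data segment. All three are described in the final executable object file of the C program. The text and the initialized data are stored in the file and copied into memory when the program is loaded; the BSS takes up no space in the file beyond its recorded size and is filled with zeros at load time.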

We can use the size tool to look at the static memory layout of a C program’s executable object file.

Let’s take a look:

example c src
#include <stdlib.h>

int main() { return 0; }
command and output
size a.out

   text    data     bss     dec     hex filename
   1136     512       8    1656     678 a.out

Text/Code Segment

The text (or code) segment holds the machine-level instructions of the final executable object file. It is one of the key parts of the static memory layout because it contains the program’s logic.

In the traditional layout, the text segment sits at lower addresses than the data segment and the heap. This placement shields the text section from being overwritten if the stack or the heap overflows.

The text section has only read and execute permissions, not write permission. This prevents accidental modification of the program’s machine code while it runs.

objdump -S a.out
a.out:     file format elf64-x86-64


Disassembly of section .init:

0000000000001000 <_init>:
1000: f3 0f 1e fa endbr64
1004: 48 83 ec 08 sub $0x8,%rsp
1008: 48 8b 05 d9 2f 00 00 mov 0x2fd9(%rip),%rax # 3fe8 <__gmon_start__@Base>
100f: 48 85 c0 test %rax,%rax
1012: 74 02 je 1016 <_init+0x16>
1014: ff d0 call *%rax
1016: 48 83 c4 08 add $0x8,%rsp
101a: c3 ret

Disassembly of section .text:

0000000000001020 <_start>:
1020: f3 0f 1e fa endbr64
1024: 31 ed xor %ebp,%ebp
1026: 49 89 d1 mov %rdx,%r9
1029: 5e pop %rsi
102a: 48 89 e2 mov %rsp,%rdx
102d: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
1031: 50 push %rax
1032: 54 push %rsp
1033: 45 31 c0 xor %r8d,%r8d
1036: 31 c9 xor %ecx,%ecx
1038: 48 8d 3d da 00 00 00 lea 0xda(%rip),%rdi # 1119 <main>
103f: ff 15 93 2f 00 00 call *0x2f93(%rip) # 3fd8 <__libc_start_main@GLIBC_2.34>
1045: f4 hlt
1046: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1)
104d: 00 00 00

0000000000001050 <deregister_tm_clones>:
1050: 48 8d 3d d1 2f 00 00 lea 0x2fd1(%rip),%rdi # 4028 <__TMC_END__>
1057: 48 8d 05 ca 2f 00 00 lea 0x2fca(%rip),%rax # 4028 <__TMC_END__>
105e: 48 39 f8 cmp %rdi,%rax
1061: 74 15 je 1078 <deregister_tm_clones+0x28>
1063: 48 8b 05 76 2f 00 00 mov 0x2f76(%rip),%rax # 3fe0 <_ITM_deregisterTMCloneTable@Base>
106a: 48 85 c0 test %rax,%rax
106d: 74 09 je 1078 <deregister_tm_clones+0x28>
106f: ff e0 jmp *%rax
1071: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
1078: c3 ret
1079: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)

0000000000001080 <register_tm_clones>:
1080: 48 8d 3d a1 2f 00 00 lea 0x2fa1(%rip),%rdi # 4028 <__TMC_END__>
1087: 48 8d 35 9a 2f 00 00 lea 0x2f9a(%rip),%rsi # 4028 <__TMC_END__>
108e: 48 29 fe sub %rdi,%rsi
1091: 48 89 f0 mov %rsi,%rax
1094: 48 c1 ee 3f shr $0x3f,%rsi
1098: 48 c1 f8 03 sar $0x3,%rax
109c: 48 01 c6 add %rax,%rsi
109f: 48 d1 fe sar %rsi
10a2: 74 14 je 10b8 <register_tm_clones+0x38>
10a4: 48 8b 05 45 2f 00 00 mov 0x2f45(%rip),%rax # 3ff0 <_ITM_registerTMCloneTable@Base>
10ab: 48 85 c0 test %rax,%rax
10ae: 74 08 je 10b8 <register_tm_clones+0x38>
10b0: ff e0 jmp *%rax
10b2: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
10b8: c3 ret
10b9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)

00000000000010c0 <__do_global_dtors_aux>:
10c0: f3 0f 1e fa endbr64
10c4: 80 3d 5d 2f 00 00 00 cmpb $0x0,0x2f5d(%rip) # 4028 <__TMC_END__>
10cb: 75 33 jne 1100 <__do_global_dtors_aux+0x40>
10cd: 55 push %rbp
10ce: 48 83 3d 22 2f 00 00 cmpq $0x0,0x2f22(%rip) # 3ff8 <__cxa_finalize@GLIBC_2.2.5>
10d5: 00
10d6: 48 89 e5 mov %rsp,%rbp
10d9: 74 0d je 10e8 <__do_global_dtors_aux+0x28>
10db: 48 8b 3d 3e 2f 00 00 mov 0x2f3e(%rip),%rdi # 4020 <__dso_handle>
10e2: ff 15 10 2f 00 00 call *0x2f10(%rip) # 3ff8 <__cxa_finalize@GLIBC_2.2.5>
10e8: e8 63 ff ff ff call 1050 <deregister_tm_clones>
10ed: c6 05 34 2f 00 00 01 movb $0x1,0x2f34(%rip) # 4028 <__TMC_END__>
10f4: 5d pop %rbp
10f5: c3 ret
10f6: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1)
10fd: 00 00 00
1100: c3 ret
1101: 66 66 2e 0f 1f 84 00 data16 cs nopw 0x0(%rax,%rax,1)
1108: 00 00 00 00
110c: 0f 1f 40 00 nopl 0x0(%rax)

0000000000001110 <frame_dummy>:
1110: f3 0f 1e fa endbr64
1114: e9 67 ff ff ff jmp 1080 <register_tm_clones>

0000000000001119 <main>:
1119: 55 push %rbp
111a: 48 89 e5 mov %rsp,%rbp
111d: b8 00 00 00 00 mov $0x0,%eax
1122: 5d pop %rbp
1123: c3 ret

Disassembly of section .fini:

0000000000001124 <_fini>:
1124: f3 0f 1e fa endbr64
1128: 48 83 ec 08 sub $0x8,%rsp
112c: 48 83 c4 08 add $0x8,%rsp
1130: c3 ret

Initialized Data Segment

All initialized global and static variables are stored in this section.

The data segment has read and write permissions, so the program can read and change the values of the variables stored there at runtime.

Let’s add some variables to our program:

#include <stdlib.h>

int number = 10;
char example = 'C';
int numbers[4] = {1,2,3,4};

int main() {
    return 0;
}
gcc main.c
size a.out
   text    data     bss     dec     hex filename
   1136     544       8    1688     698 a.out

After adding these variables, the data segment grew from 512 to 544 bytes, while the text and bss sizes stayed the same.

Uninitialized Data Segment (BSS)

The uninitialized data section, also known as the “bss” segment, is named after an old assembler pseudo-operation that stood for “Block Started by Symbol”.

The BSS segment contains all global and static variables that are not explicitly initialized. They are set to zero before the program starts running. This segment is placed above the data segment in the memory layout.

This segment also has both the read and write permissions.

#include <stdlib.h>

int a, b, c;
char ch;

int main() { return 0; }
size a.out
   text    data     bss     dec     hex filename
   1136     512      24    1672     688 a.out

This time the size of the bss segment increased from 8 bytes to 24 bytes, because we declared global variables but didn’t initialize them.

Dynamic Memory Layout

This is the runtime memory of the process and exists as long as the process is running.

[Figure: dynamic memory layout]

Stack

Program execution can take place without a heap memory, but not without a stack segment. This illustrates the importance of stack memory for the execution of a program.

The stack is a region of memory in the process’s virtual address space where data is added or removed in the Last-in-First-out (LIFO) order.

A new stack-frame is added to the stack memory when a new function is invoked. The corresponding stack-frame is removed when the function returns.

One thing to note here is that every function call gets its own stack frame, also known as an activation record.

The size of the stack is variable since it depends on the size of the local variables, parameters, and function calls. The Stack grows from a higher address to a lower address.

Every process has its own stack, whose maximum size is set by a configurable limit. The stack memory is reclaimed by the OS when the process terminates.

On Linux, the ulimit -s command shows the current (soft) limit on the stack size, in kilobytes.

ulimit -s
8192

Use the ulimit -a command to list all the limits together with their flags.

ulimit -a

-t: cpu time (seconds) unlimited
-f: file size (blocks) unlimited
-d: data seg size (kbytes) unlimited
-s: stack size (kbytes) 8192
-c: core file size (blocks) unlimited
-m: resident set size (kbytes) unlimited
-u: processes 127950
-n: file descriptors 1024
-l: locked-in-memory size (kbytes) 8192
-v: address space (kbytes) unlimited
-x: file locks unlimited
-i: pending signals 127950
-q: bytes in POSIX msg queues 819200
-e: max nice 0
-r: max rt priority 0
-N 15: rt cpu time (microseconds) unlimited

To find the limits of a running process in Linux, read the limits file in its /proc directory with cat /proc/<pid>/limits, putting the process ID in place of <pid>.

Create a C program with an infinite loop.

endless
int main() {
    while (1) {}
}

Run the executable in the background; the shell prints the process ID. Use that ID to read the limits of the process.

Afterwards, kill the background process, or it will run indefinitely.

./a.out&
[1] 74395
cat /proc/74395/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             127950               127950               processes
Max open files            1024                 524288               files
Max locked memory         8388608              8388608              bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       127950               127950               signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us

Let’s find the stack size limits using a C program.

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>

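int main(void) {
    struct rlimit lim;

    /* RLIMIT_STACK reports the soft (current) and hard (maximum) limits. */
    if (getrlimit(RLIMIT_STACK, &lim) == 0) {
        /* rlim_t is an unsigned type; cast to long to match %ld.
           On typical 64-bit Linux, an unlimited value then prints as -1. */
        printf("Soft Limit = %ld\n", (long)lim.rlim_cur);
        printf("Max Stack Size = %ld\n", (long)lim.rlim_max);
    } else {
        fprintf(stderr, "%s\n", strerror(errno));
        return 1;
    }
    return 0;
}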
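Soft Limit = 8388608
Max Stack Size = -1

The soft limit of 8388608 bytes matches the 8192 kilobytes reported by ulimit -s. The hard limit is printed as -1 because it is unlimited: the special value RLIM_INFINITY is an unsigned number with all bits set, which reads as -1 when printed with %ld.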

Let’s now see the stack memory layout and what the stack frame for a function contains.

Stack Memory Layout

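A stack frame typically contains four kinds of information:

  1. Parameters passed to the function (pushed in reverse order)
  2. The return address, i.e. where execution resumes in the caller
  3. The saved base pointer of the caller
  4. Local variables of the function

The return address and the base pointer each take 4 bytes on a 32-bit architecture and 8 bytes on a 64-bit architecture. The exact contents depend on the calling convention: on x86-64 System V, for example, the first six integer or pointer arguments are passed in registers rather than on the stack.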

stack layout example program
#include <stdio.h>

int sum(int a, int b) {
    return a + b;
}

float avg(int a, int b) {
    int s = sum(a, b);
    return (float)s / 2;
}

int main() {
    int a = 10;
    int b = 20;
    printf("Average of %d, %d = %f\n", a, b, avg(a, b));
    return 0;
}

[Figure: stack layout of the example program]

The frame of the function that is currently executing is always the topmost frame of the stack. The Frame Pointer, also called the Base Pointer, points to a fixed location in that frame: the slot where the caller’s base pointer value was saved when the function was entered. Parameters and local variables are addressed relative to it.

The Stack Pointer holds the address of the top of the stack, i.e. of the most recently pushed item.

The stack has automatic memory management for both allocation and de-allocation; the programmer does not allocate or free it explicitly. A function’s local variables are allocated when its stack frame is created and de-allocated when the frame is popped as the function returns. This also defines the lifetime of a local variable.

Stack Error Conditions

Let’s take a look at what errors we can face when dealing with the stack.

Stack Overflow

A stack overflow occurs when the stack grows past its maximum size, for example because of a very long chain of nested function calls.

What causes a stack overflow:

  1. Deep or unbounded recursion
  2. Large local arrays

Stack memory is limited in size, so storing large objects on it is not recommended.

Stack Corruption

Stack corruption occurs when we write more data into a stack buffer than it can hold and thereby overwrite neighbouring data on the stack.

Example:

stack corruption
#include <stdio.h>
#include <string.h>

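/* Deliberately unsafe: strcpy() does not check the size of the destination. */
void copy(const char *arg) {
    char name[10];
    strcpy(name, arg); /* overflows name if arg needs more than 10 bytes */
}

int main(int argc, char **argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s <name>\n", argv[0]);
        return 1;
    }
    copy(argv[1]);
    printf("Exit\n");
    return 0;
}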

The copy function in the code above declares a char array, name, of 10 bytes and copies the first command-line argument into it. If the user passes a string that needs more than 10 bytes (more than 9 characters plus the terminating null byte), strcpy() writes past the end of name and overwrites adjacent stack data, such as the saved base pointer or the return address. This is stack corruption.

Heap

As we’ve seen, the stack has a limited size, which makes it unsuitable for large data, and its allocation is not under our control. The heap solves this problem: it is a region of the virtual address space in which memory can be allocated and de-allocated on demand at runtime.

Unlike the stack, the heap has no automatic memory management; allocating and de-allocating heap memory is the programmer’s responsibility.

To use heap memory, we need the C standard library (glibc on Linux), which provides the functions to allocate and de-allocate it.

The malloc() and calloc() functions allocate a memory block from the heap segment, and the free() function returns a block that was allocated by malloc() or calloc() to the heap.

Under the hood, malloc() and calloc() obtain memory from the operating system, traditionally with the brk() and sbrk() system calls, which move the end of the heap (the program break). glibc also uses mmap() for large requests.

[Figure: brk/sbrk]

The functions malloc, calloc, realloc, and free are declared in the header file stdlib.h.

One factor to keep in mind is that we can only use pointers to address a heap memory block.

Now let’s see an example of how the heap memory is allocated and de-allocated.

malloc

malloc() allocates a memory block of the given size (in bytes) and returns a pointer to the beginning of the block, or NULL if the allocation fails. malloc() doesn’t initialize the allocated memory. If you read from the allocated memory before initializing it, you invoke undefined behavior, which usually means the values you read will be garbage.

calloc

calloc() allocates the memory and also initializes every byte in it to 0. If you read the allocated memory before writing to it, you’ll get 0, because calloc() has already cleared it.

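#include <stdio.h>
#include <stdlib.h>

void func(void) {
    int a = 10;                      /* lives on the stack */
    int *aptr = &a;                  /* pointer to stack memory */
    int *ptr = malloc(sizeof *ptr);  /* block on the heap */
    if (ptr == NULL) {               /* malloc() can fail */
        perror("malloc");
        return;
    }
    *ptr = 20;
    printf("Heap Memory Value = %d\n", *ptr);
    printf("Pointing in Stack = %d\n", *aptr);
    free(ptr);                       /* return the block to the heap */
}

int main(void) {
    func();
    return 0;
}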

[Figure: heap memory accessed through a pointer returned by malloc()]

The image above is a simplified picture of how heap memory is accessed after a malloc() call. It shows the integer value 20 stored in the 4 bytes of heap area allocated by malloc(), which is accurate only at the level of virtual addresses. The value physically resides in main memory (RAM): the MMU (Memory Management Unit) translates the virtual address of the heap block into a physical address whenever the value is written or read.

A heap memory block is not tied to any scope, so the programmer has to free the reserved space manually.