How memory is allocated
Summary
tl;dr man 2 brk
Last year when I was learning assembler, I was asking myself how to allocate memory without malloc. Usually memory is either allocated for us by our language, or we do it with new or malloc. But malloc is a library function, not a system call. How does malloc itself get memory from the kernel? To answer that we need to look at the layout of a program in memory.
On Linux amd64, every process gets its own 128 TiB virtual address space. The program code, global data, debugging information and so on are loaded at the bottom of that space, working ‘upwards’ (bigger numeric addresses). Then comes the heap, where we are going to allocate some memory. Where the heap ends is called the program break. Then there is a very large gap, which the heap will grow into. At the top of the address space (0x7fffffffffff) is the stack, which will grow downwards, back towards the top of the heap.
[Figure: virtual memory layout]
To allocate memory on the heap, we simply ask the kernel to move the program break up. The space between the old program break and the new program break is our memory. The system call is brk. First we have to find out where the break is now. The raw brk system call (unlike the C library wrapper of the same name) returns the current position of the break, so we simply have to call it. We pass it 0, which is an invalid value, so that it doesn’t change anything.
mov $12, %rax # brk syscall number
mov $0, %rdi # 0 is invalid, want to get current position
syscall
When that returns, the current position is in rax. Let’s allocate 4 bytes by asking the kernel to move our break up by four bytes:
mov %rax, %rsi # save current break
mov %rax, %rdi # move top of heap to here ...
add $4, %rdi # .. plus 4 bytes we allocate
mov $12, %rax # brk, again
syscall
We can now store anything we want at the address held in rsi, where we saved the start of our allocated space. Here is a full assembly program which puts “HI\n” into that space and prints it out: alloc.s. Compile, link, run:
as -o alloc.o alloc.s
ld -o alloc alloc.o
./alloc
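The alloc.s file itself is not reproduced here. As a rough sketch of what such a program could look like (not the original file: the fail label, the check on the value brk returns and the exit codes are assumptions added for illustration), something along these lines should work on Linux amd64:

```asm
# alloc.s (sketch): grow the heap by 4 bytes, store "HI\n" there, print it.
        .globl _start
        .text
_start:
        mov $12, %rax           # brk syscall number
        mov $0, %rdi            # 0 is invalid, so the break does not move
        syscall                 # rax = current program break

        mov %rax, %rsi          # save start of our new space (also the write buffer)
        lea 4(%rax), %rdi       # requested break = old break + 4
        mov $12, %rax           # brk, again
        syscall                 # rax = resulting break

        cmp %rdi, %rax          # assumption: treat "break below request" as failure
        jb fail

        movl $0x000a4948, (%rsi)  # little-endian bytes: 'H', 'I', '\n', 0

        mov $1, %rax            # write syscall number
        mov $1, %rdi            # fd 1 = stdout
        mov $3, %rdx            # three bytes: "HI\n"; rsi already points at them
        syscall

        mov $60, %rax           # exit syscall number
        mov $0, %rdi            # status 0
        syscall

fail:
        mov $60, %rax           # exit with status 1 if the break did not move
        mov $1, %rdi
        syscall
```

The syscall instruction only clobbers rax, rcx and r11, which is why rsi still holds the saved address when we get to the write call. Built with the commands above, this should print HI followed by a newline.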
To free memory, you do the opposite: you move the break back down. That allows the kernel to reuse that space. Happy allocating!
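For completeness, freeing can be sketched the same way. This is a minimal example under my own assumptions (the 4096-byte size, the use of rbx to remember the original break, and the exit status convention are all illustrative choices): it grows the heap, moves the break back to where it started, and exits with status 0 if the kernel reports the original break again.

```asm
# free.s (sketch): grow the heap, then give the space back.
        .globl _start
        .text
_start:
        mov $12, %rax           # brk syscall number
        mov $0, %rdi            # 0 is invalid, just report the current break
        syscall
        mov %rax, %rbx          # rbx = original break (preserved across syscalls)

        lea 4096(%rbx), %rdi    # allocate: move the break up by 4096 bytes
        mov $12, %rax
        syscall

        mov %rbx, %rdi          # free: move the break back to the original position
        mov $12, %rax
        syscall                 # rax = resulting break

        xor %edi, %edi          # assume success: exit status 0
        cmp %rbx, %rax          # is the break back where it started?
        je done
        mov $1, %edi            # otherwise exit status 1
done:
        mov $60, %rax           # exit syscall number
        syscall
```

It assembles and links the same way as alloc.s, and echo $? afterwards shows the exit status.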