moving data - a visual guide to x86-64 assemly
It’s often said that assembly language is complex.
After all, there’s a reason why high-level languages and compilers exist—to spare us the intricacies, right?
But while it’s true that you would have a hard time writing a large project in assembly, the language itself
is remarcably simple. That’s because Assembly is the language of the processor,
and at it’s essence, all the processor does is moving data.
This guide is not about writing assembly; it’s about understanding the way memory moves behind the scenes when you execute a program. We’ll use concrete examples for the x86-64 architecture, but these informations apply eveywhere and are foundamental knowledge for reverse engineering, binary exploitation, or simply program debugging
what is data?
Data is just bits, representing information. A sequence of bits can encode any kind of information, however this article will only focus on text and integers.
But before we talk about any kind of encoding, we have to introduce a new notation:
The issue is that while circuits understand sequences of bits very well, humans don’t.
For example, can you tell the difference between
1101010101111110
and 1101010101111110
?
Show answer
Ok, the two sequences are identical, but I bet you couldn’t immediately see that.
In order to visualize binary data in a more human friendly way, we use
hexadecimal numbers, which associate a number or a letter
between A and F to a group of 4 bits.
A long sequence of bits can be represented in this way:
0010 0101 0111 1101 1111
2 5 7 d f
Note that in order to avoid confusion with decimal numbers, it’s common to prefix
hexadecimal numbers with 0x
.
For example, 0x1234
is not the same
thing as the decimal number 1234
.
I’m not going to explain how conversions between decimal, binary, and hexadecimal numbers work,
The only assumption i’m making in this article is that you know that.
If you have a python terminal, you can perform these conversions very easily:
al@thinkpad:~/$ python
>>>
>>> 0b0010 #print the binary number 0010 in decimal
2
>>> 0x1234 #print the hex number 1234 in decimal
4660
>>> hex(0b00100101011111011111)
'0x257df'
>>> hex(4660)
'0x1234'
One last convention you should know is that a group of bits has a name, based on how long it is:
N. of bits | example hex value | name |
---|---|---|
4 | f | nibble |
8 | ff | byte |
16 | ffff | word |
32 | fffffff | dword (double word) |
64 | fffffffffffff | qword (quadruple word) |
text
You probably know
that there are different ways to encode text. The simplest encoding is ascii,
where text is stored as a sequence of bytes
and every byte represent a letter.
For example, the letter ‘c’ is stored as the byte 0x63
,
The letter ‘o’ is 0x6f
,
The text ciao
is stored as the sequence of bytes 63 69 61 6f
.
You can find a table of all ascii letters in the
linux man pages.
Text is a very good example of how data is usually encoded in the same order as you normally write it. For example, ‘ciao’ is stored as the byte for c, followed by the byte for i, and so on. You will see that this is not the case with numbers.
numbers
Integers in the x86-64 architecture are stored in little endian format.
Let’s take a decimal number, for example 3405691582:
Converted to binary, that number is 0xcafebabe
.
Unlike what we saw with text, those bytes will not be stored in order.
Instead, little endian architectures will take a number, split it by bytes,
and store those bytes in a reversed order.
0xcafebabe
will be store as:
be ba fe ca
For example, let’s say that you encounter the byte sequence 02 ff 00 00 00 00 00 00
,
what decimal number is it?
Show answer
- we need to take all the bytes, in reverse order. that’s
0x000000000000ff02
- exaclty like in decimal, leading zeros are meaningles, so we can remove them. we get
0xff02
- using python, we can convert it to decimal. we get:
65282
where is data?
Now that we know how to represent text and numbers, we need some place to store them. Like all kind of data, we can store it in only two places:
- in memory, which means in your RAM
- in registers, which are special containers inside your CPU
memory
Memory is just a very long list of contiguous cells, each containing 8 bits of information, and reachable by a numeric address.
Since printing a long list of bytes would take a lot of space, when visualizing memory we usually group bytes in rows of 8 or 16. It’s also common to include a column to the side that shows the ascii letter associated to each byte.
The memory dump below was taken from a program that was running on my computer. Use the slider to adjust the number of bytes you wanto to show in a row.
showing 1 byte per row
registers
Registers are containers for data, located inside your CPU. The x86-64 architecture has a lot of registers, each with an associated name. Some of them have a specific purpose, other are generic containers we can use in our program. We mostly interact with these one:
03 02
01 00
As you can see from the table above, registers that start with the prefix r
can store 8 bytes of data.
There are registers that give you access to the lower bytes of larger registers.
for example eax
gives you access to the lower 4 bytes of rax
.
moving data
The instruction mov
moves data around. It can move data from a register to another,
from a register to memory, or vice-versa from memory to a register
These first examples are self-explanatory:
mov rbx, 0x10 #copies an integer into rbx
mov rax, rbx #copies the content of rbx into rax
Moving data to memory requires some extra syntax:
Let’s say that our goal is to write the byte 0xff
in the memory cell at address 0x10
.
- First, we put in a register
0x10
, the address to the cell we want to write. - Then we perform a mov instruction with square brackets around the register name, to indicate that we want to move
0xff
in the memory address pointed by the register, and not into the register itself.
mov rax, 0x10
mov byte ptr [rax], 0xff
What if we wanted to write to memory more than one byte, for example the whole 8 bytes contained in rbx
?
This requires a variation of the ptr
syntax, to specify that we want to write a sequence of 8 bytes to memory.
The interactive example below shows all the possible variations of the ptr
syntax.
You can click “run” to see how the memory is affected
mov rbx, 0x4242424242424242
mov rax, 0x20
mov ptr [rax], bl #note: bl is the first byte of register rbx
the stack
x64, like most architectures, has the concept of stack: an area in memory pointed
by the special register rsp
.
You can add or remove elements from the top of the stack by using the
push
and pop
instructions. This is the most common interaction, but it’s also valid to directly adjust the value of rsp
.
This interactive example allows you to experiment with push
and pop
.
The stack area is highlighted in blue and displayed below, togehther with the contents of the rsp
and rax
registers.
mov rax, 0x4242424242424242
rax
There are two key elements you should notice by plaing with the example above:
rsp
points to the top of the stack. It is decreased by 8 when we push a value, and increased by 8 when we pop a value.- Every time we pop a value from the stack that value is not deleted, the area of memory that contains it
simply stops being part of the stack. The only thing that changes is the memory address pointed by
rsp
.
Basically, push rax
does the same as the following code:
sub rsp, 8
mov qword ptr [rsp], rax
And pop rax
does the same as the following code
mov rax, qword ptr [rsp]
add rsp, 8
There is a confusing element here: when we put something onto the stack we are growing the stack, and yet we are moving towards lower addresses of memory.
With the way we visualize memory this actually looks correct, the stack is growing
towards the top.
But if we only look at the numeric adresses of elements on the stack newer elements have smaller addresses,
which looks backwards.
Even when you are aware of this, it’s common to get confused
and end up thinking:
“i put a new value on the stack, and it has a smaller address than the previous value, how it is possible?”
Further Reading
This article is still under development. My goal is to improve it over time, adding a section that explains stack frames and integrations with a real x86_64 emulator.
If you reached this point, you might be interested in these additional resources:
- pwn.college’s assembly module and lectures https://pwn.college/fundamentals/assembly-crash-course
- the compiler explorer website https://godbolt.org/z/c6brc1df9
- the official x86_64 reference https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html
- unofficial x86_64 instructions reference https://www.felixcloutier.com/x86/
- simply the best linux syscall table reference https://syscalls.mebeim.net/?table=x86/64/x64/latest