Fun with x86-64 assembly
Goal: Let’s compile a simple C program (using gcc) and then see if we can directly modify the binary to make a change in the program. This is something that might be done in a first- or second-year CS or SW Engineering course, but I didn’t take either of those. (Instead, in EE, I learned 68HC11 and MIPS assembly.)
Let’s first define our simple program. To avoid all dependencies, we don’t write to stdout but instead define an int, increment it in a function, and then return that value as the program’s exit code:
// To analyze machine code and manually modify.
void addTo(int *a) {
    *a += 1;
}

int main() {
    int a = 0;
    addTo(&a);
    return a;
}
When invoked, the program returns 1 as its exit code:
Let’s then use Compiler Explorer to see what the resultant assembly looks like:
addTo:
    push rbp                     ; save the caller's base pointer on the stack
    mov rbp, rsp                 ; set base pointer to stack pointer
    mov QWORD PTR [rbp-8], rdi   ; store the int pointer (first argument) on the stack
    mov rax, QWORD PTR [rbp-8]   ; load int pointer into rax
    mov eax, DWORD PTR [rax]     ; load the int value it points to into eax (low half of rax)
    lea edx, [rax+1]             ; compute rax + 1 and store the result in edx
    mov rax, QWORD PTR [rbp-8]   ; reload int pointer into rax
    mov DWORD PTR [rax], edx     ; store edx at the address in rax
    nop                          ; no-op (likely an artifact of unoptimized output)
    pop rbp                      ; restore base pointer from stack
    ret                          ; return to caller
main:
    push rbp
    mov rbp, rsp
    sub rsp, 16
    mov DWORD PTR [rbp-4], 0
    lea rax, [rbp-4]
    mov rdi, rax
    call addTo
    mov eax, DWORD PTR [rbp-4]
    leave
    ret
This was compiled using x86-64 gcc 11.4, which is the version on my system, with no flags/optimizations, so the assembly is more “readable” and maps nearly one-to-one onto the C source.
I’ve added comments explaining the addTo function. Some further explanation:
- The assembly syntax above is Intel syntax, not AT&T. This means, among other things, that the destination operand comes first, and then any source operand(s)!
- There are 16 general-purpose (GP) registers in the x86-64 architecture. The registers are 64 bits in size, but there are 32-bit, 16-bit, and 8-bit counterparts that are just the lower bits of the same registers. So the 32-bit register eax is just the lower 32 bits of the 64-bit rax register.
- This explains the line mov eax, DWORD PTR [rax]. It’s loading the 32-bit int value pointed at by rax into eax, which is just the lower 32 bits of rax. (Writing to a 32-bit register also zeroes the upper 32 bits of the full register, so rax now holds the zero-extended value.)
- The next instruction, lea edx, [rax+1], uses “load effective address” to compute the value in rax (or eax) plus 1 and store the result in edx, leaving rax itself unchanged. This constitutes the “meat” of our trivial function.
- NOTE: If you look at the optimized code (compiled with -O3 -march=native), the assembly is a lot shorter: the function is reduced to just two instructions, one of which is ret!
Now let’s look at the binary output alongside the assembly. It looks something like this: (Apologies for the screenshot)
I want to modify the binary directly to change the value by which we increment a. How can we do that? Let’s focus on the machine code for the instruction lea edx, [rax+1]:
The machine code is:
8d 50 01
Here’s what that means:
- 8d: This is the opcode for lea, or Load Effective Address, which can scale (shift) one register, add it to another, and then add in an offset/displacement. In this case, we only add in the offset/displacement of 1.
- 50: In binary this is 0101 0000. This is the “ModR/M byte”, which contains three fields:
  - First two bits, 01: The addressing mode. In this case, it’s just a base register plus an 8-bit displacement.
  - Next three bits, 010: Register operand OR extended opcode data. In our case, it specifies the destination register operand, edx.
  - Last three bits, 000: Second register operand OR addressing method. In our case, it specifies the base register, rax.
- 01: This is the 8-bit displacement value that we added to rax.
So, to modify the increment value, we just need to find the instruction 8d 50 01 in our binary, and change the last byte to whatever we want to increment a by!
If we open our binary in a hex editor, it turns out there’s a lot of other stuff in there, but we just have to do some searching to find the machine code relevant to our addTo function. When compiled with gcc (again with no flags/optimization) on Ubuntu 22.04.3 LTS running on WSL 2, it was somewhere around address 00001150:
Let’s modify the 01 to be 22 (hexadecimal), or 34 in decimal:
Then, re-running our program, we see 34 being returned as the exit code. Neat!