This transcript will help you understand the calling conventions and the ABI.
It was taken on a 64-bit system; do the same on a 32-bit system and observe
the differences.

For these exercises, it is important that you learn to read disassembly and
understand the basics of the x86 platform. A gentle introduction is provided
in the "Solaris on x86" book in the course directory. If you are not familiar
with x86, read Chapters 2 and 3 and work through the examples in Chapter 3 with
your own compiler, objdump, and debugger! (You should skip the SPARC- and
Solaris-specific examples for now, but look at how C structs and control-flow
constructs are compiled.)

We spoke about privilege rings and memory translation in ia32 and x86-64 in
class. Gustavo Duarte's blog makes nice introductory reading, with beautiful
pictures of the relevant data structures.

  http://duartes.org/gustavo/blog/post/memory-translation-and-segmentation/
  http://duartes.org/gustavo/blog/post/cpu-rings-privilege-and-protection/

Chapter 4 of the "Solaris on x86" book has more details about these topics.

If you are already familiar with 32-bit assembly, a good if somewhat terse
introduction to what is new in 64-bit can be found at
http://www.x86-64.org/documentation/assembly.html

-----[ A system call in 64 bits ]-------

sergey@ubuntu64:/home/sergey$ cat hello.c
#include <stdio.h>

int main()
{
  printf("Hello!\n");
  return 42;
}

Before we start, observe the results of stopping compilation after the
preprocessing stage (gcc -E) and after the assembly generation stage (gcc -S):

sergey@ubuntu64:/home/sergey$ gcc -E hello.c | less
sergey@ubuntu64:/home/sergey$ gcc -S hello.c    // makes hello.s

We'll ignore the .cfi_* directives for now; these emit the DWARF call frame
information used for stack unwinding (by debuggers and exception handling).

sergey@ubuntu64:/home/sergey$ cat hello.s | grep -v .cfi
        .file   "hello.c"
        .section        .rodata
.LC0:
        .string "Hello!"
        .text
        .globl  main
        .type   main, @function
main:
.LFB0:
        pushq   %rbp
        movq    %rsp, %rbp
        movl    $.LC0, %edi
        call    puts
        movl    $42, %eax
        popq    %rbp
        ret
.LFE0:
        .size   main, .-main
        .ident  "GCC: (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1"
        .section        .note.GNU-stack,"",@progbits

Note that gcc has replaced the printf() call with puts(): the string at .LC0
is stored without its trailing "\n", because puts() appends the newline itself.

The GNU assembler (gas) and then the linker (ld) turn this assembly code into
the machine code you see when you do

sergey@ubuntu64:/home/sergey$ gcc -Wall -o hello hello.c
sergey@ubuntu64:/home/sergey$ objdump -d hello

sergey@ubuntu64:/home/sergey$ cat exec.c
#include <unistd.h>

int main()
{
  char *const args[] = {"/bin/ls", NULL};  // see execv(3) for arguments
  execv("/bin/ls", args);
}

In the following code, understand the representation of args[]: where and
how are the pointers stored?

sergey@ubuntu64:/home/sergey$ gcc -S exec.c
sergey@ubuntu64:/home/sergey$ cat exec.s | grep -v .cfi_
        .file   "exec.c"
        .section        .rodata
.LC0:
        .string "/bin/ls"
        .text
        .globl  main               // created the symbol main
        .type   main, @function    //   in the symbol table, of type function; cf. readelf -a exec
main:
.LFB0:
        pushq   %rbp
        movq    %rsp, %rbp
        subq    $16, %rsp
        movq    $.LC0, -16(%rbp)
        movq    $0, -8(%rbp)
        leaq    -16(%rbp), %rax
        movq    %rax, %rsi
        movl    $.LC0, %edi
        call    execv
        leave
        ret
.LFE0:
        .size   main, .-main
        .ident  "GCC: (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1"
        .section        .note.GNU-stack,"",@progbits

Now I am going to walk the code to the actual point where the execve() system
call is made. Notice that I am cheating here, by linking the executable
statically (and with debug info). Most executables are linked dynamically
these days; static libraries no longer ship by default on most distributions.
This cheat allows me to bypass an extra layer of indirection introduced by
dynamic linking, though.

sergey@ubuntu64:/home/sergey$ gcc -static -g -o exec exec.c
sergey@ubuntu64:/home/sergey$ gdb ./exec
GNU gdb (Ubuntu/Linaro 7.3-0ubuntu2) 7.3-2011.08
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from /home/sergey/exec...done.
(gdb) b main
Breakpoint 1 at 0x40104c: file exec.c, line 5.
(gdb) r
Starting program: /home/sergey/exec

Breakpoint 1, main () at exec.c:5
5         char *const args[] = {"/bin/ls", NULL};
(gdb) disas main
Dump of assembler code for function main:
   0x0000000000401044 <+0>:     push   %rbp                   // standard function preamble
   0x0000000000401045 <+1>:     mov    %rsp,%rbp              //   for creating a stack frame
   0x0000000000401048 <+4>:     sub    $0x10,%rsp             //   and reserving stack space for it
=> 0x000000000040104c <+8>:     movq   $0x488be4,-0x10(%rbp)  // string address copied into local array args
   0x0000000000401054 <+16>:    movq   $0x0,-0x8(%rbp)        //   and the NULL
   0x000000000040105c <+24>:    lea    -0x10(%rbp),%rax       // this is the value of pointer "args"
   0x0000000000401060 <+28>:    mov    %rax,%rsi              //   passed as 2nd arg to execv()
   0x0000000000401063 <+31>:    mov    $0x488be4,%edi         // address of "/bin/ls", the 1st argument to execv()
   0x0000000000401068 <+36>:    callq  0x40cfc0               // 0x40cfc0 is the start of execv(), disassembled next
   0x000000000040106d <+41>:    leaveq
   0x000000000040106e <+42>:    retq
End of assembler dump.

// execv() is actually just a wrapper around execve() that adds a 3rd argument
// (the environment). See "man execv".

(gdb) disas execv
Dump of assembler code for function execv:
   0x000000000040cfc0 <+0>:     mov    0x2a8941(%rip),%rdx        # 0x6b5908   // 3rd arg to execve
   0x000000000040cfc7 <+7>:     jmpq   0x4445b0                   // 0x4445b0 is the start of execve()
End of assembler dump.
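
In C terms, the two instructions of execv above amount to roughly the
following. This is a minimal sketch, not glibc's source: my_execv is a made-up
name, and the sketch assumes that the value loaded into %rdx is libc's environ
pointer, which is what the execv manpage describes.

```c
#include <unistd.h>

extern char **environ;   /* the process environment, maintained by libc */

/* Hypothetical stand-in for execv(): forward to execve(), supplying the
   current environment as the third argument. */
int my_execv(const char *path, char *const argv[])
{
    return execve(path, argv, environ);
}
```

Like the real thing, my_execv returns only on failure (with -1).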
(gdb) disas execve
Dump of assembler code for function execve:
   0x00000000004445b0 <+0>:     mov    $0x3b,%eax                  // number of syscall table entry for execve
   0x00000000004445b5 <+5>:     syscall
   0x00000000004445b7 <+7>:     cmp    $0xfffffffffffff000,%rax    // See "RETURN VALUE" in "man execve"
   0x00000000004445bd <+13>:    ja     0x4445c1                    // error: jump to <+17>
   0x00000000004445bf <+15>:    repz retq
   0x00000000004445c1 <+17>:    mov    $0xffffffffffffffb0,%rdx
   0x00000000004445c8 <+24>:    neg    %eax
   0x00000000004445ca <+26>:    mov    %eax,%fs:(%rdx)             // errno being set
   0x00000000004445cd <+29>:    or     $0xffffffffffffffff,%rax    // "-1" is returned
   0x00000000004445d1 <+33>:    retq
End of assembler dump.

execve does not return on success, so what you see after the syscall
instruction is the error-handling logic: setting errno and returning -1, as
per the manpage.

Note the 'syscall' instruction. This is the optimized system call instruction
of x86-64: it jumps directly to the system call dispatcher entry in the
kernel, at the same time setting the code privilege level to Ring 0, without
going through the interrupt mechanism. The kernel entry address is stored in
a Model-Specific Register (the LSTAR MSR). Recall that in ia32 the system call
used software interrupt 0x80 ("int 0x80"), and the entry address was taken
from entry 0x80 of the Interrupt Descriptor Table (IDT), which is pointed to
by the dedicated IDTR register. So the new 64-bit scheme has less indirection.

More about 64-bit system calls:
http://blog.rchapman.org/post/36801038863/linux-system-call-table-for-x86-64

Note the hex 3b above. This is the number (59 in decimal) of the execve system
call in the 64-bit table. The tables for 32-bit and 64-bit system calls are
completely independent, so the same call generally has a different number in
each (execve is 11 in the 32-bit table).

The sys_call_table used to be a straight-up array of function pointers.
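
To picture what "an array of function pointers" indexed by system call number
looks like, here is a toy user-space sketch. All the toy_* names are invented
for illustration; the kernel's real table, entry code, and handler signatures
differ.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical handler type: every toy "system call" takes three longs. */
typedef long (*toy_syscall_fn)(long, long, long);

static long toy_answer(long a, long b, long c)
{
    (void)a; (void)b; (void)c;
    return 42;
}

static long toy_add(long a, long b, long c)
{
    (void)c;
    return a + b;
}

/* The table itself: the call number is simply the array index. */
static const toy_syscall_fn toy_call_table[] = {
    [0] = toy_answer,
    [1] = toy_add,
};

#define TOY_NR_CALLS (sizeof toy_call_table / sizeof toy_call_table[0])

/* Dispatcher: bounds-check the number, then call through the pointer. */
long toy_dispatch(unsigned long nr, long a, long b, long c)
{
    if (nr >= TOY_NR_CALLS || toy_call_table[nr] == NULL)
        return -ENOSYS;    /* unknown call number */
    return toy_call_table[nr](a, b, c);
}
```

Here toy_dispatch(1, 2, 3, 0) calls toy_add through the table, and any number
past the end of the table yields -ENOSYS.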
In newer Linux kernels, sys_call_table is generated by a shell script from a
table file (my VM kernel version is 3.12, so I use that version here):

  http://lxr.free-electrons.com/source/arch/x86/syscalls/syscalltbl.sh?v=3.12
  http://lxr.free-electrons.com/source/arch/x86/syscalls/syscall_64.tbl?v=3.12
  http://lxr.free-electrons.com/source/arch/x86/syscalls/syscall_32.tbl?v=3.12

The generated syscall header file is pulled in by an #include near the bottom of
  http://lxr.free-electrons.com/source/arch/x86/kernel/syscall_64.c?v=3.12

For 4.* kernels, these files have changed location, and are now under
  http://lxr.free-electrons.com/source/arch/x86/entry/
and
  http://lxr.free-electrons.com/source/arch/x86/entry/syscalls/

Exercises:

Find the system call in the Linux source code (e.g., at
http://lxr.free-electrons.com/source/) and convince yourself that it's indeed
execve(). Find the kernel function that implements it.

Repeat the compilation and interpret the results with other GCC options:
-O (optimization level), -fomit-frame-pointer, -fPIC, etc. See 'man gcc' if in
doubt -- there are many kinds of options.
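
As a starting point for experiments, the system call walked through above can
also be issued by number from C, using the generic syscall() wrapper instead
of execve(). This is a sketch under assumptions: raw_execve is a hypothetical
helper name, and it relies on Linux with glibc, where <sys/syscall.h> provides
SYS_execve.

```c
#define _GNU_SOURCE          /* for the syscall() declaration */
#include <sys/syscall.h>     /* SYS_execve */
#include <unistd.h>          /* syscall() */

/* Hypothetical helper: issue execve by number, bypassing the execve()
   wrapper. Returns -1 with errno set on failure; does not return on success. */
long raw_execve(const char *path, char *const argv[], char *const envp[])
{
    return syscall(SYS_execve, path, argv, envp);
}
```

On x86-64 Linux, SYS_execve expands to 59 (hex 3b), so
raw_execve("/bin/ls", args, environ) should behave like the execv() call in
exec.c. Disassembling a static build in gdb shows how syscall() loads the
number into %eax.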