Args and Env Vars in Assembler
26 April 2020
Where do a Linux program's arguments and environmental variables come from? That is, where are they in the process's memory when the process starts up?
The System V Application Binary Interface AMD64 Architecture Processor Supplement, in its section on process initialization, shows what a Linux program's memory looks like when it starts up.
It turns out that when a Linux process starts up, its stack is not empty! In fact, the stack has pointers to all of the arguments and environmental variables, like so:
    top of memory (highest addresses)
    "bottom" of stack (stack grows down towards lower addresses)
    ┌───────────────────────────────────────────────────────────┐
    │                                                           │
    │ other stuff                                               │
    │                                                           │
    ├───────────────────────────────────────────────────────────┤
    │ 64-bit zero                                               │
    ├───────────────────────────────────────────────────────────┤
    │ pointer to env var N (address of first char of env var N) │
    ├───────────────────────────────────────────────────────────┤
      ...
    ├───────────────────────────────────────────────────────────┤
    │ pointer to env var 3 (address of first char of env var 3) │
    ├───────────────────────────────────────────────────────────┤
    │ pointer to env var 2 (address of first char of env var 2) │
    ├───────────────────────────────────────────────────────────┤
    │ pointer to env var 1 (address of first char of env var 1) │
    ├───────────────────────────────────────────────────────────┤
    │ 64-bit zero                                               │
    ├───────────────────────────────────────────────────────────┤
    │ pointer to arg N (address of first char of arg N)         │
    ├───────────────────────────────────────────────────────────┤
      ...
    ├───────────────────────────────────────────────────────────┤
    │ pointer to arg 3 (address of first char of arg 3)         │
    ├───────────────────────────────────────────────────────────┤
    │ pointer to arg 2 (address of first char of arg 2)         │
    ├───────────────────────────────────────────────────────────┤
    │ pointer to arg 1 (address of first char of arg 1)         │
    ├───────────────────────────────────────────────────────────┤
    │ pointer to program name / arg 0 (address of first char)   │
    ├───────────────────────────────────────────────────────────┤
    │ number of args (as a 64-bit integer)                      │
    └───────────────────────────────────────────────────────────┘
    "top" of stack (stack grows down towards lower addresses)
Of course, any modern programming language has access to these arguments and environmental variables. In Go, you get the arguments through os.Args, and the environmental variables through os.Getenv("SOME_VAR").
I even learned that in C, the main function can have this signature to get access to the environmental variables!
int main(int argc, char *argv[], char * envp[])
Of course, if we really want to prove that the stack is laid out this way at program startup, we can write a small assembler program that reads the arguments and environmental variables straight off the stack.
Here's a Linux/amd64 assembler program that does just that.
Build the program like so:
    $ as --gstabs args_and_env.s -o args_and_env.o
    $ ld args_and_env.o -o args_and_env
And the code itself (with a legend for the Linux/amd64 calling conventions, which I can never remember):
    # LEGEND
    # ------
    # Linux/amd64 C calling convention:
    #   args: RDI, RSI, RDX, RCX, R8, R9
    #   return value in RAX
    #   destroyed registers: all args! (RDI, RSI, RDX, RCX, R8, R9), plus RAX, R10, R11
    #
    # Linux/amd64 Syscall convention:
    #   syscall number in RAX
    #   args: RDI, RSI, RDX, R10, R8, R9
    #   syscall return value in RAX
    #   destroyed registers: RCX and R11

    .section .data

    newline:
        .ascii "\n\0"
    program_label:
        .ascii "Program: \0"
    arg_label:
        .ascii "Arg: \0"
    env_label:
        .ascii "Env: \0"

    .section .text

    .globl _start
    _start:
        # Save the stack pointer to the base pointer so that if we have to push/pop
        # things from the stack in _start, we can still access the args and env vars
        # relative to %rbp.
        movq %rsp, %rbp

        # Print the word "Program: " with no newline
        movq $program_label, %rdi
        call print_string

        # 8(%rbp), the second item from the top of the stack, contains a pointer
        # to the program name
        movq 8(%rbp), %rdi
        call print_string

        movq $newline, %rdi
        call print_string

        # Start the index at 1: quad word 0 is ARGC and quad word 1 is a pointer
        # to the program name. The quad words after that (deeper in the stack)
        # are pointers to args, terminated by a quad word that is zero.
        movq $1, %rsi

    arg_loop:
        # Increment %rsi so that it indexes one element deeper in the stack
        # (the pointer to the next arg). Even on the first pass we can do this,
        # because index 1 is the program name, which we want to skip over to get
        # either the pointer to the first arg, or a null pointer if there are
        # no args and we have hit the end of the arg list.
        incq %rsi

        # Get the value that is offset 8 bytes * %rsi from the top of the stack (%rbp);
        # that is, get the pointer to the next arg (or a null pointer, terminating the args).
        movq (%rbp, %rsi, 8), %rdi

        # Is this a null pointer (is %rdi zero)?
        cmpq $0, %rdi

        # If it is a null pointer, we are done; go on to print env vars.
        je env_loop

        # If we get this far, %rdi has a pointer to the beginning of an arg.
        # A C-style func is allowed to clobber %rsi and %rdi, so push them both
        # onto the stack to save their current values for use later.
        pushq %rsi
        pushq %rdi

        movq $arg_label, %rdi
        call print_string

        # Get the saved value of %rdi (pointer to an arg) from the stack and put it
        # back into %rdi so that the next call to print_string prints the arg.
        popq %rdi
        call print_string

        movq $newline, %rdi
        call print_string

        # Now that our C-style function calls that have clobbered %rsi
        # are done, pop the saved value of %rsi back into %rsi.
        popq %rsi

        # Note that %rdi has likely been clobbered again, but we don't care because
        # we are going to overwrite it anyway at the top of this loop.
        jmp arg_loop

    env_loop:
        # Increment %rsi so that it indexes one element deeper in the stack
        # (the pointer to the next env var). When we first enter this loop,
        # %rsi indexes the null pointer marking the end of the args,
        # so we want to advance past that.
        incq %rsi

        # Get the value that is offset 8 bytes * %rsi from the top of the stack (%rbp);
        # that is, get the pointer to the next env var (or a null pointer,
        # terminating the env vars).
        movq (%rbp, %rsi, 8), %rdi

        # Is this a null pointer (is %rdi zero)?
        cmpq $0, %rdi

        # If it is a null pointer, we are done; quit the program.
        je quit_program

        # If we get this far, %rdi has a pointer to the beginning of an env var.
        # A C-style func is allowed to clobber %rsi and %rdi, so push them both
        # onto the stack to save their current values for use later.
        pushq %rsi
        pushq %rdi

        movq $env_label, %rdi
        call print_string

        # Get the saved value of %rdi (pointer to an env var) from the stack and put
        # it back into %rdi so that the next call to print_string prints the env var.
        popq %rdi
        call print_string

        movq $newline, %rdi
        call print_string

        # Now that our C-style function calls that have clobbered %rsi
        # are done, pop the saved value of %rsi back into %rsi.
        popq %rsi

        # Note that %rdi has likely been clobbered again, but we don't care because
        # we are going to overwrite it anyway at the top of this loop.
        jmp env_loop

    quit_program:
        movq $60, %rax      # 60 is the exit syscall
        movq $0, %rdi       # 0 is the status number we will return, first arg to exit syscall
        syscall

    .type strlen, @function
    strlen:
        pushq %rbp          # save the old base pointer
        movq %rsp, %rbp     # make the stack pointer the base pointer

        # Start the accumulator (and, usefully, our return register) at 0.
        # We will accumulate the byte count in %rax.
        movq $0, %rax

        # Zero out %rcx before we use it. We will be storing each byte we read
        # into %cl, the lowest byte of the 8-byte %rcx register.
        movq $0, %rcx

        # arg1 is %rdi, which is the address of the first byte of the
        # null-terminated string.
        movb (%rdi), %cl    # load the first byte of the string into %cl
        cmpb $0, %cl
        je end_strlen

    strlen_loop:
        # add one to the byte count
        incq %rax
        # move our pointer one byte ahead as well
        incq %rdi
        # load the pointed-to byte of the string into %cl
        movb (%rdi), %cl
        # check to see if the pointed-to byte is zero
        cmpb $0, %cl
        # if the byte was zero, jump to the end
        je end_strlen
        # if the byte was not zero, jump to the top of the loop again
        jmp strlen_loop

    end_strlen:
        leave
        ret

    .type print_string, @function
    print_string:
        pushq %rbp          # save the old base pointer
        movq %rsp, %rbp     # make the stack pointer the base pointer

        # %rdi has the pointer to the first byte of our string.
        # Save it on the stack just in case strlen modifies it.
        pushq %rdi

        # Get the length of the string we want to print.
        # Conveniently, %rdi already has the address of the first byte
        # of the string we need to get the length of, so we can call strlen
        # without even having to set up %rdi again.
        call strlen

        # After the call to strlen, %rax has the length of the string.
        # It is the third arg to the write system call, going into %rdx.
        movq %rax, %rdx

        # After the call to strlen, %rdi may have been modified, so pop
        # its original value back off the stack but put it into %rsi, the second
        # argument to the write system call (pointer to first char of string).
        popq %rsi

        movq $1, %rax       # 1 is the write system call
        movq $1, %rdi       # 1 is the stdout file descriptor, first arg to write syscall
        # %rsi already contains pointer to first char of string
        # %rdx already contains the length of the string we want to print
        syscall

        # return 0 from this function (could also consider returning bytes written)
        movq $0, %rax

        leave
        ret
Here's what a sample run looks like:
    $ ./args_and_env foo bar baz
    Program: ./args_and_env
    Arg: foo
    Arg: bar
    Arg: baz
    Env: CLUTTER_IM_MODULE=xim
    Env: LS_COLORS=di=34:fi=0:ln=35:pi=31:so=31:bd=31:cd=31:or=41:mi=41:ex=32
    Env: XDG_MENU_PREFIX=gnome-
    Env: LANG=en_US.UTF-8
    ...