Args and Env Vars in Assembler

26 April 2020

Where do a Linux program's arguments and environmental variables come from? That is, where are they in the process's memory when the process starts up?

There is a document called System V Application Binary Interface AMD64 Architecture Processor Supplement whose section on process initialization shows what a Linux program's stack looks like when it starts up.

It turns out that when a Linux process starts up, its stack is not empty! In fact, the stack has pointers to all of the arguments and environmental variables, like so:

                top of memory (highest addresses)
"bottom" of stack (stack grows down towards lower addresses)
┌───────────────────────────────────────────────────────────┐
│                                                           │

                         other stuff

│                                                           │
├───────────────────────────────────────────────────────────┤
│                        64-bit zero                        │
├───────────────────────────────────────────────────────────┤
│ pointer to env var N (address of first char of env var N) │
├───────────────────────────────────────────────────────────┤
                            ...
├───────────────────────────────────────────────────────────┤
│ pointer to env var 3 (address of first char of env var 3) │
├───────────────────────────────────────────────────────────┤
│ pointer to env var 2 (address of first char of env var 2) │
├───────────────────────────────────────────────────────────┤
│ pointer to env var 1 (address of first char of env var 1) │
├───────────────────────────────────────────────────────────┤
│                        64-bit zero                        │
├───────────────────────────────────────────────────────────┤
│     pointer to arg N (address of first char of arg N)     │
├───────────────────────────────────────────────────────────┤
                            ...
├───────────────────────────────────────────────────────────┤
│     pointer to arg 3 (address of first char of arg 3)     │
├───────────────────────────────────────────────────────────┤
│     pointer to arg 2 (address of first char of arg 2)     │
├───────────────────────────────────────────────────────────┤
│     pointer to arg 1 (address of first char of arg 1)     │
├───────────────────────────────────────────────────────────┤
│  pointer to program name / arg 0 (address of first char)  │
├───────────────────────────────────────────────────────────┤
│            number of args (as a 64-bit integer)           │
└───────────────────────────────────────────────────────────┘
  "top" of stack (stack grows down towards lower addresses)

Of course, any modern programming language gives you access to these arguments and environmental variables. In Go, you get the arguments through os.Args, and the environmental variables through os.Getenv("SOME_VAR").

I even learned that in C, the main function can have this signature to get access to the environmental variables!

int main(int argc, char *argv[], char * envp[])

If we really want to confirm that the stack is laid out this way at program startup, we can write a small assembler program that walks the stack and prints what it finds.

Here's a Linux/amd64 assembler program that does just that.

Build the program like so:

$ as --gstabs args_and_env.s -o args_and_env.o
$ ld args_and_env.o -o args_and_env

And the code itself (with a legend for the Linux/amd64 calling conventions, which I can never remember):

# LEGEND
# ------
# Linux/amd64 C calling convention:
#   args: RDI, RSI, RDX, RCX, R8, R9
#   return value in RAX
#   destroyed (caller-saved) registers: all args (RDI, RSI, RDX, RCX, R8, R9), plus RAX, R10, R11
#
# Linux/amd64 Syscall convention:
#   syscall number in RAX
#   args: RDI, RSI, RDX, R10, R8, R9
#   syscall return value in RAX
#   destroyed registers: RCX and R11

.section .data

newline:
.ascii "\n\0"

program_label:
.ascii "Program: \0"

arg_label:
.ascii "Arg: \0"

env_label:
.ascii "Env: \0"

.section .text
.globl _start
_start:

# Save the stack pointer to the base pointer so that if we have to push/pop
# things from the stack in _start, we can still access the args and env vars
# relative to %rbp
movq %rsp, %rbp

# Print the word "Program: " with no newline
movq $program_label, %rdi
call print_string

# 8(%rbp), the second item from the top of the stack, contains a pointer to the program name
movq 8(%rbp), %rdi
call print_string
movq $newline, %rdi
call print_string

# We start %rsi at 1 because the first quad word (index 0) is argc and the second
# quad word (index 1) is a pointer to the program name. Quad words after that (deeper
# in the stack, at higher addresses) are pointers to args, terminated by a quad word that is zero.
movq $1, %rsi
arg_loop:
  # Increment %rsi so that we can point one element deeper in the stack (the pointer to the next arg).
  # Even on the first loop we can do this, because the first pointer on the stack is the program name,
  # which we want to skip over to get either the pointer to the first arg, or a null pointer if there are no args
  # and we have hit the end of the arg list.
  incq %rsi
  # Load the quad word that is offset 8 bytes * %rsi from the top of the stack (%rbp);
  # That is, get the pointer to the next arg (or a null pointer, terminating the args).
  movq (%rbp, %rsi, 8), %rdi
  # Is this a null pointer (is %rdi zero)?
  cmpq $0, %rdi
  # If it is a null pointer, we are done; go on to print env vars.
  je env_loop
  # If we get this far, %rdi has a pointer to the beginning of an arg.
  # A C-style func is allowed to clobber %rsi and %rdi, so push them both
  # onto the stack to save their current values for use later.
  pushq %rsi
  pushq %rdi
  movq $arg_label, %rdi
  call print_string
  # Get the saved value of %rdi (pointer to an arg) from the stack and put it back into %rdi
  # so that the next call to print_string prints the arg.
  popq %rdi
  call print_string
  movq $newline, %rdi
  call print_string
  # Now that our C-style function calls that have clobbered %rsi
  # are done, pop the saved value of %rsi back into %rsi.
  popq %rsi
  # Note that %rdi has likely been clobbered again, but we don't care because we are
  # going to overwrite it anyway at the top of this loop.
  jmp arg_loop

env_loop:
  # Increment %rsi so that we can point one element deeper in the stack (the pointer to the next env var).
  # When we first enter this loop, %rsi indexes the null pointer delineating the end of args,
  # so we want to advance past that.
  incq %rsi
  # Load the quad word that is offset 8 bytes * %rsi from the top of the stack (%rbp);
  # That is, get the pointer to the next env var (or a null pointer, terminating the env vars).
  movq (%rbp, %rsi, 8), %rdi
  # Is this a null pointer (is %rdi zero)?
  cmpq $0, %rdi
  # If it is a null pointer, we are done; quit the program.
  je quit_program

  # If we get this far, %rdi has a pointer to the beginning of an env var.
  # A C-style func is allowed to clobber %rsi and %rdi, so push them both
  # onto the stack to save their current values for use later.
  pushq %rsi
  pushq %rdi
  movq $env_label, %rdi
  call print_string
  # Get the saved value of %rdi (pointer to an env var) from the stack and put it back into %rdi
  # so that the next call to print_string prints the env var.
  popq %rdi
  call print_string
  movq $newline, %rdi
  call print_string
  # Now that our C-style function calls that have clobbered %rsi
  # are done, pop the saved value of %rsi back into %rsi.
  popq %rsi
  # Note that %rdi has likely been clobbered again, but we don't care because we are
  # going to overwrite it anyway at the top of this loop.
  jmp env_loop

quit_program:
  movq $60, %rax # 60 is the exit syscall
  movq $0, %rdi  # 0 is the status number we will return, first arg to exit syscall
  syscall

.type strlen, @function
strlen:
  pushq %rbp            # save the old base pointer
  movq %rsp, %rbp       # make the stack pointer the base pointer

  # start the accumulator (and, usefully, our return register) at 0
  # We will accumulate the byte count in %rax.
  movq $0, %rax

  # zero out %rcx before we use it. We will be storing each byte we read
  # into %cl, the lowest byte of the 8-byte %rcx register.
  movq $0, %rcx

  # arg1 is rdi, which is the address of the first byte of the null-terminated string.
  movb (%rdi), %cl # load the first byte of the string into %cl

  cmpb $0, %cl
  je end_strlen

strlen_loop:
  # add one to the byte count
  incq %rax

  # move our pointer one byte ahead as well
  incq %rdi

  # load the pointed-to byte of the string into %cl
  movb (%rdi), %cl

  # check to see if the pointed-to byte is zero
  cmpb $0, %cl

  # if the byte was zero, jump to the end
  je end_strlen

  # if the byte was not zero, jump to the top of the loop again
  jmp strlen_loop

end_strlen:
  leave
  ret

.type print_string, @function
print_string:
  pushq %rbp            # save the old base pointer
  movq %rsp, %rbp       # make the stack pointer the base pointer

  # %rdi has the pointer to the first byte of our string.
  # Save it on stack just in case strlen modifies it.
  pushq %rdi

  # Get the length of the string we want to print.
  # conveniently, %rdi already has the address of the first byte
  # of the string we need to get the length of, so we can call strlen
  # without even having to set up %rdi again.
  call strlen

  # After the call to strlen, %rax has the length of the string.
  # It is the third arg to the write system call, going into %rdx.
  movq %rax, %rdx

  # After the call to strlen, %rdi may have been modified, so pop
  # its original value back off the stack but put it into %rsi, the second
  # argument to the write system call (pointer to first char of string).
  popq %rsi

  movq $1, %rax        # 1 is the write system call
  movq $1, %rdi        # 1 is the stdout file descriptor, first arg to write syscall
  # %rsi                 already contains pointer to first char of string
  # %rdx                 already contains the length of the string we want to print
  syscall

  # return 0 from this function (could also consider returning bytes written)
  movq $0, %rax

  leave
  ret

Here's what a sample run looks like:

$ ./args_and_env foo bar baz
Program: ./args_and_env
Arg: foo
Arg: bar
Arg: baz
Env: CLUTTER_IM_MODULE=xim
Env: LS_COLORS=di=34:fi=0:ln=35:pi=31:so=31:bd=31:cd=31:or=41:mi=41:ex=32
Env: XDG_MENU_PREFIX=gnome-
Env: LANG=en_US.UTF-8
...