As we all know, the function is an important concept in programming. At the moment, I don't
know of any programming language that can work without functions (maybe because I am new;
maybe some special languages can?). Either way, the concept is clearly necessary. Now, the question
is: how can we realize this concept in assembly language / machine code?
1. Overview
Three issues will be explained with this example:
a). How can we call a function?
b). How can we build a stack frame for local variables?
c). How can we pass parameters between caller and callee?
#include <stdio.h>

/*
 * This is an empty function; it does nothing. We build it to show how a function is called.
 */
void call()
{
}

/*
 * Explains how a stack frame is built.
 */
void frame()
{
    int b;
}

/*
 * Explains parameter passing and returning a result.
 */
int parameters(int a, int b, int c)
{
    int sum;
    sum = a + b + c;
    return sum;
}

int main()
{
    int ret;
    call();
    frame();
    ret = parameters(1, 2, 3);
    return 0;
}
To discuss these issues, we need to translate the program into a lower-level language: assembly. In this example I use gcc on Linux, which can produce the assembly code for us. (Actually, there is a catch here: different compilers
may use different conventions, or even different Application Binary Interfaces, but they still have some features in common.) The corresponding assembly code is:
......
call:
    pushl   %ebp
    movl    %esp, %ebp
    popl    %ebp
    ret
......
frame:
    pushl   %ebp
    movl    %esp, %ebp
    subl    $16, %esp
    leave
    ret
......
parameters:
    pushl   %ebp
    movl    %esp, %ebp
    subl    $16, %esp
    movl    12(%ebp), %eax
    addl    8(%ebp), %eax
    addl    16(%ebp), %eax
    movl    %eax, -4(%ebp)
    movl    -4(%ebp), %eax
    leave
    ret
......
main:
    leal    4(%esp), %ecx
    andl    $-16, %esp
    pushl   -4(%ecx)
    pushl   %ebp
    movl    %esp, %ebp
    pushl   %ecx
    subl    $28, %esp
    call    call
    call    frame
    movl    $3, 8(%esp)
    movl    $2, 4(%esp)
    movl    $1, (%esp)
    call    parameters
    movl    %eax, -8(%ebp)
    movl    $0, %eax
    addl    $28, %esp
    popl    %ecx
    popl    %ebp
    leal    -4(%ecx), %esp
    ret
......
(Be careful: this is AT&T syntax, so the source operand comes first and the destination second.)
2. How can we call a function?
From the machine's point of view, calling a function means changing the instruction stream. That
seems simple, but there is another problem: how do we return to the caller's instruction
stream? A valid way is to save the instruction pointer before jumping to the callee.
Now, let us look at this example:
int main()
{
    ...
    call();
    ...
}

void call()
{
}
This is a simple function call. How can we realize it in assembly language? Examine the
corresponding code.
main:
    ......
    call    call
    ......
In @main, the function @call is invoked with the assembly instruction call. This instruction
does the two things that need to be done. One, it saves the current value of register @IP
(which by then points at the instruction after the call) on the stack. Two, it sets @IP to the
address of the callee. In pseudocode (real code cannot name %IP as an operand like this):
pushl %IP;
movl $call, %IP;
call:
    ....
    ret
When the subroutine completes, the next step is to return to the previous instruction
stream. The current state of the stack is:
.... <-- %EBP for caller
....
0xeeee0000 <-- return address
<-- %ESP for caller
So we just need to pop that value from the stack back into @IP, which is what ret does:
pop %IP;
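The two steps of call and the single step of ret can be sketched in C on a toy machine. This is only an illustration under stated assumptions: the struct machine type, the function names machine_init, do_call and do_ret, and the stack size are all hypothetical, and the model tracks nothing except a small stack and an instruction pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model (hypothetical names): a small stack that grows toward
 * lower indices, plus an instruction pointer. */
#define STACK_WORDS 16

struct machine {
    uint32_t stack[STACK_WORDS];
    uint32_t esp;   /* index of the current top-of-stack slot */
    uint32_t ip;    /* address of the next instruction to run */
};

void machine_init(struct machine *m, uint32_t start_ip)
{
    m->esp = STACK_WORDS;   /* empty stack: one past the last slot */
    m->ip = start_ip;
}

/* call: save the return address, then jump to the callee.
 * When call runs, ip already points at the instruction after it. */
void do_call(struct machine *m, uint32_t target)
{
    assert(m->esp > 0);          /* room for one more word */
    m->esp -= 1;
    m->stack[m->esp] = m->ip;    /* pushl %IP */
    m->ip = target;              /* movl $target, %IP */
}

/* ret: pop the saved return address back into ip. */
void do_ret(struct machine *m)
{
    assert(m->esp < STACK_WORDS);    /* stack must not be empty */
    m->ip = m->stack[m->esp];        /* pop %IP */
    m->esp += 1;
}
```

Calling do_call and then do_ret leaves both ip and esp exactly where they were before the call, which is the property the real instructions rely on.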
3. How can we build a stack frame for local variables?
Local variables need one important property: reentrancy. We want the local variables of every
function call to be independent, even when the function is recursive. So we dynamically create an
independent memory area for every function call; this is called a stack frame. Now, examine the code.
int main()
{
    ...
    frame();
    ...
}

void frame()
{
    int b;
}
Before we call @frame, everything is the same as in the example above. The current stack is:
base address for main              <-- %EBP
....
top address of stack of main       <-- %ESP
When we call @frame, the new stack is:
base address for main              <-- %EBP
....
return address of caller
top address of stack of main       <-- %ESP
Then let us examine what the callee does. The assembly is:
frame:
    pushl   %ebp
    movl    %esp, %ebp
    subl    $16, %esp
    leave
    ret
As we can see, it first saves the caller's frame information and then creates a new frame. When we execute the first instruction:
pushl %ebp
the stack is:
bottom of stack of main                        <-- %EBP
....
return address of caller
base address of frame of caller (saved %EBP)
top of stack of main                           <-- %ESP
When we execute the second instruction:
movl %ESP, %EBP;
the new stack is:
stack bottom of main
....
return address of caller
base address of frame of caller (saved %EBP)
stack top of main                              <-- %EBP
stack top of callee                            <-- %ESP
Actually, the stack top of the caller is the stack bottom of the callee. So far we have not allocated any memory for this function's local variables, so the stack bottom and the stack top of the callee are temporarily the same. The next instruction
allocates the space:
subl $16, %esp
As we can see, 16 bytes are allocated, more than the variable needs, for reasons such as memory alignment. We actually use only the first 4 bytes. The new stack is:
stack bottom of main
....
return address of caller
base address of frame of caller (saved %EBP)
stack top of main                              <-- %EBP
local variable b
....
stack top of callee                            <-- %ESP
So far, we have built a valid stack frame for this new function call.
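The reentrancy point from the start of this section can be checked from C itself. The sketch below is not part of the original example; the names record_locals and all_distinct are hypothetical. Each recursive call records the address of its own local variable while all the outer calls are still active, so every recorded address has to be different: each call lives in its own stack frame.

```c
#include <stdint.h>

/* Hypothetical helper: every active call stores the address of its own
 * local variable in addrs[depth], then recurses until max_depth calls
 * are active. Returns how many calls found their local unchanged after
 * the deeper calls returned. The caller must provide at least
 * max_depth slots in addrs. */
int record_locals(int depth, int max_depth, uintptr_t addrs[])
{
    volatile int local = depth;     /* volatile: keep it in memory */
    int deeper = 0;

    addrs[depth] = (uintptr_t)&local;
    if (depth + 1 < max_depth)
        deeper = record_locals(depth + 1, max_depth, addrs);

    /* The deeper calls had their own 'local'; ours is untouched. */
    return (local == depth) + deeper;
}

/* Returns 1 if the first n recorded addresses are pairwise distinct. */
int all_distinct(const uintptr_t addrs[], int n)
{
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            if (addrs[i] == addrs[j])
                return 0;
    return 1;
}
```

For example, with a four-element array, record_locals(0, 4, addrs) followed by all_distinct(addrs, 4) shows four separate copies of the same local variable, one per frame.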
The next question is: how do we restore the caller's frame when the subroutine completes?
That is easy. Recalling the frame above, we just need:
movl %EBP, %ESP;
pop %EBP;
Actually, there is a simpler instruction, leave, which performs both steps.
Now, the stack is:
stack bottom of main               <-- %EBP
....
return address of caller
stack top of main                  <-- %ESP
Everything is back in order: the caller's frame is restored, and ret then pops the return address as described in section 2.
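The frame setup and teardown walked through in this section can be sketched as a small C simulation. It is a toy model under stated assumptions, not what the hardware literally does: struct cpu, push32, pop32, prologue and leave are hypothetical names, memory is a 256-byte array, and registers hold byte offsets into it.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BYTES 256

/* Toy CPU state (hypothetical): stack memory and two registers that
 * hold byte offsets into it. The stack grows toward lower offsets. */
struct cpu {
    uint32_t mem[STACK_BYTES / 4];  /* one 32-bit word per slot */
    uint32_t esp;                   /* offset of the top of the stack */
    uint32_t ebp;                   /* offset of the current frame base */
};

static void push32(struct cpu *c, uint32_t v)
{
    assert(c->esp >= 4);
    c->esp -= 4;
    c->mem[c->esp / 4] = v;
}

static uint32_t pop32(struct cpu *c)
{
    assert(c->esp + 4 <= STACK_BYTES);
    uint32_t v = c->mem[c->esp / 4];
    c->esp += 4;
    return v;
}

/* pushl %ebp; movl %esp, %ebp; subl $locals, %esp */
void prologue(struct cpu *c, uint32_t locals)
{
    push32(c, c->ebp);          /* save the caller's frame base */
    c->ebp = c->esp;            /* callee's frame base = current top */
    assert(c->esp >= locals);
    c->esp -= locals;           /* reserve space for local variables */
}

/* leave is equivalent to: movl %ebp, %esp; popl %ebp */
void leave(struct cpu *c)
{
    c->esp = c->ebp;            /* drop the local variables */
    c->ebp = pop32(c);          /* restore the caller's frame base */
}
```

Running prologue(&c, 16) and then leave(&c) returns esp and ebp to the values they had right after the call instruction pushed the return address, matching the last diagram above.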
4. How can we pass parameters between caller and callee ?
Usually we need to pass parameters when we call a function. Where should we put
that data? Let us look at the example below, which calls a function with several parameters.
int main()
{
    int ret;
    ...
    ret = parameters(1, 2, 3);
    ...
}

int parameters(int a, int b, int c)
{
    int sum;
    sum = a + b + c;
    return sum;
}
The corresponding assembly code is:
main:
    ...
    movl    $3, 8(%esp)
    movl    $2, 4(%esp)
    movl    $1, (%esp)
    call    parameters
    movl    %eax, -8(%ebp)
    ...
parameters:
    pushl   %ebp
    movl    %esp, %ebp
    subl    $16, %esp
    movl    12(%ebp), %eax
    addl    8(%ebp), %eax
    addl    16(%ebp), %eax
    movl    %eax, -4(%ebp)
    movl    -4(%ebp), %eax
    leave
    ret
Now, examine those instructions. Before we call this function, the stack is:
stack bottom of main               <-- %EBP
...
stack top of main                  <-- %ESP
Then we store the three parameters at the top of the stack, with the last parameter at the highest address (the same layout we would get by pushing them in reverse order):
stack bottom of main               <-- %EBP
...
3                                  <-- 3rd parameter
2
1
stack top of main                  <-- %ESP
Then,
call parameters
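This is the same as what we analyzed above: we jump to the new instruction stream and build a new stack frame.
stack bottom of main
...
3                                  <-- 3rd parameter
2
1
return address of caller
base address of frame of caller (saved %EBP)
stack top of main                  <-- %EBP
...
stack top of callee                <-- %ESP
In the subroutine, if we need the parameters, we can simply read them relative to %EBP. The saved %EBP is at (%ebp) and the return address at 4(%ebp), so the first parameter is at 8(%ebp), the second at 12(%ebp) and the third at 16(%ebp). The result goes back to the caller in %EAX, which @main then stores into ret with movl %eax, -8(%ebp).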
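To make the offsets concrete, here is a small C simulation of the caller and callee code shown above. It is a sketch under stated assumptions: struct cpu, load, store, sim_parameters and sim_call_parameters are hypothetical names, memory is a 256-byte array addressed by byte offset, the return address is a placeholder value, and %IP is not modelled at all.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BYTES 256

/* Toy CPU state (hypothetical): registers hold byte offsets into mem. */
struct cpu {
    uint32_t mem[STACK_BYTES / 4];
    uint32_t esp, ebp, eax;
};

static uint32_t load(const struct cpu *c, uint32_t addr)
{
    assert(addr % 4 == 0 && addr < STACK_BYTES);
    return c->mem[addr / 4];
}

static void store(struct cpu *c, uint32_t addr, uint32_t v)
{
    assert(addr % 4 == 0 && addr < STACK_BYTES);
    c->mem[addr / 4] = v;
}

/* Callee: mirrors the assembly of 'parameters' line by line. */
static void sim_parameters(struct cpu *c)
{
    c->esp -= 4; store(c, c->esp, c->ebp);  /* pushl %ebp */
    c->ebp = c->esp;                        /* movl %esp, %ebp */
    c->esp -= 16;                           /* subl $16, %esp */

    c->eax  = load(c, c->ebp + 12);         /* movl 12(%ebp), %eax  (b) */
    c->eax += load(c, c->ebp + 8);          /* addl 8(%ebp), %eax   (a) */
    c->eax += load(c, c->ebp + 16);         /* addl 16(%ebp), %eax  (c) */
    store(c, c->ebp - 4, c->eax);           /* movl %eax, -4(%ebp)  (sum) */
    c->eax  = load(c, c->ebp - 4);          /* movl -4(%ebp), %eax */

    c->esp = c->ebp;                        /* leave: movl %ebp, %esp */
    c->ebp = load(c, c->esp); c->esp += 4;  /*        popl %ebp */
    c->esp += 4;                            /* ret: pop the return address */
}

/* Caller: stores the arguments, "calls", and reads the result from eax. */
uint32_t sim_call_parameters(struct cpu *c, uint32_t a, uint32_t b, uint32_t d)
{
    store(c, c->esp + 8, d);                /* movl $3, 8(%esp) */
    store(c, c->esp + 4, b);                /* movl $2, 4(%esp) */
    store(c, c->esp, a);                    /* movl $1, (%esp) */
    c->esp -= 4;
    store(c, c->esp, 0xeeee0000u);          /* call: push placeholder return address */
    sim_parameters(c);
    return c->eax;                          /* result comes back in %eax */
}
```

With esp and ebp initialised to any valid, word-aligned offsets that leave room below and above, sim_call_parameters(&c, 1, 2, 3) yields 6 and leaves esp and ebp unchanged, just as @main expects after the real call.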