x86 ABI Analysis (How Functions Are Implemented) -- Part 2

As we all know, the function is an important concept in programming. At the moment, I can't even think of a programming language that works without functions (maybe because I am new to this; perhaps some special language can). Either way, the concept is clearly necessary. So the question is: how can we implement it in assembly language or machine code?

1. Overview

This example is used to explain three issues.

a). How can we call a function?

b). How can we build a stack frame for local variables?

c). How can we pass parameters between caller and callee?

#include <stdio.h>
/*
*    This is an empty function that does nothing. It only shows how a function is called.
*/
void call()
{ }
/*
*    Shows how a stack frame is built.
*/
void frame()
{
        int b;
}
/*
*    Shows how parameters are passed and how a result is returned.
*/
int parameters( int a, int b, int c)
{
        int sum;
        sum = a + b + c;
        return sum;
}

int main()
{
        int ret;
        call( );
        frame( );
        ret = parameters(1,2,3);
        return 0;
}

To discuss these issues, we need to translate the program into a lower-level language: assembly language. In this example I use gcc on Linux, which can emit the assembly code for us (for example with its -S option). (Actually, there is a catch here: different compilers may use different conventions, or even different Application Binary Interfaces, but they still share some common features.) The corresponding assembly code is:

        ......
        call:
                pushl  %ebp
                movl  %esp, %ebp
                popl %ebp
                ret
        ......
        frame:
                pushl  %ebp
                movl  %esp, %ebp
                subl   $16, %esp
                leave
                ret
        ......
        parameters:
                pushl  %ebp
                movl  %esp, %ebp
                subl  $16, %esp
                movl  12(%ebp), %eax
                addl  8(%ebp), %eax
                addl  16(%ebp), %eax
                movl  %eax, -4(%ebp)
                movl  -4(%ebp), %eax
                leave
                ret
        ......
        main:
                leal 4(%esp), %ecx
                andl $-16, %esp
                pushl  -4(%ecx)
                pushl  %ebp
                movl  %esp, %ebp
                pushl  %ecx
                subl  $28, %esp
                call  call
                call  frame
                movl  $3, 8(%esp)
                movl  $2, 4(%esp)
                movl  $1, (%esp)
                call  parameters
                movl  %eax, -8(%ebp)
                movl  $0, %eax
                addl  $28, %esp
                popl  %ecx
                popl  %ebp
                leal  -4(%ecx), %esp
                ret
        ......

(Be careful: this is AT&T syntax, so the source operand comes first and the destination second.)

2. How can we call a function?

From the machine's point of view, calling a function means changing the instruction stream. That sounds simple, but there is another problem: how do we return to the caller's instruction stream? A workable way is to save the instruction pointer before jumping to the callee.

Now, let us look at this example:

        int main()
        {
                ...
                call( );
                ...
        }
        void call()
        { }

This is a simple function call. How can we implement it in assembly language? Examine the corresponding code.

        main:
                ......
                call  call
                ......

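In @main, the function @call is invoked with the assembly instruction call. This instruction does the two things that need to be done. First, it saves the current value of the instruction pointer register @IP (which already holds the address of the instruction after the call) on the stack. Second, it sets @IP to the address of the callee. Written as pseudo-instructions (x86 does not allow @IP to be used as an ordinary operand, so these are not real instructions):

        pushl %IP;

        movl  $call, %IP;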

        call:
                ....
                ret

When the subroutine is finished, the next step is to return to the previous instruction stream. At this point the stack looks like this:

        ....                          <-- %EBP of caller
        ....
        0xeeee0000 (return address)   <-- %ESP

So we just need to pop the return address off the stack into @IP, which is what ret does:

        pop %IP;

3. How can we build a stack frame for local variables?

Local variables need one important property: reentrancy. We want the local variables of each function call to be independent, even when the function is recursive. So we dynamically create a separate memory area for every function call; this area is called a stack frame. Now, examine the code.

        int main()
        {
            ...
            frame( );
            ...
        }
        void frame()
        {
            int b;
        }

Before we call @frame, everything is the same as in the example above. The current stack is

        base address for main         <-- %EBP
        ....
        top address of stack of main  <-- %ESP

When @frame is called, the call instruction pushes the return address, and the new stack is

        base address for main         <-- %EBP
        ....
        top address of stack of main
        return address of caller      <-- %ESP

Next, let us examine what the callee does. Its assembly code is:

        frame:
               pushl  %ebp
               movl  %esp, %ebp
               subl  $16, %esp
               leave
               ret

As we can see, it first saves the caller's frame information and then creates a new frame. When the first instruction is executed:

               pushl  %ebp

the stack is

        bottom of stack of main                <-- %EBP
        ....
        return address of caller
        saved base address of caller's frame   <-- %ESP

When the second instruction is executed:

              movl  %esp, %ebp

the stack is:

        stack bottom of main
        ....
        return address of caller
        saved base address of caller's frame   <-- %EBP, %ESP

The slot holding the caller's saved base address is now the bottom of the callee's frame. So far we have not allocated any memory for this function's local variables, so for the moment the bottom and the top of the callee's stack are the same. The next instruction allocates the space:

               subl  $16, %esp

As we can see, 16 bytes are allocated, more than an int needs, for reasons such as memory alignment. The function actually uses only the first 4 bytes. The new stack is:

        stack bottom of main
        ....
        return address of caller
        saved base address of caller's frame   <-- %EBP
        local variable b
        ....
        stack top of callee                    <-- %ESP

We have now built a valid stack frame for this function call.

The next question is: how do we restore the caller's frame when the subroutine is finished?

That is easy. Looking at the frame above, we just need:

            movl %ebp, %esp
            popl %ebp

In fact there is a simpler instruction, leave, which performs both steps.

Now the stack is:

        stack bottom of main             <-- %EBP
        ....
        stack top of main
        return address of caller         <-- %ESP

The following ret then pops the return address, and everything is back as it was before the call.

4. How can we pass parameters between caller and callee?

We usually need to pass parameters when we call a function. Where should this data be placed? Let us look at the example below, which calls a function with several parameters.

        int main()
        {
            int ret;
            ...
             ret = parameters(1,2,3);
             ...
        }
        int parameters( int a, int b, int c)
        {
            int sum;
            sum = a + b + c;
            return sum;
        }

The corresponding assembly code is:

        main:
            ...
            movl $3, 8(%esp)
            movl $2, 4(%esp)
            movl $1, (%esp)
            call parameters
            movl %eax, -8(%ebp)
            ...
       parameters:
            pushl %ebp
            movl  %esp, %ebp
            subl  $16, %esp
            movl  12(%ebp), %eax
            addl  8(%ebp), %eax
            addl  16(%ebp), %eax
            movl  %eax, -4(%ebp)
            movl  -4(%ebp), %eax
            leave
            ret

Now, examine these instructions. Before we call the function, the stack is:

            stack bottom of main  <-- %EBP
            ...
            stack top of main     <-- %ESP

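Then @main stores the three parameters at the top of its stack. The effect is the same as pushing them in reverse order:

            stack bottom of main  <-- %EBP
            ...
            3                     <-- 3rd parameter, 8(%esp)
            2                     <-- 2nd parameter, 4(%esp)
            1                     <-- 1st parameter, (%esp); %ESP points here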

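Then,

         call parameters

This works the same way as analyzed above: we jump to the new instruction stream and build a new stack frame.

            stack bottom of main
            ...
            3                        <-- 3rd parameter, 16(%ebp)
            2                        <-- 2nd parameter, 12(%ebp)
            1                        <-- 1st parameter, 8(%ebp)
            return address of caller <-- 4(%ebp)
            saved base address of caller's frame   <-- %EBP
            sum                      <-- -4(%ebp)
            ...
            stack top of callee      <-- %ESP

In the subroutine, whenever a parameter is needed, it can simply be read at a fixed offset from %EBP: 8(%ebp) is a, 12(%ebp) is b and 16(%ebp) is c. The result is handed back to the caller in %eax, which @main then stores into ret with movl %eax, -8(%ebp).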
