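Introduction to the file command
The most common use of file is to check which platform an executable targets: ARM, x86, or MIPS? One look at the output tells you.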
$ file a.out
a.out: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=0xa240b1958136fc294a6ee5833de2a0fc8c9e0bd4, not stripped
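The fields to read, in order: the ELF class (32-bit or 64-bit; this is the word size of the binary, not of the operating system), the byte order (LSB is little endian, MSB is big endian), the file type (executable, relocatable, or shared object, i.e. a dynamic library), the instruction set (x86-64, Intel 80386, MIPS, ARM), and whether the symbol table has been stripped (release builds usually strip it, which shrinks the file and makes disassembly harder to follow).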
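Using objdump, with a test analysis (x86-64 Ubuntu)
Structure of ELF files on Linux (executables, shared libraries, and relocatable object files; a static library is an ar archive of relocatable files):
- ELF file header (the a.out examined in this section is a 32-bit i386 build, as the Class and Machine fields show)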
$ readelf -h a.out
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Intel 80386
Version: 0x1
Entry point address: 0x8048320
Start of program headers: 52 (bytes into file)
Start of section headers: 4960 (bytes into file)
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 9
Size of section headers: 40 (bytes)
Number of section headers: 36
Section header string table index: 33
- SHT (section header table): a table that describes every section the ELF file contains
$ readelf -S a.out
There are 36 section headers, starting at offset 0x1360:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .interp PROGBITS 08048154 000154 000013 00 A 0 0 1
[ 2] .note.ABI-tag NOTE 08048168 000168 000020 00 A 0 0 4
[ 3] .note.gnu.build-i NOTE 08048188 000188 000024 00 A 0 0 4
[ 4] .gnu.hash GNU_HASH 080481ac 0001ac 000020 04 A 5 0 4
[ 5] .dynsym DYNSYM 080481cc 0001cc 000050 10 A 6 1 4
[ 6] .dynstr STRTAB 0804821c 00021c 00004a 00 A 0 0 1
[ 7] .gnu.version VERSYM 08048266 000266 00000a 02 A 5 0 2
[ 8] .gnu.version_r VERNEED 08048270 000270 000020 00 A 6 1 4
[ 9] .rel.dyn REL 08048290 000290 000008 08 A 5 0 4
[10] .rel.plt REL 08048298 000298 000018 08 A 5 12 4
[11] .init PROGBITS 080482b0 0002b0 00002e 00 AX 0 0 4
[12] .plt PROGBITS 080482e0 0002e0 000040 04 AX 0 0 16
[13] .text PROGBITS 08048320 000320 00017c 00 AX 0 0 16
[14] .fini PROGBITS 0804849c 00049c 00001a 00 AX 0 0 4
[15] .rodata PROGBITS 080484b8 0004b8 000014 00 A 0 0 4
[16] .eh_frame_hdr PROGBITS 080484cc 0004cc 000034 00 A 0 0 4
[17] .eh_frame PROGBITS 08048500 000500 0000c4 00 A 0 0 4
[18] .ctors PROGBITS 08049f14 000f14 000008 00 WA 0 0 4
[19] .dtors PROGBITS 08049f1c 000f1c 000008 00 WA 0 0 4
[20] .jcr PROGBITS 08049f24 000f24 000004 00 WA 0 0 4
[21] .dynamic DYNAMIC 08049f28 000f28 0000c8 08 WA 6 0 4
[22] .got PROGBITS 08049ff0 000ff0 000004 04 WA 0 0 4
[23] .got.plt PROGBITS 08049ff4 000ff4 000018 04 WA 0 0 4
[24] .data PROGBITS 0804a00c 00100c 000008 00 WA 0 0 4
[25] .bss NOBITS 0804a014 001014 000008 00 WA 0 0 4
[26] .comment PROGBITS 00000000 001014 00002a 01 MS 0 0 1
[27] .debug_aranges PROGBITS 00000000 00103e 000020 00 0 0 1
[28] .debug_info PROGBITS 00000000 00105e 00008b 00 0 0 1
[29] .debug_abbrev PROGBITS 00000000 0010e9 00003f 00 0 0 1
[30] .debug_line PROGBITS 00000000 001128 000038 00 0 0 1
[31] .debug_str PROGBITS 00000000 001160 00007e 01 MS 0 0 1
[32] .debug_loc PROGBITS 00000000 0011de 000038 00 0 0 1
[33] .shstrtab STRTAB 00000000 001216 000147 00 0 0 1
[34] .symtab SYMTAB 00000000 001900 000470 10 35 51 4
[35] .strtab STRTAB 00000000 001d70 0001fb 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
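Some commonly used sections:
1) .text holds the executable code;
2) .data holds initialized data;
3) .bss holds uninitialized data (its type is NOBITS, so it takes up no space in the file);
4) sections whose names start with .rel hold relocation entries;
5) .symtab and .dynsym hold symbol information;
6) .strtab and .dynstr hold string data (the names that the symbol tables refer to);
An A in the Flg column marks a section that the process needs at run time, so it is loaded into memory. Sections without A are not loaded, and strip can remove some of them, such as .symtab, .strtab, and the .debug_* sections.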
- section (the smallest container for content inside an ELF file)
Disassemble the code section (.text) of the executable with objdump:
$ objdump -d -j .text a.out
a.out: file format elf32-i386
Disassembly of section .text:
08048320 <_start>:
8048320: 31 ed xor %ebp,%ebp
8048322: 5e pop %esi
8048323: 89 e1 mov %esp,%ecx
8048325: 83 e4 f0 and $0xfffffff0,%esp
8048328: 50 push %eax
8048329: 54 push %esp
804832a: 52 push %edx
804832b: 68 60 84 04 08 push $0x8048460
8048330: 68 f0 83 04 08 push $0x80483f0
8048335: 51 push %ecx
8048336: 56 push %esi
8048337: 68 d4 83 04 08 push $0x80483d4
804833c: e8 cf ff ff ff call 8048310 <__libc_start_main@plt>
8048341: f4 hlt
8048342: 90 nop
8048343: 90 nop
8048344: 90 nop
8048345: 90 nop
8048346: 90 nop
8048347: 90 nop
8048348: 90 nop
8048349: 90 nop
804834a: 90 nop
804834b: 90 nop
804834c: 90 nop
804834d: 90 nop
804834e: 90 nop
804834f: 90 nop
08048350 <__do_global_dtors_aux>:
8048350: 55 push %ebp
8048351: 89 e5 mov %esp,%ebp
8048353: 53 push %ebx
8048354: 83 ec 04 sub $0x4,%esp
8048357: 80 3d 14 a0 04 08 00 cmpb $0x0,0x804a014
804835e: 75 3f jne 804839f <__do_global_dtors_aux+0x4f>
8048360: a1 18 a0 04 08 mov 0x804a018,%eax
8048365: bb 20 9f 04 08 mov $0x8049f20,%ebx
804836a: 81 eb 1c 9f 04 08 sub $0x8049f1c,%ebx
8048370: c1 fb 02 sar $0x2,%ebx
8048373: 83 eb 01 sub $0x1,%ebx
8048376: 39 d8 cmp %ebx,%eax
8048378: 73 1e jae 8048398 <__do_global_dtors_aux+0x48>
804837a: 8d b6 00 00 00 00 lea 0x0(%esi),%esi
8048380: 83 c0 01 add $0x1,%eax
8048383: a3 18 a0 04 08 mov %eax,0x804a018
8048388: ff 14 85 1c 9f 04 08 call *0x8049f1c(,%eax,4)
804838f: a1 18 a0 04 08 mov 0x804a018,%eax
8048394: 39 d8 cmp %ebx,%eax
8048396: 72 e8 jb 8048380 <__do_global_dtors_aux+0x30>
8048398: c6 05 14 a0 04 08 01 movb $0x1,0x804a014
804839f: 83 c4 04 add $0x4,%esp
80483a2: 5b pop %ebx
80483a3: 5d pop %ebp
80483a4: c3 ret
80483a5: 8d 74 26 00 lea 0x0(%esi,%eiz,1),%esi
80483a9: 8d bc 27 00 00 00 00 lea 0x0(%edi,%eiz,1),%edi
080483b0 <frame_dummy>:
80483b0: 55 push %ebp
80483b1: 89 e5 mov %esp,%ebp
80483b3: 83 ec 18 sub $0x18,%esp
80483b6: a1 24 9f 04 08 mov 0x8049f24,%eax
80483bb: 85 c0 test %eax,%eax
80483bd: 74 12 je 80483d1 <frame_dummy+0x21>
80483bf: b8 00 00 00 00 mov $0x0,%eax
80483c4: 85 c0 test %eax,%eax
80483c6: 74 09 je 80483d1 <frame_dummy+0x21>
80483c8: c7 04 24 24 9f 04 08 movl $0x8049f24,(%esp)
80483cf: ff d0 call *%eax
80483d1: c9 leave
80483d2: c3 ret
80483d3: 90 nop
080483d4 <main>:
80483d4: 55 push %ebp
80483d5: 89 e5 mov %esp,%ebp
80483d7: 83 e4 f0 and $0xfffffff0,%esp
80483da: 83 ec 10 sub $0x10,%esp
80483dd: c7 04 24 c0 84 04 08 movl $0x80484c0,(%esp)
80483e4: e8 07 ff ff ff call 80482f0 <puts@plt>
80483e9: b8 00 00 00 00 mov $0x0,%eax
80483ee: c9 leave
80483ef: c3 ret
080483f0 <__libc_csu_init>:
80483f0: 55 push %ebp
80483f1: 57 push %edi
80483f2: 56 push %esi
80483f3: 53 push %ebx
80483f4: e8 69 00 00 00 call 8048462 <__i686.get_pc_thunk.bx>
80483f9: 81 c3 fb 1b 00 00 add $0x1bfb,%ebx
80483ff: 83 ec 1c sub $0x1c,%esp
8048402: 8b 6c 24 30 mov 0x30(%esp),%ebp
8048406: 8d bb 20 ff ff ff lea -0xe0(%ebx),%edi
804840c: e8 9f fe ff ff call 80482b0 <_init>
8048411: 8d 83 20 ff ff ff lea -0xe0(%ebx),%eax
8048417: 29 c7 sub %eax,%edi
8048419: c1 ff 02 sar $0x2,%edi
804841c: 85 ff test %edi,%edi
804841e: 74 29 je 8048449 <__libc_csu_init+0x59>
8048420: 31 f6 xor %esi,%esi
8048422: 8d b6 00 00 00 00 lea 0x0(%esi),%esi
8048428: 8b 44 24 38 mov 0x38(%esp),%eax
804842c: 89 2c 24 mov %ebp,(%esp)
804842f: 89 44 24 08 mov %eax,0x8(%esp)
8048433: 8b 44 24 34 mov 0x34(%esp),%eax
8048437: 89 44 24 04 mov %eax,0x4(%esp)
804843b: ff 94 b3 20 ff ff ff call *-0xe0(%ebx,%esi,4)
8048442: 83 c6 01 add $0x1,%esi
8048445: 39 fe cmp %edi,%esi
8048447: 75 df jne 8048428 <__libc_csu_init+0x38>
8048449: 83 c4 1c add $0x1c,%esp
804844c: 5b pop %ebx
804844d: 5e pop %esi
804844e: 5f pop %edi
804844f: 5d pop %ebp
8048450: c3 ret
8048451: eb 0d jmp 8048460 <__libc_csu_fini>
8048453: 90 nop
8048454: 90 nop
8048455: 90 nop
8048456: 90 nop
8048457: 90 nop
8048458: 90 nop
8048459: 90 nop
804845a: 90 nop
804845b: 90 nop
804845c: 90 nop
804845d: 90 nop
804845e: 90 nop
804845f: 90 nop
08048460 <__libc_csu_fini>:
8048460: f3 c3 repz ret
08048462 <__i686.get_pc_thunk.bx>:
8048462: 8b 1c 24 mov (%esp),%ebx
8048465: c3 ret
8048466: 90 nop
8048467: 90 nop
8048468: 90 nop
8048469: 90 nop
804846a: 90 nop
804846b: 90 nop
804846c: 90 nop
804846d: 90 nop
804846e: 90 nop
804846f: 90 nop
08048470 <__do_global_ctors_aux>:
8048470: 55 push %ebp
8048471: 89 e5 mov %esp,%ebp
8048473: 53 push %ebx
8048474: 83 ec 04 sub $0x4,%esp
8048477: a1 14 9f 04 08 mov 0x8049f14,%eax
804847c: 83 f8 ff cmp $0xffffffff,%eax
804847f: 74 13 je 8048494 <__do_global_ctors_aux+0x24>
8048481: bb 14 9f 04 08 mov $0x8049f14,%ebx
8048486: 66 90 xchg %ax,%ax
8048488: 83 eb 04 sub $0x4,%ebx
804848b: ff d0 call *%eax
804848d: 8b 03 mov (%ebx),%eax
804848f: 83 f8 ff cmp $0xffffffff,%eax
8048492: 75 f4 jne 8048488 <__do_global_ctors_aux+0x18>
8048494: 83 c4 04 add $0x4,%esp
8048497: 5b pop %ebx
8048498: 5d pop %ebp
8048499: c3 ret
804849a: 90 nop
804849b: 90 nop
Analyzing the assembly GCC generates for hello world
- Write a hello world program in C, main1.c
#include <stdio.h>
#include <stdlib.h>
int main()
{
printf("hello world\n");
return 0;
}
- Generate the assembly code
$ gcc -S main1.c -o main1.s
- Open the assembly file
.file "main1.c"
.section .rodata # .rodata holds read-only data; the string "hello world" is stored here
.LC0: # a local label; the name is arbitrary
.string "hello world"
.text
.globl main
.type main, @function # mark the symbol main as a function
main:
.LFB0:
.cfi_startproc # marks the start of the function (opens its call frame information)
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
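movl $.LC0, %edi # put the address of the string "hello world" in edi, the first argument of puts
call puts # call the C library function puts (gcc replaced printf with puts because the string has no format specifiers and ends in a newline)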
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc # marks the end of the function
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3"
.section .note.GNU-stack,"",@progbits
- The same code as disassembled by objdump
00000000004004f4 <main>:
4004f4: 55 push %rbp
4004f5: 48 89 e5 mov %rsp,%rbp
4004f8: bf fc 05 40 00 mov $0x4005fc,%edi # what is stored at address 0x4005fc?
4004fd: e8 ee fe ff ff callq 4003f0 <puts@plt> # 0x4003f0 is the address of the PLT stub for puts; the name in angle brackets is the symbol at that address
400502: b8 00 00 00 00 mov $0x0,%eax
400507: 5d pop %rbp
400508: c3 retq
400509: 90 nop
40050a: 90 nop
40050b: 90 nop
40050c: 90 nop
40050d: 90 nop
40050e: 90 nop
40050f: 90 nop
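What is stored at address 0x4005fc?
The assembly file above shows that the string is kept in the .rodata section, and running objdump on the executable displays the contents of .rodata. Without the assembly file, the section can still be identified by checking which address range in the readelf -S output contains 0x4005fc.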
$ objdump -d -j .rodata a.out
a.out: file format elf64-x86-64
Disassembly of section .rodata:
00000000004005f8 <_IO_stdin_used>:
4005f8: 01 00 02 00 68 65 6c 6c 6f 20 77 6f 72 6c 64 00 ....hello world.
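A hand-written hello world assembly program
This example comes from a book on AT&T assembly (Linux汇编AT&T汇编.pdf) and makes a good first exercise. The resulting executable really is much smaller than the one gcc produces (gcc: about 7 KB; asm: 352 bytes).
- Write the assembly file hello.s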
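# hello.s
.data # data section
msg: .string "Hello, world!\n" # the string to print
len = . - msg # length in bytes (15: .string appends a terminating NUL, and it is counted)
.text # code section
.global _start # export the entry point
_start: # print a string to the screen
movl $len, %edx # argument 3: string length
movl $msg, %ecx # argument 2: address of the string to print
movl $1, %ebx # argument 1: file descriptor (stdout)
movl $4, %eax # system call number (sys_write)
int $0x80 # software interrupt: enter the kernel
# exit the program
movl $0,%ebx # argument 1: exit code
movl $1,%eax # system call number (sys_exit)
int $0x80 # enter the kernel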
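- Assemble it into the relocatable object file hello.o with the assembler as (on an x86-64 host, as --32 produces a 32-bit object like the one dumped further down)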
$ as hello.s -o hello.o
- Link it into the executable hello with the linker ld (for a 32-bit object on an x86-64 host, pass -m elf_i386)
$ ld hello.o -o hello
- With the executable built, disassemble it with objdump to see what it looks like
$ objdump -d hello
hello: file format elf32-i386
Disassembly of section .text:
08048074 <_start>:
8048074: ba 0f 00 00 00 mov $0xf,%edx
8048079: b9 98 90 04 08 mov $0x8049098,%ecx
804807e: bb 01 00 00 00 mov $0x1,%ebx
8048083: b8 04 00 00 00 mov $0x4,%eax
8048088: cd 80 int $0x80
804808a: bb 00 00 00 00 mov $0x0,%ebx
804808f: b8 01 00 00 00 mov $0x1,%eax
8048094: cd 80 int $0x80
# When int $0x80 executes, eax holds the system call number, and the arguments must be placed, in order, in ebx, ecx, edx, esi, and edi. When the system call returns, its return value is in eax.
AT&T Assembly Study Notes (1)