COFF - Object File Format Analysis

G Common Object File Format (COFF)

Overall structure
File header
Optional header
Section headers
Raw data sections
COFF relocation information
Line number information
Symbol table
Additional symbols
String table

This section describes the Common Object File Format (COFF), the format of the object files that the linker combines into an executable file.

Overall structure

The COFF object format is used both for object files (.o extension) and for executable files. Some of the information is present only in object files; other information is present only in executable files.

Table G-1   COFF file components

Section                  Description
File header              Contains general information; always present.
Optional header          Contains information about an executable file; usually present only in executables.
Section header           Contains information about the different COFF sections; one for each section.
Raw data sections        One for each section containing raw data, such as machine instructions and initialized variables.
Relocation information   Contains information about unresolved references to symbols in other modules; one for each section having external references. Usually present only in object files, not in executable files.
Line number information  Contains debugging information about source line numbers; one for each section if compiled with the -g option.
Symbol table             Contains information about all the symbols in the object file; always present in object files, and present in an executable file unless it has been stripped.
String table             Contains symbol names longer than 8 bytes.

The following figure shows the COFF file structure:

[Figure: COFF file structure]

File header

The file header contains general information about the object file and has the following structure from the file filehdr.h:

struct filehdr {
    unsigned short  f_magic;    /* magic number */
    unsigned short  f_nscns;    /* number of sections */
    long            f_timdat;   /* date stamp */
    long            f_symptr;   /* fileptr to symtab */
    long            f_nsyms;    /* symtab count */
    unsigned short  f_opthdr;   /* sizeof(optional hdr) */
    unsigned short  f_flags;    /* flags (file attributes) */
};

Table G-2   COFF header fields

Field      Description
f_magic    Magic number used to identify the file as a COFF file. It has the value 0x170 for the PowerPC family of processors.
f_nscns    Number of sections this file contains.
f_timdat   Creation time of the file, represented as a 32-bit value.
f_symptr   File offset of the symbol table.
f_nsyms    Number of entries in the symbol table.
f_opthdr   Number of bytes in the optional header.
f_flags    Bit field containing the following flags:
           F_RELFLG (0x1)    Set if the COFF file does not contain relocation information; normally true only for executable files.
           F_EXEC (0x2)      Set if the file is executable and all references are resolved.
           F_LNNO (0x4)      Set if the COFF file does not contain line number information; this symbolic debugging information can be stripped with the -s option or the strip program.
           F_LSYMS (0x8)     Set if the COFF file does not contain local symbols; these symbols can be stripped with the -X and -x options to the assembler and linker.
           F_AR32W (0x200)   Set if the file uses big-endian byte order.

Optional header

The optional header contains information about an executable file and has the following structure from the file aouthdr.h:

typedef struct aouthdr {
    short   magic;              /* a.out magic */
    short   vstamp;             /* version stamp */
    long    tsize;              /* .text size */
    long    dsize;              /* .data size */
    long    bsize;              /* .bss size */
    long    entry;              /* entry point */
    long    text_start;         /* fileptr to .text */
    long    data_start;         /* fileptr to .data */
} AOUTHDR;

Table G-3   COFF optional (executable) header fields

Field        Description
magic        Value 0x10b.
vstamp       Set by the option -VS, but not used by the linker.
tsize        Size of the .text section.
dsize        Size of the .data section.
bsize        Size of the .bss section.
entry        Entry point in the executable program where execution will begin. The default entry point is the symbol start defined in the file function main(). The -e option can change this to any other symbol in the program.
text_start   File offset to the .text section in the COFF file.
data_start   File offset to the .data section in the COFF file.


Section headers

There is one section header for each section in the COFF file; the number of section headers is specified by the f_nscns field in the COFF file header. Section headers have the following structure from the file scnhdr.h:

struct scnhdr {                     /* modified COFF */
    char            s_name[8];      /* section name */
    long            s_paddr;        /* physical address */
    long            s_vaddr;        /* virtual address */
    long            s_size;         /* size of section in bytes */
    long            s_scnptr;       /* fileptr to raw data */
    long            s_relptr;       /* fileptr to reloc */
    long            s_lnnoptr;      /* fileptr to lineno */
    unsigned long   s_nreloc;       /* reloc count (unsigned short in standard COFF) */
    unsigned long   s_nlnno;        /* line number count (unsigned short in standard COFF) */
    long            s_flags;        /* flags */
};

#define SCNHDR struct scnhdr
#define SCNHSZ sizeof(SCNHDR)

Table G-4   COFF section header fields

Field      Description
s_name[8]  Eight-byte, null-terminated section name. Standard names include .text, .data, and .bss.
s_paddr    Physical start address of the section. It is usually set to the same value as s_vaddr, but can be set to a different value in the linker command language. This can be useful when initialized data is physically allocated to a ROM address but moved to a logical address in RAM at start-up.
s_vaddr    Logical start address of the section as allocated by the assembler or linker.
s_size     Size in bytes of the memory allocated to the section.
s_scnptr   File offset to the raw data of the section. Note that the .bss section does not have any raw data since it will be initialized by the operating system.
s_relptr   File offset to the relocation information of the section.
s_lnnoptr  File offset to the line number information of the section.
s_nreloc   Number of relocation information entries.
s_nlnno    Number of line number information entries.
s_flags    Bit field containing the following flags:
           STYP_TEXT (0x20)   Set for a .text section.
           STYP_DATA (0x40)   Set for a .data section.
           STYP_BSS (0x80)    Set for a .bss section.
           STYP_INFO (0x200)  Set for a .comment section.

The following table shows the correspondence between the type-spec as defined on p.409 and the COFF section flags assigned to the output section.

Table G-5   type-spec - COFF section flag correspondence

type-spec  Section flags (s_flags)
BSS        STYP_BSS
COMMENT    STYP_INFO
CONST      STYP_DATA
DATA       STYP_DATA
TEXT       STYP_TEXT

Raw data sections

The raw data sections contain the actual raw data for each section.

Table G-6   COFF section names

Section    Contents
.text      Machine instructions, constant data, and strings.
.sdata2    Small constant data; see the Set size limit for "small const" variables (-Xsmall-const=n), p.106.
.data      Initialized data.
.sdata     Small initialized data; see the Set size limit for "small data" variables (-Xsmall-data=n), p.106.
.bss       Uninitialized data; does not have any raw data.
.sbss      Small uninitialized data.
.comment   Comments from #ident directives in C.
.init      Code that is to be executed before the main() function.
.fini      Code that is to be executed when the user program has finished execution.
.eini      The instructions of the .fini code; the .init, .fini, and .eini sections should be placed after each other in memory.

COFF relocation information

The relocation information segment contains information about unresolved references. Since compilers and assemblers do not know at what absolute memory address a symbol will be allocated, and since they are unaware of definitions of symbols in other files, every reference to such a symbol creates a relocation entry. The relocation entry points to the address where the reference is made and to the symbol table entry of the symbol that is referenced. The linker uses this information to fill in the correct address after it has allocated addresses to all symbols.

When an offset is added to a symbol in the assembly source,

lwz     r3,(var+16)(r0)

move.l  var+16,d0

that offset is stored in the address field of the instruction, so that adding the real address of the symbol to the address field yields a correct reference.

The relocation segment does not exist in executable files.

A relocation entry has the following structure from the file reloc.h:

struct reloc {                  /* modified COFF */
    long            r_vaddr;    /* address of the reference */
    long            r_symndx;   /* index into the symbol table */
    unsigned short  r_type;     /* relocation type */
    unsigned short  r_offset;   /* high 16 bits of the offset */
};

#define RELOC   struct reloc
#define RELSZ   sizeof(RELOC)   /* 10 in standard COFF, which has no r_offset */

Table G-7   COFF relocation entry fields

Field      Description
r_vaddr    The address, relative to the start of the current section, of the area to be patched with the correct address.
r_symndx   Index into the symbol table pointing to the entry describing the symbol that is referenced at r_vaddr.
r_type     Type of addressing mode used; it describes whether the mode is absolute or relative, and the size of the addressing mode. See the table below for relocation types used by the Wind River tools.
r_offset   The high 16 bits of any offset that is added to the symbol in the R_HVRT16, R_LVRT16, and R_HAVRT16 relocation modes. Since the address field in the instruction is only 16 bits, it cannot represent a large offset. Example:

           addis r13,r0,(var+0x123456)@ha

           The address field in the addis instruction will contain 0x3456 and r_offset will contain 0x12.

Table G-8   COFF relocation types

Relocation type  Number  Description
R_RELWORD        16      16-bit absolute address: lwz r3,var(r0)
R_HVRT16         131     Higher 16 bits of an absolute address: addis r3,r0,var@h
R_LVRT16         132     Lower 16 bits of an absolute address: lwz r3,var@l(r0)
R_HAVRT16        136     Adjusted higher 16 bits of an absolute address. If the lower 16 bits is a negative number, one is added to the upper 16 bits: addis r3,r0,var@ha
R_PCR16S2        137     16-bit PC-relative address where the lower two bits are ignored: bc 4,2,label
R_PCR26S2        138     26-bit PC-relative address where the lower two bits are ignored: bl func
R_REL16S2        139     16-bit absolute address where the lower two bits are ignored: bca 4,2,label
R_REL26S2        140     26-bit absolute address where the lower two bits are ignored: bla func

Line number information

The line number information segment contains the mapping from source line numbers to machine instruction addresses used by symbolic debuggers. This information is only available if the -g option is specified to the compiler.

Line number entries for a section form groups of pairs where the first pair in a group is a pointer to the function containing the source. After that, every source line that has generated any instruction has an entry specifying the line number relative to the beginning of the function, and the corresponding instruction address. Normally only the .text section has line number information. The following table demonstrates the layout of the line number entries:

A line number entry has the following structure from the file linenum.h:

struct lineno {
    union {
        long        l_symndx;
        long        l_paddr;
    } l_addr;
    unsigned long   l_lnno;         /* unsigned short in standard COFF */
};

#define LINENO      struct lineno
#define LINESZ      sizeof(LINENO)  /* 6 in standard COFF */

Table G-9   COFF line number fields

Field      Description
l_symndx   Symbol table index for a new function; only valid if l_lnno is set to zero.
l_paddr    Instruction address corresponding to the source line l_lnno.
l_lnno     Source line relative to the start of the current function.

Symbol table

The symbol table is an array of entries containing information about the symbols referenced in the COFF file. A symbol table entry has the following structure from the file syms.h:

struct syment {
    union {
        char        _n_name[8];
        struct {
            long    _n_zeroes;
            long    _n_offset;
        } _n_n;
        char        *_n_nptr[2];
    } _n;
    long            n_value;
    short           n_scnum;
    unsigned short  n_type;
    char            n_sclass;
    char            n_numaux;
    short           n_pad;
};

#define SYMENT      struct syment
#define SYMESZ      20              /* 18 in standard COFF, which has no n_pad */
#define n_name      _n._n_name
#define n_nptr      _n._n_nptr[1]
#define n_zeroes    _n._n_n._n_zeroes
#define n_offset    _n._n_n._n_offset

Table G-10   COFF symbol table fields

Field      Description
n_name     Name of the symbol if its length is less than or equal to 8 bytes. If the name is shorter than 8 bytes, it is terminated by a null character.
n_zeroes   Zero if the symbol name is longer than 8 bytes. This field overlaps the first 4 bytes of n_name.
n_offset   An offset into the string table if n_zeroes is zero.
n_nptr     This pointer allows for overlays.
n_value    A value whose contents depend on the symbol type. Normally it contains the address of the symbol, or the size of the symbol if the symbol is a common block. A zero value indicates an undefined symbol if n_scnum is also zero.
n_scnum    Section number of the symbol, starting with one. A zero value indicates one of two things:
           If n_value is zero, the symbol is an undefined symbol that must be defined in another file.
           If n_value is not zero, the symbol is a common block of size n_value. All common blocks with the same name are combined by the linker and put in the .bss section, unless some other file defines that symbol in a section.
n_type     Type of the symbol; only set if compiled with -g.
n_sclass   Storage class of the symbol. There are over 20 storage classes, but most are used only with the -g compiler option. The two classes of interest to the linker are C_EXT, external storage, and C_STAT, static (local to the file) storage.
n_numaux   Number of auxiliary entries used by the symbol.
n_pad      Pads the structure to a multiple of four bytes; it carries no information.

Any auxiliary entries to a symbol are stored immediately after the symbol in the table. They are mainly used for symbolic debugging (-g option) and are not discussed here.

Additional symbols

Wind River uses special COFF symbols as follows:

Table G-11   Special COFF Symbols

Extension          Description
!sn!section-name   Long section-name.
!cd!name           COMDAT-section-name. See Mark sections as COMDAT for linker collapse (-Xcomdat), p.71.
!sf!flags          Section flags (a: allocate, w: write, x: execute, b: bss/nocode).
!al!value          Section alignment.
!wk!symbol-name    Weak symbol. See weak pragma, p.138.

String table

The string table contains the null terminated names of symbols longer than eight characters. Those symbols point into the string table through an offset, n_offset. The first four bytes of the string table contain the size of the table and after that all strings are stored sequentially.


Copyright © 2002, Wind River Systems, Inc. All rights reserved.

Time: 2024-10-25 18:18:27
