[Repost] Resolving kernel symbols

Original: http://ho.ax/posts/2012/02/resolving-kernel-symbols/

KXLD doesn’t like us much. He has KPIs to meet and doesn’t have time to help out shifty rootkit developers. KPIs are Kernel Programming Interfaces - lists of symbols in the kernel that KXLD (the kernel extension linker) will allow kexts to be linked against. The KPIs on which your kext depends are specified in the Info.plist file like this:

<key>OSBundleLibraries</key>
<dict>
	<key>com.apple.kpi.bsd</key>
	<string>11.0</string>
	<key>com.apple.kpi.libkern</key>
	<string>11.0</string>
	<key>com.apple.kpi.mach</key>
	<string>11.0</string>
	<key>com.apple.kpi.unsupported</key>
	<string>11.0</string>
	<key>com.apple.kpi.iokit</key>
	<string>11.0</string>
	<key>com.apple.kpi.dsep</key>
	<string>11.0</string>
</dict>
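For a quick look at which KPIs a given kext asks for, the dictionary above can be read with Python's standard plistlib module. This is only a sketch: the function name kpi_dependencies and the trimmed-down SAMPLE plist are made up for illustration.

```python
import plistlib


def kpi_dependencies(info_plist: bytes) -> dict:
    """Map each bundle identifier in OSBundleLibraries to its minimum version."""
    info = plistlib.loads(info_plist)
    return dict(info.get("OSBundleLibraries", {}))


# Hypothetical, trimmed-down Info.plist used only for this example.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
	<key>OSBundleLibraries</key>
	<dict>
		<key>com.apple.kpi.bsd</key>
		<string>11.0</string>
		<key>com.apple.kpi.libkern</key>
		<string>11.0</string>
	</dict>
</dict>
</plist>
"""

for bundle_id, version in sorted(kpi_dependencies(SAMPLE).items()):
    print(bundle_id, version)
```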

Those bundle identifiers correspond to the CFBundleIdentifier key specified in the Info.plist files for “plug-ins” to the System.kext kernel extension. Each KPI has its own plug-in kext - for example, the com.apple.kpi.bsd symbol table lives in BSDKernel.kext. These aren’t exactly complete kexts; they’re just Mach-O binaries with symbol tables full of undefined symbols (the symbols really reside within the kernel image), which you can see if we dump the load commands:

$ otool -l /System/Library/Extensions/System.kext/PlugIns/BSDKernel.kext/BSDKernel
/System/Library/Extensions/System.kext/PlugIns/BSDKernel.kext/BSDKernel:
Load command 0
     cmd LC_SYMTAB
 cmdsize 24
  symoff 80
   nsyms 830
  stroff 13360
 strsize 13324
Load command 1
     cmd LC_UUID
 cmdsize 24
    uuid B171D4B0-AC45-47FC-8098-5B2F89B474E6

That’s it - just the LC_SYMTAB (symbol table) and a UUID. So, how many symbols are there in the kernel image?

$ nm /mach_kernel|wc -l
   16122

Surely all the symbols in all the KPI symbol tables add up to the same number, right?

$ find /System/Library/Extensions/System.kext/PlugIns -type f|grep -v plist|xargs nm|sort|uniq|wc -l
    7677

Nope. Apple doesn’t want us to play with a whole bunch of their toys. 8445 of them. Some of them are pretty fun too :( Like allproc:

$ nm /mach_kernel|grep allproc
ffffff80008d9e40 S _allproc
$ find /System/Library/Extensions/System.kext/PlugIns -type f|grep -v plist|xargs nm|sort|uniq|grep allproc
$

Damn. The allproc symbol is the head of the kernel’s list (the queue(3) kind of list) of running processes. It’s what gets queried when you run ps(1) or top(1). Why do we want to find allproc? If we want to hide processes in a kernel rootkit that’s the best place to start. So, what happens if we build a kernel extension that imports allproc and try to load it?

bash-3.2# kextload AllProcRocks.kext
/Users/admin/AllProcRocks.kext failed to load - (libkern/kext) link error; check the system/kernel logs for errors or try kextutil(8).

Console says:

25/02/12 6:30:47.000 PM kernel: kxld[ax.ho.kext.AllProcRocks]: The following symbols are unresolved for this kext:
25/02/12 6:30:47.000 PM kernel: kxld[ax.ho.kext.AllProcRocks]: 	_allproc

OK, whatever.

What do we do?

There are a few steps that we need to take in order to resolve symbols in the kernel (or any other Mach-O binary):

  • Find the __LINKEDIT segment - this contains an array of struct nlist_64’s which represent all the symbols in the symbol table, and an array of symbol name strings.
  • Find the LC_SYMTAB load command - this contains the offsets within the file of the symbol and string tables.
  • Calculate the position of the string table within __LINKEDIT based on the offsets in the LC_SYMTAB load command.
  • Iterate through the struct nlist_64’s in __LINKEDIT, comparing the corresponding string in the string table to the name of the symbol we’re looking for until we find it (or reach the end of the symbol table).
  • Grab the address of the symbol from the struct nlist_64 we’ve found.
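The five steps above can be sketched end to end in a few lines of Python. This is an illustration of the logic rather than kext code: a real kext would do the same walk in C against kernel memory. The read(addr, size) accessor and the name resolve_symbol are hypothetical, and the sketch assumes a thin little-endian 64-bit image whose symbol table sits at the very start of __LINKEDIT.

```python
import struct

LC_SYMTAB, LC_SEGMENT_64 = 0x02, 0x19
HEADER = struct.Struct("<8I")               # struct mach_header_64
SEGMENT = struct.Struct("<II16sQQQQiiII")   # struct segment_command_64
SYMTAB = struct.Struct("<6I")               # struct symtab_command
NLIST = struct.Struct("<IBBHQ")             # struct nlist_64


def resolve_symbol(read, base, name):
    """Return the address of symbol `name`, or None if it is not found.

    `read(addr, size)` is a hypothetical accessor that returns `size` bytes of
    the loaded image at virtual address `addr`; `base` is the header address.
    """
    ncmds = HEADER.unpack(read(base, HEADER.size))[4]
    linkedit = symtab = None
    lc = base + HEADER.size                 # first load command follows the header
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack("<II", read(lc, 8))
        if cmd == LC_SEGMENT_64:
            seg = SEGMENT.unpack(read(lc, SEGMENT.size))
            if seg[2].rstrip(b"\0") == b"__LINKEDIT":
                linkedit = seg[3]           # vmaddr of the segment
        elif cmd == LC_SYMTAB:
            symtab = SYMTAB.unpack(read(lc, SYMTAB.size))
        lc += cmdsize
    if linkedit is None or symtab is None:
        return None

    _, _, symoff, nsyms, stroff, strsize = symtab
    strtab = linkedit + (stroff - symoff)   # assumes symbols start __LINKEDIT
    strings = read(strtab, strsize)
    wanted = name.encode()
    for i in range(nsyms):
        entry = read(linkedit + i * NLIST.size, NLIST.size)
        n_strx, _, _, _, n_value = NLIST.unpack(entry)
        end = strings.find(b"\0", n_strx)   # names are NUL-terminated
        if end < 0:
            end = len(strings)
        if strings[n_strx:end] == wanted:
            return n_value
    return None
```

Passing the memory accessor in as a function keeps the parsing logic separate from how the bytes are obtained, so the same walk works on a memory dump, a debugger bridge, or a test buffer.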

Parse the load commands

One easy way to look at the symbol table would be to read the kernel file on disk at /mach_kernel, but we can do better than that if we’re already in the kernel - the kernel image is loaded into memory at a known address. If we have a look at the load commands for the kernel binary:

$ otool -l /mach_kernel
/mach_kernel:
Load command 0
      cmd LC_SEGMENT_64
  cmdsize 472
  segname __TEXT
   vmaddr 0xffffff8000200000
   vmsize 0x000000000052f000
  fileoff 0
 filesize 5435392
  maxprot 0x00000007
 initprot 0x00000005
   nsects 5
    flags 0x0
<snip>

We can see that the vmaddr field of the first segment is 0xffffff8000200000. If we fire up GDB and point it at a VM running Mac OS X (as per my previous posts here and here), we can see the start of the Mach-O header in memory at this address:

gdb$ x/xw 0xffffff8000200000
0xffffff8000200000:	0xfeedfacf

0xfeedfacf is the magic number denoting a 64-bit Mach-O image (the 32-bit version is 0xfeedface). We can actually display this as a struct if we’re using the DEBUG kernel with all the DWARF info:

gdb$ print *(struct mach_header_64 *)0xffffff8000200000
$1 = {
  magic = 0xfeedfacf,
  cputype = 0x1000007,
  cpusubtype = 0x3,
  filetype = 0x2,
  ncmds = 0x12,
  sizeofcmds = 0x1010,
  flags = 0x1,
  reserved = 0x0
}

The mach_header and mach_header_64 structs (along with the other Mach-O-related structs mentioned in this post) are documented in the Mach-O File Format Reference, but we aren’t particularly interested in the header at the moment. I recommend having a look at the kernel image with MachOView to get the gist of where everything is and how it’s laid out.
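Without a debug kernel and its DWARF info, the same header can be decoded by hand. Here is a minimal Python sketch that rebuilds the header bytes from the GDB output above and unpacks them again; the names MachHeader64 and parse_header are invented for the example, and it assumes a little-endian image.

```python
import struct
from collections import namedtuple

MH_MAGIC, MH_MAGIC_64 = 0xFEEDFACE, 0xFEEDFACF

MachHeader64 = namedtuple(
    "MachHeader64",
    "magic cputype cpusubtype filetype ncmds sizeofcmds flags reserved",
)


def parse_header(data: bytes) -> MachHeader64:
    """Decode a little-endian struct mach_header_64 from the start of `data`."""
    (magic,) = struct.unpack_from("<I", data, 0)
    if magic == MH_MAGIC:
        raise ValueError("32-bit Mach-O image; this sketch handles 64-bit only")
    if magic != MH_MAGIC_64:
        raise ValueError(f"unexpected magic {magic:#x}")
    return MachHeader64._make(struct.unpack_from("<8I", data, 0))


# Header bytes rebuilt from the field values in the GDB output above.
raw = struct.pack("<8I", 0xFEEDFACF, 0x1000007, 0x3, 0x2, 0x12, 0x1010, 0x1, 0x0)
header = parse_header(raw)
print(header.ncmds, "load commands,", header.sizeofcmds, "bytes")
```

The two fields that matter for the rest of the walk are ncmds (0x12, so 18 load commands) and the fixed 32-byte size of the header, since the first load command starts immediately after it.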

Directly following the Mach-O header is the first load command:

gdb$ set $mh=(struct mach_header_64 *)0xffffff8000200000
gdb$ print *(struct load_command*)((void *)$mh + sizeof(struct mach_header_64))
$6 = {
  cmd = 0x19,
  cmdsize = 0x1d8
}

This is the load command for the first __TEXT segment we saw with otool. We can cast it as a segment_command_64 in GDB and have a look:

gdb$ set $lc=((void *)$mh + sizeof(struct mach_header_64))
gdb$ print *(struct segment_command_64 *)$lc
$7 = {
  cmd = 0x19,
  cmdsize = 0x1d8,
  segname = "__TEXT\000\000\000\000\000\000\000\000\000",
  vmaddr = 0xffffff8000200000,
  vmsize = 0x8c8000,
  fileoff = 0x0,
  filesize = 0x8c8000,
  maxprot = 0x7,
  initprot = 0x5,
  nsects = 0x5,
  flags = 0x0
}

This isn’t the load command we are looking for, so we have to iterate through all of them until we come across a segment with cmd of 0x19 (LC_SEGMENT_64) and segname of __LINKEDIT. In the debug kernel, this happens to be located at 0xffffff8000200e68:

gdb$ set $lc=0xffffff8000200e68
gdb$ print *(struct load_command*)$lc
$14 = {
  cmd = 0x19,
  cmdsize = 0x48
}
gdb$ print *(struct segment_command_64*)$lc
$16 = {
  cmd = 0x19,
  cmdsize = 0x48,
  segname = "__LINKEDIT\000\000\000\000\000",
  vmaddr = 0xffffff8000d08000,
  vmsize = 0x109468,
  fileoff = 0xaf4698,
  filesize = 0x109468,
  maxprot = 0x7,
  initprot = 0x1,
  nsects = 0x0,
  flags = 0x0
}

Then we grab the vmaddr field from the load command, which specifies the address at which the __LINKEDIT segment’s data will be located:

gdb$ set $linkedit=((struct segment_command_64*)$lc)->vmaddr
gdb$ print $linkedit
$19 = 0xffffff8000d08000
gdb$ print *(struct nlist_64 *)$linkedit
$20 = {
  n_un = {
    n_strx = 0x68a29
  },
  n_type = 0xe,
  n_sect = 0x1,
  n_desc = 0x0,
  n_value = 0xffffff800020a870
}

And there’s the first?struct nlist_64.
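The hunt for the __LINKEDIT load command that we just did by hand in GDB might look like this in code. It's a sketch under some assumptions: image is a bytes copy of the loaded kernel starting at the Mach-O header (so offset 0 corresponds to 0xffffff8000200000), the image is little-endian and 64-bit, and find_segment is a name made up for this example.

```python
import struct

LC_SEGMENT_64 = 0x19
HEADER_SIZE = 32                              # sizeof(struct mach_header_64)
SEGMENT = struct.Struct("<II16sQQQQiiII")     # struct segment_command_64


def find_segment(image: bytes, name: str):
    """Return (load command offset, vmaddr, fileoff) for segment `name`, or None."""
    ncmds = struct.unpack_from("<I", image, 16)[0]   # ncmds lives at offset 16
    offset = HEADER_SIZE
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", image, offset)
        if cmd == LC_SEGMENT_64:
            fields = SEGMENT.unpack_from(image, offset)
            segname = fields[2].rstrip(b"\0").decode()   # 16 bytes, NUL-padded
            if segname == name:
                return offset, fields[3], fields[5]
        offset += cmdsize                     # cmdsize spans the whole command
    return None
```

Stepping by cmdsize rather than by a fixed struct size matters, because a segment command is followed by its section headers (the __TEXT command above is 0x1d8 bytes, while __LINKEDIT with no sections is only 0x48).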

As for the LC_SYMTAB load command, we just need to iterate through the load commands until we find one with the cmd field value of 0x02 (LC_SYMTAB). In this case, it’s located at 0xffffff8000200eb0:

gdb$ set $symtab=*(struct symtab_command*)0xffffff8000200eb0
gdb$ print $symtab
$23 = {
  cmd = 0x2,
  cmdsize = 0x18,
  symoff = 0xaf4698,
  nsyms = 0x699d,
  stroff = 0xb5e068,
  strsize = 0x9fa98
}

The useful parts here are the symoff field, which specifies the offset in the file to the symbol table (start of the __LINKEDIT segment), and the stroff field, which specifies the offset in the file to the string table (somewhere in the middle of the __LINKEDIT segment). Why, you ask, did we need to find the __LINKEDIT segment as well, since we have the offset here in the LC_SYMTAB command? If we were looking at the file on disk we wouldn’t have needed to, but as the kernel image we’re inspecting has already been loaded into memory, the binary segments have been loaded at the virtual memory addresses specified in their load commands. This means that the symoff and stroff fields, which are file offsets, can’t be used as memory addresses any more. However, they’re still useful, as the difference between the two tells us the offset into the __LINKEDIT segment at which the string table exists:

gdb$ print $linkedit
$24 = 0xffffff8000d08000
gdb$ print $linkedit + ($symtab->stroff - $symtab->symoff)
$25 = 0xffffff8000d719d0
gdb$ set $strtab=$linkedit + ($symtab->stroff - $symtab->symoff)
gdb$ x/16s $strtab
0xffffff8000d719d0:	 ""
0xffffff8000d719d1:	 ""
0xffffff8000d719d2:	 ""
0xffffff8000d719d3:	 ""
0xffffff8000d719d4:	 ".constructors_used"
0xffffff8000d719e7:	 ".destructors_used"
0xffffff8000d719f9:	 "_AddFileExtent"
0xffffff8000d71a08:	 "_AllocateNode"
0xffffff8000d71a16:	 "_Assert"
0xffffff8000d71a1e:	 "_BF_decrypt"
0xffffff8000d71a2a:	 "_BF_encrypt"
0xffffff8000d71a36:	 "_BF_set_key"
0xffffff8000d71a42:	 "_BTClosePath"
0xffffff8000d71a4f:	 "_BTDeleteRecord"
0xffffff8000d71a5f:	 "_BTFlushPath"
0xffffff8000d71a6c:	 "_BTGetInformation"
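The arithmetic from that GDB session is easy to double-check. The Python below plugs in the values printed above; string_table_address is just a name picked for the example, and the formula assumes (as is the case here) that symoff equals the fileoff of __LINKEDIT.

```python
NLIST_64_SIZE = 16   # sizeof(struct nlist_64)


def string_table_address(linkedit_vmaddr, symoff, stroff):
    """In-memory address of the string table.

    Assumes the symbol table sits at the very start of __LINKEDIT, i.e. that
    symoff equals the segment's fileoff, as in the kernel image shown above.
    """
    return linkedit_vmaddr + (stroff - symoff)


# Values taken from the GDB output above.
linkedit = 0xFFFFFF8000D08000
symoff, nsyms, stroff = 0xAF4698, 0x699D, 0xB5E068

strtab = string_table_address(linkedit, symoff, stroff)
print(hex(strtab))

# Cross-check: the gap between the two offsets is exactly the size of the
# symbol table, so the strings start right after the last nlist_64 entry.
print(stroff - symoff == nsyms * NLIST_64_SIZE)
```

If the symbol table did not start the segment, subtracting the segment's fileoff instead of symoff should be the more general form; in this image the two values are the same (0xaf4698).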

Actually finding some symbols

Now that we know where the symbol table and string table live, we can get on to the srs bznz. So, let’s find that damn _allproc symbol we need. Have a look at that first struct nlist_64 again:

gdb$ print *(struct nlist_64 *)$linkedit
$28 = {
  n_un = {
    n_strx = 0x68a29
  },
  n_type = 0xe,
  n_sect = 0x1,
  n_desc = 0x0,
  n_value = 0xffffff800020a870
}

The n_un.n_strx field there specifies the offset into the string table at which the string corresponding to this symbol exists. If we add that offset to the address at which the string table starts, we’ll see the symbol name:

gdb$ x/s $strtab + ((struct nlist_64 *)$linkedit)->n_un.n_strx
0xffffff8000dda3f9:	 "_ps_vnode_trim_init"
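The same lookup, done on plain byte buffers instead of live kernel memory, is sketched below. The tiny symtab and strtab buffers are synthetic stand-ins built for the example (only the names and n_value fields are borrowed from the GDB output), and symbol_name is a made-up helper name.

```python
import struct

NLIST = struct.Struct("<IBBHQ")   # n_strx, n_type, n_sect, n_desc, n_value


def symbol_name(symtab: bytes, strtab: bytes, index: int) -> str:
    """Return the name of the index-th struct nlist_64 in `symtab`."""
    n_strx = NLIST.unpack_from(symtab, index * NLIST.size)[0]
    end = strtab.find(b"\0", n_strx)          # names are NUL-terminated
    if end < 0:
        end = len(strtab)
    return strtab[n_strx:end].decode()


# Synthetic two-entry tables; the string offsets here are not the real ones.
strtab = b"\0_ps_vnode_trim_init\0_allproc\0"
symtab = (
    NLIST.pack(1, 0xE, 0x1, 0, 0xFFFFFF800020A870)
    + NLIST.pack(21, 0xF, 0xB, 0, 0xFFFFFF8000CB5CA0)
)

for i in range(2):
    print(i, symbol_name(symtab, strtab, i))
```

Each entry is a fixed 16 bytes, so the i-th struct nlist_64 is simply at i * 16 from the start of the symbol table, and n_strx is the only field needed to get from an entry to its name.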

Now all we need to do is iterate through all the struct nlist_64’s until we find the one with the matching name. In this case it’s at 0xffffff8000d482a0:

gdb$ set $nlist=0xffffff8000d482a0
gdb$ print *(struct nlist_64*)$nlist
$31 = {
  n_un = {
    n_strx = 0x35a07
  },
  n_type = 0xf,
  n_sect = 0xb,
  n_desc = 0x0,
  n_value = 0xffffff8000cb5ca0
}
gdb$ x/s $strtab + ((struct nlist_64 *)$nlist)->n_un.n_strx
0xffffff8000da73d7:	 "_allproc"

The n_value field there (0xffffff8000cb5ca0) is the virtual memory address at which the symbol’s data/code exists. _allproc is not a great example as it’s a piece of data, rather than a function, so let’s try it with a function:

gdb$ set $nlist=0xffffff8000d618f0
gdb$ print *(struct nlist_64*)$nlist
$32 = {
  n_un = {
    n_strx = 0x52ed3
  },
  n_type = 0xf,
  n_sect = 0x1,
  n_desc = 0x0,
  n_value = 0xffffff80007cceb0
}
gdb$ x/s $strtab + ((struct nlist_64 *)$nlist)->n_un.n_strx
0xffffff8000dc48a3:	 "_proc_lock"

If we disassemble a few instructions at that address:

gdb$ x/12i 0xffffff80007cceb0
0xffffff80007cceb0 <proc_lock>:	push   rbp
0xffffff80007cceb1 <proc_lock+1>:	mov    rbp,rsp
0xffffff80007cceb4 <proc_lock+4>:	sub    rsp,0x10
0xffffff80007cceb8 <proc_lock+8>:	mov    QWORD PTR [rbp-0x8],rdi
0xffffff80007ccebc <proc_lock+12>:	mov    rax,QWORD PTR [rbp-0x8]
0xffffff80007ccec0 <proc_lock+16>:	mov    rcx,0x50
0xffffff80007cceca <proc_lock+26>:	add    rax,rcx
0xffffff80007ccecd <proc_lock+29>:	mov    rdi,rax
0xffffff80007cced0 <proc_lock+32>:	call   0xffffff800035d270 <lck_mtx_lock>
0xffffff80007cced5 <proc_lock+37>:	add    rsp,0x10
0xffffff80007cced9 <proc_lock+41>:	pop    rbp
0xffffff80007cceda <proc_lock+42>:	ret

We can see that GDB has resolved the symbol for us, and we’re right on the money.
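One field in the struct nlist_64 dumps above that we skipped over is n_type (0xe for _ps_vnode_trim_init, 0xf for _allproc and _proc_lock). The sketch below decodes it using the mask values defined in <mach-o/nlist.h>; the helper name describe_n_type is invented for the example.

```python
# Bit masks and type values as defined in <mach-o/nlist.h>.
N_STAB, N_PEXT, N_TYPE, N_EXT = 0xE0, 0x10, 0x0E, 0x01
TYPES = {0x0: "N_UNDF", 0x2: "N_ABS", 0xA: "N_INDR", 0xC: "N_PBUD", 0xE: "N_SECT"}


def describe_n_type(n_type: int) -> str:
    """Render the n_type byte of an nlist entry as a readable string."""
    if n_type & N_STAB:
        return "stab (debugging entry)"
    parts = [TYPES.get(n_type & N_TYPE, "unknown")]
    if n_type & N_PEXT:
        parts.append("N_PEXT")
    if n_type & N_EXT:
        parts.append("N_EXT")
    return "|".join(parts)


for value in (0xE, 0xF, 0x1):
    print(hex(value), describe_n_type(value))
```

So 0xf is a symbol defined in a section (n_sect says which one) and marked external, 0xe is the same without the external bit, and 0x1 is an undefined external symbol, which is the kind the KPI plug-in symbol tables are full of. A resolver that only wants real addresses would probably skip stab and N_UNDF entries.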

Sample code

I’ve posted an example kernel extension on github to check out. When we load it with kextload KernelResolver.kext, we should see something like this on the console:

25/02/12 8:06:49.000 PM kernel: [+] _allproc @ 0xffffff8000cb5ca0
25/02/12 8:06:49.000 PM kernel: [+] _proc_lock @ 0xffffff80007cceb0
25/02/12 8:06:49.000 PM kernel: [+] _kauth_cred_setuidgid @ 0xffffff80007abbb0
25/02/12 8:06:49.000 PM kernel: [+] __ZN6OSKext13loadFromMkextEjPcjPS0_Pj @ 0xffffff80008f8606

Update: It was brought to my attention that I was using a debug kernel in these examples. Just to be clear - the method described in this post, as well as the sample code, works on a non-debug, default install >=10.7.0 (xnu-1699.22.73) kernel as well, but the GDB inspection probably won’t (unless you load up the struct definitions etc, as they are all stored in the DEBUG kernel). The debug kernel contains every symbol from the source, whereas many symbols are stripped from the distribution kernel (e.g. sLoadedKexts). Previously (before 10.7), the kernel would write out the symbol table to a file on disk and jettison it from memory altogether. I suppose when kernel extensions were loaded, kextd or kextload would resolve symbols from within that on-disk symbol table or from the on-disk kernel image. These days the symbol table memory is just marked as pageable, so it can potentially get paged out if the system is short of memory.

I hope somebody finds this useful. Shoot me an email or get at me on twitter if you have any questions. I’ll probably sort out comments for this blog at some point, but I cbf at the moment.

Time: 2024-08-30 05:22:53
