[Repost] Dangers of using dlsym() with RTLD_NEXT

There are times when you want to wrap a library function in order to provide some additional functionality. A common example of this is wrapping the standard library’s malloc() and free() so that you can easily track memory allocations in your program. While there are several techniques for wrapping library functions, one well-known method is using dlsym() with RTLD_NEXT to locate the wrapped function’s address so that you can correctly forward calls to it.


Problem

So what can go wrong? Let’s look at an example:

LibWrap.h

#include <stddef.h>

void* memAlloc(size_t s);
// Allocate a memory block of size 's' bytes.
void memDel(void* p);
// Free the block of memory pointed to by 'p'.

LibWrap.c

#define _GNU_SOURCE
#include <dlfcn.h>
#include "LibWrap.h"

static void* malloc(size_t s) {
   // Wrapper for the standard library's 'malloc'.
   // The 'static' keyword forces all calls to malloc() in this file to resolve
   // to this function.
   void* (*origMalloc)(size_t) = dlsym(RTLD_NEXT,"malloc");
   return origMalloc(s);
}

static void free(void* p) {
   // Wrapper for the standard library's 'free'.
   // The 'static' keyword forces all calls to free() in this file to resolve
   // to this function.
   void (*origFree)(void*) = dlsym(RTLD_NEXT,"free");
   origFree(p);
}

void* memAlloc(size_t s) {
   return malloc(s);
   // Call the malloc() wrapper.
}

void memDel(void* p) {
   free(p);
   // Call the free() wrapper.
}

Main.c

#include <stdio.h>
#include <malloc.h>
#include "LibWrap.h"

int main() {
   struct mallinfo beforeMalloc = mallinfo();
   printf("Bytes allocated before malloc: %d\n",beforeMalloc.uordblks);

   void* p = memAlloc(57);
   struct mallinfo afterMalloc = mallinfo();
   printf("Bytes allocated after malloc: %d\n",afterMalloc.uordblks);

   memDel(p);
   struct mallinfo afterFree = mallinfo();
   printf("Bytes allocated after free: %d\n",afterFree.uordblks);

   return 0;
}

First compile LibWrap.c into a shared library:

$ gcc -Wall -Werror -fPIC -shared -o libWrap.so LibWrap.c

Next compile Main.c and link it against the libWrap.so that we just created:

$ gcc -Wall -Werror -o Main Main.c ./libWrap.so -ldl

Time to run the program!

$ ./Main
Bytes allocated before malloc: 0
Bytes allocated after malloc: 80
Bytes allocated after free: 0

So far, so good. No surprises. We allocated a bunch of memory and then freed it. The statistics returned by mallinfo() confirm this.

Out of curiosity, let’s look at ldd output for the application binary we created.

$ ldd Main
       linux-vdso.so.1 =>  (0x00007fff1b1fe000)
       ./libWrap.so (0x00007fe7d2755000)
       libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fe7d2542000)
       libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe7d217c000)
       /lib64/ld-linux-x86-64.so.2 (0x00007fe7d2959000)

Take note of the relative placement of libWrap.so with respect to libc.so.6: libWrap.so comes before libc.so.6. Remember this. It will be important later.

Now for fun, let’s re-compile Main.c with libc.so.6 explicitly specified on the command-line and coming before libWrap.so:

$ gcc -Wall -Werror -o Main Main.c /lib/x86_64-linux-gnu/libc.so.6 ./libWrap.so -ldl

Re-run:

$ ./Main
Bytes allocated before malloc: 0
Bytes allocated after malloc: 80
Bytes allocated after free: 80

Uh oh, why are we leaking memory all of a sudden? We de-allocate everything we allocate, so why the memory leak?

It turns out that the leak is occurring because we are not actually forwarding malloc() and free() calls to libc.so.6’s implementations. Instead, we are forwarding them to malloc() and free() inside ld-linux-x86-64.so.2!

“What are you talking about?!” you might be asking.

Well, it just so happens that ld-linux-x86-64.so.2, which is the dynamic linker/loader, has its own copy of malloc() and free(). Why? Because ld-linux has to allocate memory from the heap before it loads libc.so.6. But the version of malloc/free that ld-linux has does not actually free memory!
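To make the consequence concrete, here is a minimal sketch of an allocator whose free() does nothing. It is a hypothetical illustration only, not the loader's actual code, and the names bumpAlloc, bumpFree and bumpBytesInUse are made up for this example. It shows why forwarding every allocation to such an allocator looks like a leak: the "bytes in use" counter can only grow.

```c
#include <stddef.h>

/* Illustrative bump allocator. This is NOT the ld-linux source; it only
 * models an allocator that never returns memory. */
#define ARENA_SIZE 4096

static _Alignas(16) unsigned char arena[ARENA_SIZE];
static size_t arenaUsed = 0;

/* Hand out the next 'size' bytes of the arena, rounded up to 16 bytes. */
void* bumpAlloc(size_t size) {
   if (size > ARENA_SIZE) {
      return NULL;   /* Request can never fit; also avoids overflow below. */
   }
   size_t rounded = (size + 15) & ~(size_t)15;
   if (rounded > ARENA_SIZE - arenaUsed) {
      return NULL;   /* Arena exhausted. */
   }
   void* p = &arena[arenaUsed];
   arenaUsed += rounded;
   return p;
}

/* Deliberately a no-op: nothing is ever given back. */
void bumpFree(void* p) {
   (void)p;
}

/* Number of arena bytes handed out so far. */
size_t bumpBytesInUse(void) {
   return arenaUsed;
}
```

Calling bumpAlloc(57) followed by bumpFree() leaves bumpBytesInUse() unchanged, which mirrors the "after free" figure in the output above.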

But why does libWrap.so forward calls to ld-linux instead of libc? The answer comes down to how dlsym() searches for symbols when RTLD_NEXT is specified. Here’s the relevant excerpt from the dlsym(3) man page:

   "[RTLD_NEXT] will find the next occurrence of a function in the search order after the current library. This allows one to provide a wrapper around a function in another shared library." (Source: dlsym(3))

To understand this better, take a look at ldd output for the new Main binary:

$ ldd Main
        linux-vdso.so.1 =>  (0x00007fffe1da0000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f32c2e91000)
        ./libWrap.so (0x00007f32c2c8f000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f32c2a8a000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f32c3267000)

Unlike earlier, libWrap.so comes after libc.so.6. So when dlsym() is called inside libWrap.so to search for functions, it skips libc.so.6 since it precedes libWrap.so in the search order list. That means the searches continue through to ld-linux-x86-64.so.2, where they find the linker/loader’s malloc/free and return pointers to those functions. And so, libWrap.so ends up forwarding calls to ld-linux instead of libc!
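If you want to inspect the order from inside a running process rather than with ldd, one option on glibc is dl_iterate_phdr(), which calls a callback once per loaded object. The sketch below assumes that the iteration order reflects the order in which objects were loaded, which is what glibc does in practice but is worth verifying on your system. The helper name printLoadOrder is hypothetical.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stdio.h>

/* Callback invoked by dl_iterate_phdr() once for each loaded object. */
static int printObject(struct dl_phdr_info* info, size_t size, void* data) {
   (void)size;
   int* count = data;
   /* glibc reports the main executable with an empty name. */
   const char* name = (info->dlpi_name && info->dlpi_name[0])
      ? info->dlpi_name
      : "(main program)";
   printf("%2d: %s\n", *count, name);
   ++*count;
   return 0;   /* Returning 0 continues the iteration. */
}

/* Print every loaded object in iteration order; return how many were seen. */
int printLoadOrder(void) {
   int count = 0;
   dl_iterate_phdr(printObject, &count);
   return count;
}
```

Calling printLoadOrder() early in main() of each of the two Main builds should make the difference in the relative position of libWrap.so and libc.so.6 visible.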

At this point you might be wondering: We ran a somewhat funky command to build our application and then encountered a memory leak due to weird library linking order caused by said command. Isn’t this whole thing a silly contrived scenario?

The answer is unfortunately no. At OptumSoft, we recently encountered this very same memory leak with a binary compiled using the standard ./configure && make on x86-64 Ubuntu 14.04.1 LTS. For reasons we don’t understand, the linking order for the binary was such that using dlsym() with RTLD_NEXT to look up malloc/free resulted in pointers to implementations inside ld-linux. It took a ton of effort and invaluable help from Mozilla’s rr tool to root-cause the issue. After the whole ordeal, we decided to write a blog post about this strange behavior in case someone else encounters it in the future.


Solution

If you find dlsym() with RTLD_NEXT returning pointers to malloc/free inside ld-linux, what can you do?

For starters, you need to detect that a function address indeed does belong to ld-linux using dladdr():

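// Requires <dlfcn.h> (with _GNU_SOURCE defined) and <string.h>.
void* func = dlsym(RTLD_NEXT,"malloc");
Dl_info dlInfo;
if(!dladdr(func,&dlInfo)) {
   // dladdr() failed; the contents of dlInfo are not valid.
}
else if(strstr(dlInfo.dli_fname,"ld-linux")) {
   // 'malloc' is inside linker/loader.
}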

Once you have figured out that a function is inside ld-linux, you need to decide what to do next. Unfortunately, there is no straightforward way to continue searching for the same function name in all other libraries. But if you know the name of a specific library in which the function exists (e.g. libc), you can use dlopen() and dlsym() to fetch the desired pointer:

void* handle = dlopen("libc.so.6",RTLD_LAZY);
// NOTE: libc.so.6 may *not* exist on Alpha and IA-64 architectures.
if(!handle) {
   // dlopen() failed.
}
void* func = dlsym(handle,"free");
if(!func) {
   // Bad! 'free' was not found inside libc.
}
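Putting the two steps together, a wrapper could resolve its target roughly as follows. This is a sketch under stated assumptions rather than a definitive implementation: resolveNext is a hypothetical helper name, the loader is assumed to have "ld-linux" in its file name (true for ld-linux-x86-64.so.2, not necessarily on every platform), and the fallback library is assumed to be libc.so.6.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find the next definition of 'name', but refuse a
 * result that lives inside the dynamic linker/loader. */
void* resolveNext(const char* name) {
   void* func = dlsym(RTLD_NEXT, name);
   if (func) {
      Dl_info info;
      int known = dladdr(func, &info) && info.dli_fname;
      /* If dladdr() cannot identify the object we cannot tell, so we
       * accept the pointer; only a positive ld-linux match is rejected. */
      if (!known || !strstr(info.dli_fname, "ld-linux")) {
         return func;
      }
   }
   /* Fallback: ask libc directly. The handle is intentionally kept open,
    * since libc stays loaded for the lifetime of the process anyway. */
   void* handle = dlopen("libc.so.6", RTLD_LAZY);
   if (!handle) {
      return NULL;
   }
   return dlsym(handle, name);
}
```

A malloc() wrapper would then call something like `void* (*origMalloc)(size_t) = (void* (*)(size_t))resolveNext("malloc");` and check the result for NULL before using it. On older glibc versions the program still needs to be linked with -ldl, as in the build commands above.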

Summary

  • One can use dlsym() with RTLD_NEXT to implement wrappers around malloc() and free().
  • Due to unexpected linking behavior, dlsym() when using RTLD_NEXT can return pointers to malloc/free implementations inside ld-linux (dynamic linker/loader). Using ld-linux’s malloc/free for general heap allocations leads to memory leaks because that particular version of free() doesn’t actually release memory.
  • You can check if an address returned by dlsym() belongs to ld-linux via dladdr(). You can also look up a function in a specific library using dlopen() and dlsym().

From: http://optumsoft.com/dangers-of-using-dlsym-with-rtld_next/

Time: 2024-08-07 20:05:20
