利用gdb[i]调试nginx[ii]和利用gdb调试其他程序没有两样,只是nginx能够是daemon程序,也能够以多进程执行,因此利用gdb调试和寻常会有些许不一样。
当然,我们能够选择将nginx设置为非daemon模式并以单进程执行。而这需做例如以下设置就可以:
daemon off;
master_process off;
这是第一种情况:
这样的设置下的nginx在gdb下调试非常普通,过程能够[iii]是这样:
运行命令:
[email protected]:/usr/local/nginx/sbin$ sudo gdb ./nginx
当前文件夹是在/usr/local/nginx/sbin,该文件夹下有运行程序nginx。上条命令也就是用gdb来開始调试nginx,进入gdb命令行,直接输入r就可以运行nginx:
(gdb) r
Starting program: /usr/local/nginx/sbin/nginx
src/core/ngx_conf_file.c 1163 : 1000000
===============================
ngx_timer_resolution 1000000
由于nginx曾经台形式执行。键盘输入被nginx接管,此时无法输入gdb命令,因此须要按Ctrl+C退到gdb命令行模式,而nginx被中断暂停在__kernel_vsyscall调用里,输入bt命令能够查看到调用堆栈信息:
(gdb) r
Starting program: /usr/local/nginx/sbin/nginx
src/core/ngx_conf_file.c 1163 : 1000000
===============================
ngx_timer_resolution 1000000
^C
Program received signal SIGINT, Interrupt.
0xb7f29430 in __kernel_vsyscall ()
(gdb) bt
#0 0xb7f29430 in __kernel_vsyscall ()
#1 0xb7e081a8 in epoll_wait () from /lib/tls/i686/cmov/libc.so.6
#2 0x08073ea3 in ngx_epoll_process_events (cycle=0x860ad60, timer=4294967295,
flags=0) at src/event/modules/ngx_epoll_module.c:405
#3 0x080668ee in ngx_process_events_and_timers (cycle=0x860ad60)
at src/event/ngx_event.c:253
#4 0×08071618 in ngx_single_process_cycle (cycle=0x860ad60)
at src/os/unix/ngx_process_cycle.c:283
#5 0x0804a926 in main (argc=1, argv=0xbfd2ae44) at src/core/nginx.c:372
(gdb)
好,来给nginx设置断点。这非常easy,比方这里给函数ngx_process_events_and_timers设置个断点,再输入命令c继续运行nginx:
#4 0×08071618 in ngx_single_process_cycle (cycle=0x860ad60)
at src/os/unix/ngx_process_cycle.c:283
#5 0x0804a926 in main (argc=1, argv=0xbfd2ae44) at src/core/nginx.c:372
(gdb) b ngx_process_events_and_timers
Breakpoint 1 at 0x80667d6: file src/event/ngx_event.c, line 200.
(gdb) c
Continuing.
Breakpoint 1, ngx_process_events_and_timers (cycle=0x860ad60)
at src/event/ngx_event.c:200
200 {
(gdb)
结果我这里gdb立即提示断点中断,这是由于恰好调到这个函数而已。或许在你那并不会立即中断。从而gdb就一直停在Continuing后面。此时我们就要主动去触发事件(或者等待,时间可能非常久,依据你的nginx设置)使得nginx调用这个函数。比方利用浏览器去訪问nginx监听的网站,使得nginx有事件发生。
断点中断后,利用s、c等gdb常规命令进行我们的调试,跟踪。这些无需多说:
200 {
(gdb) s
__cyg_profile_func_enter (this=0x80667d0, call=0×8071618)
at src/core/my_debug.c:65
warning: Source file is more recent than executable.
65 my_debug_print(“Enter\n%p\n%p\n”, call, this);
(gdb)
66 }
(gdb) c
Continuing.
ngx_timer_resolution 1000000
另外一种情况:
当以daemon模式(即配置选项:daemon on;)单进程执行nginx时,上面那种方法不凑效了。原因非常easy,当nginx以daemon模式执行时,这个启动过程中nginx将fork好几次。而那些父进程都早已直接退出了,因此看到的会这样:
[email protected]:/usr/local/nginx/sbin$ sudo gdb ./nginx
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type “show copying”
and “show warranty” for details.
This GDB was configured as “i486-linux-gnu”…
(gdb) r
Starting program: /usr/local/nginx/sbin/nginx
src/core/ngx_conf_file.c 1163 : 1000000
===============================
Program exited normally.
(gdb)
而其实nginx并没有退出,退出的是其父(或祖父)进程,而gdb没有跟着fork跟踪下去:
[email protected]:/usr/local/nginx/sbin$ pidof nginx
7908
gdb考虑了这样的情况,因此也提供了对应的follow-fork-mode选项:
其使用方法为:
set follow-fork-mode [parent|child]
说明:
parent: fork之后继续调试父进程。子进程不受影响。
child: fork之后调试子进程。父进程不受影响。
设置了该选项后,其他调式操作和之前介绍的基本一样,相关示比例如以下:
[email protected]:/usr/local/nginx/sbin$ sudo gdb ./nginx -q
(gdb) r
Starting program: /usr/local/nginx/sbin/nginx
src/core/ngx_conf_file.c 1163 : 1000000
===============================
Program exited normally.
(gdb) shell pidof nginx
9723
(gdb) shell kill -9 9723
(gdb) set follow-fork-mode child
(gdb) b ngx_process_events_and_timers
Breakpoint 1 at 0x80667d6: file src/event/ngx_event.c, line 200.
(gdb) r
Starting program: /usr/local/nginx/sbin/nginx
src/core/ngx_conf_file.c 1163 : 1000000
===============================
[Switching to process 9739]
Breakpoint 1, ngx_process_events_and_timers (cycle=0x8ff0d60)
at src/event/ngx_event.c:200
200 {
(gdb) s
[tcsetpgrp failed in terminal_inferior: No such process]
__cyg_profile_func_enter (this=0x80667d0, call=0×8071618)
at src/core/my_debug.c:65
warning: Source file is more recent than executable.
65 my_debug_print(“Enter\n%p\n%p\n”, call, this);
(gdb) c
Continuing.
上面我们先设置断点,这样执行r命令后nginx将自己主动被中断返回到gdb命令行。假设不这样做,那么将无法返回gdb命令行。即按Ctrl+C没有效果,由于nginx是以daemon模式执行的,输入输出已经被重定向,nginx没有接收到这个Ctrl+C输入。
当然也并非没有其他办法。比方另外一个shell窗体。利用kill命令给nginx进程发送一个SIGINT信号就可以,这和按Ctrl+C是一样的,例如以下(假定9897为nginx进程id号):
[email protected]:~$ sudo kill -2 9897
另外一种情况:
第三种情况就是无论nginx执行模式了,是通用的,即利用gdb的attach、detach命令。
假定当前nginx的相关配置这样:
daemon on;
master_process on;
worker_processes 1;
即daemon模式。多进程执行nginx。
[email protected]:/usr/local/nginx/sbin$ sudo gdb -q
(gdb) shell ./nginx
src/core/ngx_conf_file.c 1163 : 1000000
===============================
(gdb) shell pidof nginx
10246 10245
(gdb)
能够看到一共同拥有两个nginx进程,这里10245为master进程。而10246为worker进程。
我们要调试worker 10246能够例如以下:
(gdb) attach 10246
Attaching to program: /usr/local/nginx/sbin/nginx, process 10246
Reading symbols from /lib/tls/i686/cmov/libcrypt.so.1…done.
Loaded symbols for /lib/tls/i686/cmov/libcrypt.so.1
Reading symbols from /lib/libpcre.so.3…done.
Loaded symbols for /lib/libpcre.so.3
Reading symbols from /usr/lib/libz.so.1…done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/tls/i686/cmov/libc.so.6…done.
Loaded symbols for /lib/tls/i686/cmov/libc.so.6
Reading symbols from /lib/ld-linux.so.2…done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/tls/i686/cmov/libnss_compat.so.2…done.
Loaded symbols for /lib/tls/i686/cmov/libnss_compat.so.2
Reading symbols from /lib/tls/i686/cmov/libnsl.so.1…done.
Loaded symbols for /lib/tls/i686/cmov/libnsl.so.1
Reading symbols from /lib/tls/i686/cmov/libnss_nis.so.2…done.
Loaded symbols for /lib/tls/i686/cmov/libnss_nis.so.2
Reading symbols from /lib/tls/i686/cmov/libnss_files.so.2…done.
Loaded symbols for /lib/tls/i686/cmov/libnss_files.so.2
0xb8016430 in __kernel_vsyscall ()
(gdb) b ngx_process_events_and_timers
Breakpoint 3 at 0x80667d6: file src/event/ngx_event.c, line 200.
(gdb) c
Continuing.
Breakpoint 3, ngx_process_events_and_timers (cycle=0x8f43d68) at src/event/ngx_event.c:200
200 {
(gdb) detach
Detaching from program: /usr/local/nginx/sbin/nginx, process 10246
(gdb)
总的来看,还是第三种方法最顺手了。
[i] http://www.gnu.org/software/gdb/
[ii] nginx是打开了-g选项编译的。
[iii] 意思是说以下的方法仅仅是众多方法之中的一个。