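1. Overview of fork()
On Linux, a program creates a child process by calling fork(). Specifically:
- Calling fork() creates a copy of the current process;
- The current process is called the parent process, and the newly created one is called the child process;
- Both parent and child continue executing from the point where fork() returns.
2. Distinguishing parent from child
2.1 The fork() function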
~$ man fork | more
FORK(2)                    Linux Programmer's Manual                   FORK(2)
NAME
       fork - create a child process
SYNOPSIS
       #include <unistd.h>
       pid_t fork(void);
DESCRIPTION
       fork() creates a new process by duplicating the calling process. The
       new process, referred to as the child, is an exact duplicate of the
       calling process, referred to as the parent, except for the following
       points:
       *  The child has its own unique process ID, and this PID does not match
          the ID of any existing process group (setpgid(2)).
       *  The child's parent process ID is the same as the parent's process
          ID.
       *  The child does not inherit its parent's memory locks (mlock(2),
          mlockall(2)).
       *  Process resource utilizations (getrusage(2)) and CPU time counters
          (times(2)) are reset to zero in the child.
       *  The child's set of pending signals is initially empty
          (sigpending(2)).
       *  The child does not inherit semaphore adjustments from its parent
          (semop(2)).
       *  The child does not inherit record locks from its parent (fcntl(2)).
       *  The child does not inherit timers from its parent (setitimer(2),
          alarm(2), timer_create(2)).
       *  The child does not inherit outstanding asynchronous I/O operations
          from its parent (aio_read(3), aio_write(3)), nor does it inherit any
          asynchronous I/O contexts from its parent (see io_setup(2)).
       The process attributes in the preceding list are all specified in
       POSIX.1-2001. The parent and child also differ with respect to the
       following Linux-specific process attributes:
       *  The child does not inherit directory change notifications (dnotify)
          from its parent (see the description of F_NOTIFY in fcntl(2)).
       *  The prctl(2) PR_SET_PDEATHSIG setting is reset so that the child
          does not receive a signal when its parent terminates.
       *  Memory mappings that have been marked with the madvise(2)
          MADV_DONTFORK flag are not inherited across a fork().
       *  The termination signal of the child is always SIGCHLD (see
          clone(2)).
       Note the following further points:
       *  The child process is created with a single thread -- the one that
          called fork(). The entire virtual address space of the parent is
          replicated in the child, including the states of mutexes, condition
          variables, and other pthreads objects; the use of pthread_atfork(3)
          may be helpful for dealing with problems that this can cause.
       *  The child inherits copies of the parent's set of open file
          descriptors. Each file descriptor in the child refers to the same
          open file description (see open(2)) as the corresponding file
          descriptor in the parent. This means that the two descriptors share
          open file status flags, current file offset, and signal-driven I/O
          attributes (see the description of F_SETOWN and F_SETSIG in
          fcntl(2)).
       *  The child inherits copies of the parent's set of open message queue
          descriptors (see mq_overview(7)). Each descriptor in the child
          refers to the same open message queue description as the
          corresponding descriptor in the parent. This means that the two
          descriptors share the same flags (mq_flags).
       *  The child inherits copies of the parent's set of open directory
          streams (see opendir(3)). POSIX.1-2001 says that the corresponding
          directory streams in the parent and child may share the directory
          stream positioning; on Linux/glibc they do not.
RETURN VALUE
       On success, the PID of the child process is returned in the parent, and
       0 is returned in the child. On failure, -1 is returned in the parent,
       no child process is created, and errno is set appropriately.
ERRORS
       EAGAIN fork() cannot allocate sufficient memory to copy the parent's
              page tables and allocate a task structure for the child.
       EAGAIN It was not possible to create a new process because the caller's
              RLIMIT_NPROC resource limit was encountered. To exceed this
              limit, the process must have either the CAP_SYS_ADMIN or the
              CAP_SYS_RESOURCE capability.
       ENOMEM fork() failed to allocate the necessary kernel structures
              because memory is tight.
CONFORMING TO
       SVr4, 4.3BSD, POSIX.1-2001.
NOTES
       Under Linux, fork() is implemented using copy-on-write pages, so the
       only penalty that it incurs is the time and memory required to
       duplicate the parent's page tables, and to create a unique task
       structure for the child.
       Since version 2.3.3, rather than invoking the kernel's fork() system
       call, the glibc fork() wrapper that is provided as part of the NPTL
       threading implementation invokes clone(2) with flags that provide the
       same effect as the traditional system call. The glibc wrapper invokes
       any fork handlers that have been established using pthread_atfork(3).
EXAMPLE
       See pipe(2) and wait(2).
SEE ALSO
       clone(2), execve(2), setrlimit(2), unshare(2), vfork(2), wait(2),
       daemon(3), capabilities(7), credentials(7)
COLOPHON
       This page is part of release 3.35 of the Linux man-pages project. A
       description of the project, and information about reporting bugs, can
       be found at http://man7.org/linux/man-pages/.
Linux                             2009-04-27                           FORK(2)
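2.2 How to tell them apart
After fork() returns, the program has to tell the parent and the child apart so that each one follows its own code path.
The return value of fork() does this: 0 means the current process is the child; a value greater than 0 (the child's PID) means it is the parent; -1 means the fork() call failed and no child was created.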
3. Example
The following code is Listing 3.3 from Advanced Linux Programming (ALP):
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main()
{
    pid_t child_pid;

    printf("the main program process ID is %d\n", (int)getpid());
    child_pid = fork();
    if (child_pid != 0) {
        printf("this is the parent process, with id %d\n", (int)getpid());
        printf("the child's process ID is %d\n", (int)child_pid);
    } else {
        printf("this is the child process, with id %d\n", (int)getpid());
    }
    return 0;
}
Output:
~/examples/cpp/fork$ gcc fork_list3_3.c
~/examples/cpp/fork$ ./a.out
the main program process ID is 3161
this is the parent process, with id 3161
the child's process ID is 3162
this is the child process, with id 3162
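4. Waiting for processes
4.1 Which process exits first
After fork(), which process finishes first, the parent or the child?
The answer depends on the code each process runs; the child does not necessarily finish before the parent.
To illustrate this, the following example has the parent sleep for 10 seconds and the child for 20 seconds, while the ps command is used to watch the list of existing processes.
Two sleep() calls are added to ALP Listing 3.3: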
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main()
{
    pid_t child_pid;

    printf("the main program process ID is %d\n", (int)getpid());
    child_pid = fork();
    if (child_pid != 0) {
        printf("this is the parent process, with id %d\n", (int)getpid());
        printf("the child's process ID is %d\n", (int)child_pid);
        sleep(10);
    } else {
        printf("this is the child process, with id %d\n", (int)getpid());
        sleep(20);
    }
    return 0;
}
Open two terminals: run the program above in one, and run ps repeatedly in the other.
Output in the first terminal:
~/examples/cpp/fork$ ./a.out
the main program process ID is 3392
this is the parent process, with id 3392
the child's process ID is 3393
this is the child process, with id 3393
Observations in the second terminal:
~$ ps -a
  PID TTY          TIME CMD
 3363 pts/3    00:00:00 man
 3374 pts/3    00:00:00 pager
 3392 pts/0    00:00:00 a.out
 3393 pts/0    00:00:00 a.out
 3394 pts/1    00:00:00 ps
~$ ps -a
  PID TTY          TIME CMD
 3363 pts/3    00:00:00 man
 3374 pts/3    00:00:00 pager
 3393 pts/0    00:00:00 a.out
 3395 pts/1    00:00:00 ps
~$ ps -a
  PID TTY          TIME CMD
 3363 pts/3    00:00:00 man
 3374 pts/3    00:00:00 pager
 3398 pts/1    00:00:00 ps
As the ps output shows, the parent (PID 3392) terminates first, and the child (PID 3393) terminates afterwards.
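4.2 The parent waits for the child
In real projects the parent often needs to wait for the child to finish before deciding how to proceed. Beyond that, it needs to know how the child exited: normally, abnormally, and so on.
The wait family of system calls serves this purpose.
The wait family has four members:
- wait(): the parent waits for any one of its children to terminate (by exit or abnormally);
- waitpid(): the parent waits for a specific child to terminate;
- wait3() and wait4(): also wait for a child, and additionally return information about it, such as its resource usage.
waitpid() is the most commonly used.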
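4.3 How a child process can terminate
A child can terminate in the following ways:
- The child calls exit() or returns from main();
- The child terminates abnormally, for example because of a division by zero;
- Other abnormal termination, such as being killed by a signal sent from another process.
In each case the parent can call a wait function to find out how the child terminated. Code for each case is given below; the general calling pattern is: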
int status;
waitpid(child_pid, &status, 0);
/* see "man waitpid" for detail */
if (WIFEXITED(status)) {
    printf("exited, status=%d\n", WEXITSTATUS(status));
} else if (WIFSIGNALED(status)) {
    printf("killed by signal %d\n", WTERMSIG(status));
} else if (WIFSTOPPED(status)) {
    /* only reported when waitpid() is called with the WUNTRACED option */
    printf("stopped by signal %d\n", WSTOPSIG(status));
} else {
    printf("unknown\n");
}
4.3.1 exit() or return
Calling exit() or returning from main() is a normal exit, and it is the most common way for a child to terminate. It corresponds to:
if (WIFEXITED(status)) {
    printf("exited, status=%d\n", WEXITSTATUS(status));
}
An example:
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/wait.h>
#include <stdlib.h>

void child_foo(int exit_code)
{
    printf("child_foo()\n");
    exit(exit_code); /* returning exit_code from main() has the same effect */
}

int get_exit_code(int argc, const char* argv[])
{
    if (argc != 2) {
        printf("Usage: %s exit_code\n", argv[0]);
        exit(-1);
    }
    /* ignore other exceptions */
    return atoi(argv[1]);
}

int main(int argc, const char* argv[])
{
    pid_t child_pid;
    int exit_code = get_exit_code(argc, argv);

    printf("the main program process ID is %d\n", (int)getpid());
    child_pid = fork();
    /* child process: child_foo() never returns */
    if (child_pid == 0) {
        child_foo(exit_code);
    }
    /* parent process */
    int status;
    waitpid(child_pid, &status, 0);
    /* see "man waitpid" for detail */
    if (WIFEXITED(status)) {
        printf("exited, status=%d\n", WEXITSTATUS(status));
    } else if (WIFSIGNALED(status)) {
        printf("killed by signal %d\n", WTERMSIG(status));
    } else if (WIFSTOPPED(status)) {
        printf("stopped by signal %d\n", WSTOPSIG(status));
    } else {
        printf("unknown\n");
    }
    return 0;
}
Output:
~/examples/cpp/wait$ ./a.out 1
the main program process ID is 2635
child_foo()
exited, status=1
~/examples/cpp/wait$ ./a.out 12
the main program process ID is 2637
child_foo()
exited, status=12
With the example above in mind, here is a passage from ALP section 3.4, Process Termination:
Normally, a process terminates in one of two ways. Either the executing program calls the exit function, or the program's main function returns. Each process has an exit code: a number that the process returns to its parent. The exit code is the argument passed to the exit function, or the value returned from main.
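4.3.2 Abnormal termination and signals
A process can also terminate because of a fault, such as a division by zero (SIGFPE), a segmentation fault (SIGSEGV), or the SIGABRT raised by abort().
The code above is modified as follows: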
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/wait.h>
#include <stdlib.h>

void test_SIGFPE()
{
    int i;
    /* the last iteration (i == 0) divides by zero and raises SIGFPE */
    for (i = 10; i >= 0; i--) {
        printf("%d %d\n", i, 100 / i);
    }
}

void test_SIGABRT()
{
    printf("the child process will abort.\n");
    abort();
    printf("unreachable statement.\n");
}

int get_abnormal_type(int argc, const char* argv[])
{
    int type;
    if (argc != 2 || (type = atoi(argv[1]), type != 1 && type != 2)) {
        printf("Usage: %s [1|2]\n", argv[0]);
        printf(" 1: SIGFPE, 2: SIGABRT\n");
        exit(-1);
    }
    return type;
}

int main(int argc, const char* argv[])
{
    pid_t child_pid;
    int type = get_abnormal_type(argc, argv);

    child_pid = fork();
    /* child process */
    if (child_pid == 0) {
        type == 1 ? test_SIGFPE() : test_SIGABRT();
        return 0; /* unreachable */
    }
    /* parent process */
    int status;
    waitpid(child_pid, &status, 0);
    /* see "man waitpid" for detail */
    if (WIFEXITED(status)) {
        printf("exited, status=%d\n", WEXITSTATUS(status));
    } else if (WIFSIGNALED(status)) {
        printf("killed by signal %d\n", WTERMSIG(status));
    } else if (WIFSTOPPED(status)) {
        printf("stopped by signal %d\n", WSTOPSIG(status));
    } else {
        printf("unknown\n");
    }
    return 0;
}
Output:
~/examples/cpp/wait$ ./a.out 1
10 10
9 11
8 12
7 14
6 16
5 20
4 25
3 33
2 50
1 100
killed by signal 8
~/examples/cpp/wait$ ./a.out 2
the child process will abort.
killed by signal 6
The signals are defined in /usr/include/asm-generic/signal.h, as follows:
~$ cat /usr/include/asm-generic/signal.h
#ifndef __ASM_GENERIC_SIGNAL_H
#define __ASM_GENERIC_SIGNAL_H
#include <linux/types.h>
#define _NSIG           64
#define _NSIG_BPW       __BITS_PER_LONG
#define _NSIG_WORDS     (_NSIG / _NSIG_BPW)
#define SIGHUP           1
#define SIGINT           2
#define SIGQUIT          3
#define SIGILL           4
#define SIGTRAP          5
#define SIGABRT          6
#define SIGIOT           6
#define SIGBUS           7
#define SIGFPE           8
#define SIGKILL          9
#define SIGUSR1         10
#define SIGSEGV         11
#define SIGUSR2         12
#define SIGPIPE         13
#define SIGALRM         14
#define SIGTERM         15
#define SIGSTKFLT       16
#define SIGCHLD         17
#define SIGCONT         18
#define SIGSTOP         19
#define SIGTSTP         20
#define SIGTTIN         21
#define SIGTTOU         22
#define SIGURG          23
#define SIGXCPU         24
#define SIGXFSZ         25
#define SIGVTALRM       26
#define SIGPROF         27
#define SIGWINCH        28
#define SIGIO           29
#define SIGPOLL         SIGIO
/*
#define SIGLOST         29
*/
#define SIGPWR          30
#define SIGSYS          31
#define SIGUNUSED       31
/* These should not be considered constants from userland. */
#define SIGRTMIN        32
#ifndef SIGRTMAX
#define SIGRTMAX        _NSIG
#endif
Provoking a real fault is often awkward, so for testing the parent can simply call kill() and pass the signal number that corresponds to the abnormal termination, as follows:
#include <stdio.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/wait.h>
#include <stdlib.h>

/*
 * Usage: ./a.out signal_code
 * e.g.   ./a.out 5
 */
int main(int argc, const char* argv[])
{
    pid_t child_pid;

    if (argc != 2) {
        printf("Usage: %s signal_code\n", argv[0]);
        return 1;
    }
    child_pid = fork();
    if (child_pid == 0) {
        printf("child process is sleeping.\n");
        sleep(100);
        return 0;
    }
    sleep(2); /* Give a chance for child process to print a line. */
    kill(child_pid, atoi(argv[1]));
    int status;
    waitpid(child_pid, &status, 0);
    if (WIFEXITED(status)) {
        printf("exited, status=%d\n", WEXITSTATUS(status));
    } else if (WIFSIGNALED(status)) {
        printf("killed by signal %d\n", WTERMSIG(status));
    } else if (WIFSTOPPED(status)) {
        printf("stopped by signal %d\n", WSTOPSIG(status));
    } else {
        printf("unknown\n");
    }
    return 0;
}
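5. Zombie and orphan processes
The question of whether the parent or the child exits first was touched on above; this section looks at it more closely.
5.1 Process exit information
As discussed above, wait(), waitpid() and the related functions retrieve information about how a child exited, such as whether it exited normally or abnormally. Put differently, when a child terminates, the Linux kernel keeps some information about each terminated child, including its process ID and its termination status (exit code or signal info). The parent obtains this information when it calls wait() or waitpid().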
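5.2 Zombie processes
A zombie process is also called a defunct process. Informally, it is a child that has already terminated while its parent is still running and has not yet called wait() for it. A process that is executing is considered alive; once it has finished or been terminated it is dead. A dead process that has not yet been waited for is therefore called a zombie process or defunct process.
A passage from Wikipedia (http://en.wikipedia.org/wiki/Zombie_process):
The term zombie process derives from the common definition of zombie, an undead person. In the term's metaphor, the child process has "died" but has not yet been "reaped".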
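5.3 Orphan processes
The counterpart of a zombie process is an orphan process: the parent has finished and terminated while the child is still running. The child's parent no longer exists, hence the name.
This assumes, of course, that the parent terminated without calling wait() for the child.
When the parent terminates, the orphan is adopted (re-parented) by the init process.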
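5.4 Example analysis
The following example illustrates the concepts above. The run times of the parent and the child are controlled by the length of their sleep() calls, and the two sleep durations are passed as command-line arguments (the parent's first, then the child's) to keep the example short. For the same reason, most error handling is omitted.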
5.4.1 Zombie process
~/examples/cpp/zombile_orphan_process$ ./a.out 20 10
Parent process's pid: 2944
this is the parent process, with id 2944
the child's process ID is 2945
this is the child process, with id 2945
child process will be terminated.
parent process will be terminated.
The states of the two processes as shown by ps (columns: PID, PPID, STAT, CMD):
1) Parent and child are both running:
2714 2705 Ss bash
2821 1 Sl gedit /home/flying-bird/examples/cpp/zombile_orphan_process/zombie_orphan_process.c
2846 2 S [kworker/0:3]
2870 2705 Ss bash
2943 2 S [kworker/0:2]
2944 2714 S+ ./a.out 20 10
2945 2944 S+ ./a.out 20 10
2947 2870 R+ ps -e -o pid,ppid,stat,cmd
2) The child has terminated (zombie) and the parent is still running:
2714 2705 Ss bash
2821 1 Sl gedit /home/flying-bird/examples/cpp/zombile_orphan_process/zombie_orphan_process.c
2846 2 S [kworker/0:3]
2870 2705 Ss bash
2943 2 S [kworker/0:2]
2944 2714 S+ ./a.out 20 10
2945 2944 Z+ [a.out] <defunct>
2948 2870 R+ ps -e -o pid,ppid,stat,cmd
3) Parent and child have both terminated:
2714 2705 Ss+ bash
2821 1 Sl gedit /home/flying-bird/examples/cpp/zombile_orphan_process/zombie_orphan_process.c
2846 2 S [kworker/0:3]
2870 2705 Ss bash
2943 2 S [kworker/0:2]
2949 2870 R+ ps -e -o pid,ppid,stat,cmd
5.4.2 Meaning of the process state codes shown by ps
From man ps:
PROCESS STATE CODES
       Here are the different values that the s, stat and state output
       specifiers (header "STAT" or "S") will display to describe the state of
       a process:
       D    uninterruptible sleep (usually IO)
       R    running or runnable (on run queue)
       S    interruptible sleep (waiting for an event to complete)
       T    stopped, either by a job control signal or because it is being
            traced.
       W    paging (not valid since the 2.6.xx kernel)
       X    dead (should never be seen)
       Z    defunct ("zombie") process, terminated but not reaped by its
            parent.
       For BSD formats and when the stat keyword is used, additional
       characters may be displayed:
       <    high-priority (not nice to other users)
       N    low-priority (nice to other users)
       L    has pages locked into memory (for real-time and custom IO)
       s    is a session leader
       l    is multi-threaded (using CLONE_THREAD, like NPTL pthreads do)
       +    is in the foreground process group.
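5.4.3 What happens to a zombie when its parent terminates
The example and the ps output above show that once the parent terminates, the zombie disappears as well. The reason is that when a process terminates, all of its children are inherited by the init process, and init automatically reaps the zombies it inherits.
That is why, in the example above, the child vanished from the process list as soon as the parent finished.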
5.4.4 Orphan process
Next, an orphan process is constructed.
~/examples/cpp/zombile_orphan_process$ ./a.out 10 20
Parent process's pid: 3684
this is the parent process, with id 3684
the child's process ID is 3685
this is the child process, with id 3685
parent process will be terminated.
~/examples/cpp/zombile_orphan_process$ child process will be terminated.
The ps observations follow.
1) Parent and child are both running:
2714 2705 Ss bash
2821 1 Rl gedit /home/flying-bird/examples/cpp/zombile_orphan_process/zombie_orphan_process.c
2870 2705 Ss bash
3038 2 S [kworker/1:1]
3079 2 S [kworker/0:3]
3646 2 S [kworker/0:0]
3666 2 S [kworker/0:2]
3684 2714 S+ ./a.out 10 20
3685 3684 S+ ./a.out 10 20
3688 2870 R+ ps -e -o pid,ppid,stat,cmd
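2) The parent has finished and the child is still running (it has become an orphan and been adopted by init, so its PPID is now 1):
2714 2705 Ss+ bash
2821 1 Rl gedit /home/flying-bird/examples/cpp/zombile_orphan_process/zombie_orphan_process.c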
2870 2705 Ss bash
3038 2 S [kworker/1:1]
3079 2 S [kworker/0:3]
3646 2 S [kworker/0:0]
3666 2 S [kworker/0:2]
3685 1 S ./a.out 10 20
3689 2870 R+ ps -e -o pid,ppid,stat,cmd
3) The child has finished:
2714 2705 Ss+ bash
2821 1 Sl gedit /home/flying-bird/examples/cpp/zombile_orphan_process/zombie_orphan_process.c
2870 2705 Ss bash
3038 2 S [kworker/1:1]
3079 2 S [kworker/0:3]
3646 2 S [kworker/0:0]
3666 2 S [kworker/0:2]
3690 2870 R+ ps -e -o pid,ppid,stat,cmd
References
ALP (Advanced Linux Programming): http://download.csdn.net/download/shmilyy/720746