Linux进程间通信与线程间同步详解（全面详细）

引用：http://community.csdn.net/Expert/TopicView3.asp?id=4374496
linux下进程间通信的几种主要手段简介：

1. 管道（Pipe）及有名管道（named pipe）：管道可用于具有亲缘关系进程间的通信，有名管道克服了管道没有名字的限制，因此，除具有管道所具有的功能外，它还允许无亲缘关系进程间的通信；
   2. 信号（Signal）：信号是比较复杂的通信方式，用于通知接受进程有某种事件发生，除了用于进程间通信外，进程还可以发送信号给进程本身；linux除了支持Unix早期信号语义函数sigal外，还支持语义符合Posix.1标准的信号函数 sigaction（实际上，该函数是基于BSD的，BSD为了实现可靠信号机制，又能够统一对外接口，用sigaction函数重新实现了signal 函数）；
   3. 报文（Message）队列（消息队列）：消息队列是消息的链接表，包括Posix消息队列system V消息队列。有足够权限的进程可以向队列中添加消息，被赋予读权限的进程则可以读走队列中的消息。消息队列克服了信号承载信息量少，管道只能承载无格式字节流以及缓冲区大小受限等缺点。
   4. 共享内存：使得多个进程可以访问同一块内存空间，是最快的可用IPC形式。是针对其他通信机制运行效率较低而设计的。往往与其它通信机制，如信号量结合使用，来达到进程间的同步及互斥。
   5. 信号量（semaphore）：主要作为进程间以及同一进程不同线程之间的同步手段。
   6. 套接口（Socket）：更为一般的进程间通信机制，可用于不同机器之间的进程间通信。起初是由Unix系统的BSD分支开发出来的，但现在一般可以移植到其它类Unix系统上：Linux和System V的变种都支持套接字。

线程的最大特点是资源的共享性，但资源共享中的同步问题是多线程编程的难点。linux下提供了多种方式来处理线程同步，最常用的是互斥锁、条件变量和信号量。

1）互斥锁（mutex）

通过锁机制实现线程间的同步。同一时刻只允许一个线程执行一个关键部分的代码。

int pthread_mutex_init(pthread_mutex_t *mutex,const pthread_mutex_attr_t *mutexattr);

int pthread_mutex_lock(pthread_mutex *mutex);

int pthread_mutex_destroy(pthread_mutex *mutex);

int pthread_mutex_unlock(pthread_mutex *

(1)先初始化锁init()或静态赋值pthread_mutex_t mutex=PTHREAD_MUTEX_INITIALIER

attr_t有:

PTHREAD_MUTEX_TIMED_NP:其余线程等待队列

PTHREAD_MUTEX_RECURSIVE_NP:嵌套锁,允许线程多次加锁,不同线程,解锁后重新竞争

PTHREAD_MUTEX_ERRORCHECK_NP:检错,与一同,线程请求已用锁,返回EDEADLK;

PTHREAD_MUTEX_ADAPTIVE_NP:适应锁,解锁后重新竞争

(2)加锁,lock,trylock,lock阻塞等待锁,trylock立即返回EBUSY

(3)解锁,unlock需满足是加锁状态,且由加锁线程解锁

(4)清除锁,destroy(此时锁必需unlock,否则返回EBUSY,//Linux下互斥锁不占用内存资源

示例代码

#include <cstdio>
#include <cstdlib>
#include <unistd.h>
#include <pthread.h>
#include "iostream"
using namespace std;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
int tmp;
void* thread(void *arg)
{
cout << "thread id is " << pthread_self() << endl;
pthread_mutex_lock(&mutex);
tmp = 12;
cout << "Now a is " << tmp << endl;
pthread_mutex_unlock(&mutex);
return NULL;
}
int main()
{
pthread_t id;
cout << "main thread id is " << pthread_self() << endl;
tmp = 3;
cout << "In main func tmp = " << tmp << endl;
if (!pthread_create(&id, NULL, thread, NULL))
{
cout << "Create thread success!" << endl;
}
else
{
cout << "Create thread failed!" << endl;
}
pthread_join(id, NULL);
pthread_mutex_destroy(&mutex);
return 0;
}

编译： g++ -o thread testthread.cpp -lpthread

说明：pthread库不是Linux系统默认的库，连接时需要使用静态库libpthread.a，所以在使用pthread_create()创建线程，以及调用pthread_atfork()函数建立fork处理程序时，需要链接该库。在编译中要加 -lpthread参数。

2）条件变量（cond）

利用线程间共享的全局变量进行同步的一种机制。条件变量上的基本操作有：触发条件(当条件变为 true 时)；等待条件，挂起线程直到其他线程触发条件。

int pthread_cond_init(pthread_cond_t *cond,pthread_condattr_t *cond_attr);

int pthread_cond_wait(pthread_cond_t *cond,pthread_mutex_t *mutex);

int pthread_cond_timewait(pthread_cond_t *cond,pthread_mutex *mutex,const timespec *abstime);

int pthread_cond_destroy(pthread_cond_t *cond);

int pthread_cond_signal(pthread_cond_t *cond);

int pthread_cond_broadcast(pthread_cond_t *cond); //解除所有线程的阻塞

(1)初始化.init()或者pthread_cond_t cond=PTHREAD_COND_INITIALIER（前者为动态初始化，后者为静态初始化）;属性置为NULL

(2)等待条件成立.pthread_wait,pthread_timewait.wait()释放锁,并阻塞等待条件变量为真，timewait()设置等待时间,仍未signal,返回ETIMEOUT(加锁保证只有一个线程wait)

(3)激活条件变量:pthread_cond_signal,pthread_cond_broadcast(激活所有等待线程)

(4)清除条件变量:destroy;无线程等待,否则返回EBUSY

对于

int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex);

int pthread_cond_timedwait(pthread_cond_t *cond, pthread_mutex_t *mutex, const struct timespec *abstime);

一定要在mutex的锁定区域内使用。

如果要正确的使用pthread_mutex_lock与pthread_mutex_unlock，请参考

pthread_cleanup_push和pthread_cleanup_pop宏，它能够在线程被cancel的时候正确的释放mutex！

另外，posix1标准说，pthread_cond_signal与pthread_cond_broadcast无需考虑调用线程是否是mutex的拥有者，也就是说，可以在lock与unlock以外的区域调用。如果我们对调用行为不关心，那么请在lock区域之外调用吧。

说明：

(1)pthread_cond_wait 自动解锁互斥量(如同执行了pthread_unlock_mutex)，并等待条件变量触发。这时线程挂起，不占用CPU时间，直到条件变量被触发（变量为ture）。在调用 pthread_cond_wait之前，应用程序必须加锁互斥量。pthread_cond_wait函数返回前，自动重新对互斥量加锁(如同执行了pthread_lock_mutex)。

(2)互斥量的解锁和在条件变量上挂起都是自动进行的。因此，在条件变量被触发前，如果所有的线程都要对互斥量加锁，这种机制可保证在线程加锁互斥量和进入等待条件变量期间，条件变量不被触发。条件变量要和互斥量相联结，以避免出现条件竞争——个线程预备等待一个条件变量，当它在真正进入等待之前，另一个线程恰好触发了该条件（条件满足信号有可能在测试条件和调用pthread_cond_wait函数（block）之间被发出，从而造成无限制的等待）。

(3)pthread_cond_timedwait 和 pthread_cond_wait 一样，自动解锁互斥量及等待条件变量，但它还限定了等待时间。如果在abstime指定的时间内cond未触发，互斥量mutex被重新加锁，且pthread_cond_timedwait返回错误 ETIMEDOUT。abstime 参数指定一个绝对时间，时间原点与 time 和 gettimeofday 相同：abstime = 0 表示 1970年1月1日00:00:00 GMT。

(4)pthread_cond_destroy 销毁一个条件变量，释放它拥有的资源。进入 pthread_cond_destroy 之前，必须没有在该条件变量上等待的线程。

(5)条件变量函数不是异步信号安全的，不应当在信号处理程序中进行调用。特别要注意，如果在信号处理程序中调用 pthread_cond_signal 或pthread_cond_boardcast 函数，可能导致调用线程死锁。

示例程序1

#include <stdio.h>
#include <pthread.h>
#include "stdlib.h"
#include "unistd.h"
pthread_mutex_t mutex;
pthread_cond_t cond;
void hander(void *arg)
{
free(arg);
(void)pthread_mutex_unlock(&mutex);
}
void *thread1(void *arg)
{
pthread_cleanup_push(hander, &mutex);
while(1)
{
printf("thread1 is running\n");
pthread_mutex_lock(&mutex);
pthread_cond_wait(&cond,&mutex);
printf("thread1 applied the condition\n");
pthread_mutex_unlock(&mutex);
sleep(4);
}
pthread_cleanup_pop(0);
}
void *thread2(void *arg)
{
while(1)
{
printf("thread2 is running\n");
pthread_mutex_lock(&mutex);
pthread_cond_wait(&cond,&mutex);
printf("thread2 applied the condition\n");
pthread_mutex_unlock(&mutex);
sleep(1);
}
}
int main()
{
pthread_t thid1,thid2;
printf("condition variable study!\n");
pthread_mutex_init(&mutex,NULL);
pthread_cond_init(&cond,NULL);
pthread_create(&thid1,NULL,thread1,NULL);
pthread_create(&thid2,NULL,thread2,NULL);
sleep(1);
do
{
pthread_cond_signal(&cond);
}while(1);
sleep(20);
pthread_exit(0);
return 0;
}

示例程序2：

#include <pthread.h>
#include <unistd.h>
#include "stdio.h"
#include "stdlib.h"
static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
struct node
{
int n_number;
struct node *n_next;
} *head = NULL;
/*[thread_func]*/
static void cleanup_handler(void *arg)
{
printf("Cleanup handler of second thread./n");
free(arg);
(void)pthread_mutex_unlock(&mtx);
}
static void *thread_func(void *arg)
{
struct node *p = NULL;
pthread_cleanup_push(cleanup_handler, p);
while (1)
{
//这个mutex主要是用来保证pthread_cond_wait的并发性
pthread_mutex_lock(&mtx);
while (head == NULL)
{
//这个while要特别说明一下，单个pthread_cond_wait功能很完善，为何
//这里要有一个while (head == NULL)呢？因为pthread_cond_wait里的线
//程可能会被意外唤醒，如果这个时候head != NULL，则不是我们想要的情况。
//这个时候，应该让线程继续进入pthread_cond_wait
// pthread_cond_wait会先解除之前的pthread_mutex_lock锁定的mtx，
//然后阻塞在等待对列里休眠，直到再次被唤醒（大多数情况下是等待的条件成立
//而被唤醒，唤醒后，该进程会先锁定先pthread_mutex_lock(&mtx);，再读取资源
//用这个流程是比较清楚的/*block-->unlock-->wait() return-->lock*/
pthread_cond_wait(&cond, &mtx);
p = head;
head = head->n_next;
printf("Got %d from front of queue/n", p->n_number);
free(p);
}
pthread_mutex_unlock(&mtx); //临界区数据操作完毕，释放互斥锁
}
pthread_cleanup_pop(0);
return 0;
}
int main(void)
{
pthread_t tid;
int i;
struct node *p;
//子线程会一直等待资源，类似生产者和消费者，但是这里的消费者可以是多个消费者，而
//不仅仅支持普通的单个消费者，这个模型虽然简单，但是很强大
pthread_create(&tid, NULL, thread_func, NULL);
sleep(1);
for (i = 0; i < 10; i++)
{
p = (struct node*)malloc(sizeof(struct node));
p->n_number = i;
pthread_mutex_lock(&mtx); //需要操作head这个临界资源，先加锁，
p->n_next = head;
head = p;
pthread_cond_signal(&cond);
pthread_mutex_unlock(&mtx); //解锁
sleep(1);
}
printf("thread 1 wanna end the line.So cancel thread 2./n");
//关于pthread_cancel，有一点额外的说明，它是从外部终止子线程，子线程会在最近的取消点，退出
//线程，而在我们的代码里，最近的取消点肯定就是pthread_cond_wait()了。
pthread_cancel(tid);
pthread_join(tid, NULL);
printf("All done -- exiting/n");
return 0;
}

3）信号量

如同进程一样，线程也可以通过信号量来实现通信，虽然是轻量级的。

信号量函数的名字都以"sem_"打头。线程使用的基本信号量函数有四个。

#include <semaphore.h>

int sem_init (sem_t *sem , int pshared, unsigned int value);

这是对由sem指定的信号量进行初始化，设置好它的共享选项（linux 只支持为0，即表示它是当前进程的局部信号量），然后给它一个初始值VALUE。

两个原子操作函数：

int sem_wait(sem_t *sem);

int sem_post(sem_t *sem);

这两个函数都要用一个由sem_init调用初始化的信号量对象的指针做参数。

sem_post：给信号量的值加1；

sem_wait:给信号量减1；对一个值为0的信号量调用sem_wait,这个函数将会等待直到有其它线程使它不再是0为止。

int sem_destroy(sem_t *sem);

这个函数的作用是再我们用完信号量后都它进行清理。归还自己占有的一切资源。

示例代码：

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
#include <semaphore.h>
#include <errno.h>
#define return_if_fail(p) if((p) == 0){printf ("[%s]:func error!/n", __func__);return;}
typedef struct _PrivInfo
{
sem_t s1;
sem_t s2;
time_t end_time;
}PrivInfo;
static void info_init (PrivInfo* thiz);
static void info_destroy (PrivInfo* thiz);
static void* pthread_func_1 (PrivInfo* thiz);
static void* pthread_func_2 (PrivInfo* thiz);
int main (int argc, char** argv)
{
pthread_t pt_1 = 0;
pthread_t pt_2 = 0;
int ret = 0;
PrivInfo* thiz = NULL;
thiz = (PrivInfo* )malloc (sizeof (PrivInfo));
if (thiz == NULL)
{
printf ("[%s]: Failed to malloc priv./n");
return -1;
}
info_init (thiz);
ret = pthread_create (&pt_1, NULL, (void*)pthread_func_1, thiz);
if (ret != 0)
{
perror ("pthread_1_create:");
}
ret = pthread_create (&pt_2, NULL, (void*)pthread_func_2, thiz);
if (ret != 0)
{
perror ("pthread_2_create:");
}
pthread_join (pt_1, NULL);
pthread_join (pt_2, NULL);
info_destroy (thiz);
return 0;
}
static void info_init (PrivInfo* thiz)
{
return_if_fail (thiz != NULL);
thiz->end_time = time(NULL) + 10;
sem_init (&thiz->s1, 0, 1);
sem_init (&thiz->s2, 0, 0);
return;
}
static void info_destroy (PrivInfo* thiz)
{
return_if_fail (thiz != NULL);
sem_destroy (&thiz->s1);
sem_destroy (&thiz->s2);
free (thiz);
thiz = NULL;
return;
}
static void* pthread_func_1 (PrivInfo* thiz)
{
return_if_fail (thiz != NULL);
while (time(NULL) < thiz->end_time)
{
sem_wait (&thiz->s2);
printf ("pthread1: pthread1 get the lock./n");
sem_post (&thiz->s1);
printf ("pthread1: pthread1 unlock/n");
sleep (1);
}
return;
}
static void* pthread_func_2 (PrivInfo* thiz)
{
return_if_fail (thiz != NULL);
while (time (NULL) < thiz->end_time)
{
sem_wait (&thiz->s1);
printf ("pthread2: pthread2 get the unlock./n");
sem_post (&thiz->s2);
printf ("pthread2: pthread2 unlock./n");
sleep (1);
}
return;
}

通过执行结果后，可以看出，会先执行线程二的函数，然后再执行线程一的函数。它们两就实现了同步。在上大学的时候，虽然对这些概念知道，可都没有实践过，所以有时候时间一久就会模糊甚至忘记，到了工作如果还保持这么一种状态，那就太可怕了。虽然现在外面的技术在不断的变化更新，可是不管怎么变，其核心技术还是依旧的，所以我们必须要打好自己的基础，再学习其他新的知识，那时候再学新的知识也会觉得比较简单的。信号量代码摘自http://blog.csdn.net/wtz1985/article/details/3835781

参考：

【1】 http://www.cnblogs.com/feisky/archive/2009/11/12/1601824.html

【2】 http://www.cnblogs.com/mydomain/archive/2011/07/10/2102147.html

【3】线程函数介绍

http://www.unix.org/version2/whatsnew/threadsref.html

【4】 http://www.yolinux.com/TUTORIALS/LinuxTutorialPosixThreads.html

【5】线程常用函数简介

http://www.rosoo.net/a/201004/8954.html

【6】条件变量

http://blog.csdn.net/hiflower/article/details/2195350

【7】条件变量函数说明

http://blog.csdn.net/hairetz/article/details/4535920