An analysis of the C++11 thread code

This article analyzes the LLVM libc++ implementation: http://libcxx.llvm.org/

class thread

The thread class directly wraps a pthread_t, which on Linux (with glibc) is actually an unsigned long int.

class thread
{
    pthread_t __t_;   // native handle; 0 means "no thread"

public:
    // ... (other members omitted)
    id get_id() const _NOEXCEPT {return __t_;}
};

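A std::unique_ptr is used to wrap the user-supplied thread function, so that the callable is freed even if starting the thread fails.

The thread itself is created with pthread_create, which is handed the __thread_proxy trampoline: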

template <class _Fp>
void*
__thread_proxy(void* __vp)
{
    __thread_local_data().reset(new __thread_struct);
    std::unique_ptr<_Fp> __p(static_cast<_Fp*>(__vp));
    (*__p)();
    return nullptr;
}

template <class _Fp>
thread::thread(_Fp __f)
{
    std::unique_ptr<_Fp> __p(new _Fp(__f));
    int __ec = pthread_create(&__t_, 0, &__thread_proxy<_Fp>, __p.get());
    if (__ec == 0)
        __p.release();
    else
        __throw_system_error(__ec, "thread constructor failed");
}
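The ownership hand-off above can be reproduced outside libc++. The following is a minimal sketch, assuming a POSIX system; `thread_proxy` and `start_thread` are hypothetical names chosen for illustration, not libc++ APIs. The callable is copied to the heap, passed through the `void*` argument of pthread_create, and released from the creating side only once the new thread is known to own it.

```cpp
#include <cassert>
#include <memory>
#include <pthread.h>
#include <system_error>
#include <utility>

// Trampoline with the signature pthread_create expects. It takes ownership
// of the heap-allocated callable and destroys it after the call returns.
template <class Fn>
void* thread_proxy(void* vp)
{
    std::unique_ptr<Fn> p(static_cast<Fn*>(vp));
    (*p)();
    return nullptr;
}

// Hypothetical helper (not part of libc++): run `f` on a new pthread.
template <class Fn>
pthread_t start_thread(Fn f)
{
    std::unique_ptr<Fn> p(new Fn(std::move(f)));
    pthread_t t;
    int ec = pthread_create(&t, nullptr, &thread_proxy<Fn>, p.get());
    if (ec != 0)  // failure: p still owns the callable and frees it
        throw std::system_error(ec, std::system_category(), "pthread_create failed");
    p.release();  // success: the new thread now owns the callable
    return t;
}
```

The caller remains responsible for joining or detaching the returned pthread_t, which is exactly the bookkeeping that std::thread adds on top.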

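thread::joinable(), thread::join(), thread::detach()

Next, let's look at thread::joinable(), thread::join(), and thread::detach().

These also call the corresponding POSIX functions. After join() is called, __t_ is set to 0, so a later call to joinable() returns false. Accesses to __t_ are not synchronized with any memory barrier, which looks like it could be a problem. In practice the burden is on the caller: a single std::thread object is not meant to be used from several threads at once without external synchronization.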

bool joinable() const {return __t_ != 0;}
void
thread::join()
{
    int ec = pthread_join(__t_, 0);
    __t_ = 0;
}

void
thread::detach()
{
    int ec = EINVAL;
    if (__t_ != 0)
    {
        ec = pthread_detach(__t_);
        if (ec == 0)
            __t_ = 0;
    }
    if (ec)
        throw system_error(error_code(ec, system_category()), "thread::detach failed");
}
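The state transitions described above can be observed from user code. This is a small sketch using only the standard std::thread interface; the `JoinableStates` struct and `observe_joinable` function are hypothetical names introduced for the illustration.

```cpp
#include <cassert>
#include <system_error>
#include <thread>

// Hypothetical record of what joinable() reports at each step.
struct JoinableStates
{
    bool before_join;
    bool after_join;
    bool after_detach;
    bool second_detach_throws;
};

JoinableStates observe_joinable()
{
    JoinableStates s{};

    std::thread a([] {});
    s.before_join = a.joinable();   // the handle refers to a thread
    a.join();
    s.after_join = a.joinable();    // the handle was reset by join()

    std::thread b([] {});
    b.detach();
    s.after_detach = b.joinable();  // the handle was reset by detach()

    try {
        b.detach();                 // no thread left to detach
    } catch (const std::system_error&) {
        s.second_detach_throws = true;
    }
    return s;
}
```

The last step corresponds to the EINVAL branch in detach() above: with no thread to detach, the error code stays EINVAL and a system_error is thrown.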

thread::hardware_concurrency()

thread::hardware_concurrency() returns the number of processors currently available, or 0 if that number cannot be determined.

It calls sysconf(_SC_NPROCESSORS_ONLN); according to the man page:

- _SC_NPROCESSORS_ONLN

The number of processors currently online (available).

unsigned
thread::hardware_concurrency() _NOEXCEPT
{
    long result = sysconf(_SC_NPROCESSORS_ONLN);
    // sysconf returns -1 if the name is invalid, the option does not exist or
    // does not have a definite limit.
    // if sysconf returns some other negative number, we have no idea
    // what is going on. Default to something safe.
    if (result < 0)
        return 0;
    return static_cast<unsigned>(result);
}
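Because the function returns 0 when sysconf fails, callers that size a thread pool from it need a fallback. A minimal sketch, where `worker_count` is a hypothetical helper and the fallback value of 1 is an assumption chosen for the example:

```cpp
#include <cassert>
#include <thread>

// Hypothetical helper: pick a worker count, treating 0 ("unknown") as 1.
unsigned worker_count()
{
    unsigned n = std::thread::hardware_concurrency();
    return n == 0 ? 1u : n;
}
```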

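this_thread::sleep_for and this_thread::sleep_until

sleep_for actually calls nanosleep. If a signal interrupts the call (EINTR), nanosleep writes the remaining time back into ts and the loop resumes the sleep: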

void
sleep_for(const chrono::nanoseconds& ns)
{
    using namespace chrono;
    if (ns > nanoseconds::zero())
    {
        seconds s = duration_cast<seconds>(ns);
        timespec ts;
        typedef decltype(ts.tv_sec) ts_sec;
        _LIBCPP_CONSTEXPR ts_sec ts_sec_max = numeric_limits<ts_sec>::max();
        if (s.count() < ts_sec_max)
        {
            ts.tv_sec = static_cast<ts_sec>(s.count());
            ts.tv_nsec = static_cast<decltype(ts.tv_nsec)>((ns-s).count());
        }
        else
        {
            ts.tv_sec = ts_sec_max;
            ts.tv_nsec = giga::num - 1;
        }

        while (nanosleep(&ts, &ts) == -1 && errno == EINTR)
            ;
    }
}
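The core of sleep_for is the split of a nanosecond count into the tv_sec and tv_nsec fields of a timespec, with clamping when the seconds do not fit. The sketch below isolates that step; `to_timespec` is a hypothetical helper name, and it assumes a non-negative duration, as the `ns > nanoseconds::zero()` check above guarantees.

```cpp
#include <cassert>
#include <chrono>
#include <limits>
#include <time.h>

// Hypothetical helper mirroring the conversion inside sleep_for.
// Assumes ns is non-negative.
timespec to_timespec(std::chrono::nanoseconds ns)
{
    using namespace std::chrono;
    timespec ts;
    typedef decltype(ts.tv_sec) sec_t;
    const sec_t sec_max = std::numeric_limits<sec_t>::max();

    seconds s = duration_cast<seconds>(ns);  // whole seconds, truncated
    if (s.count() < sec_max) {
        ts.tv_sec = static_cast<sec_t>(s.count());
        ts.tv_nsec = static_cast<decltype(ts.tv_nsec)>((ns - s).count());
    } else {
        // Clamp to the longest interval a timespec can represent.
        ts.tv_sec = sec_max;
        ts.tv_nsec = 999999999;  // giga::num - 1
    }
    return ts;
}
```

For example, 2.5 seconds expressed as 2500000000 ns becomes tv_sec = 2 and tv_nsec = 500000000.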

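sleep_until uses a mutex, a condition_variable, and a unique_lock. Nobody ever notifies the condition variable, so the wait simply times out; underneath it still ends up calling pthread_cond_timedwait: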

template <class _Clock, class _Duration>
void
sleep_until(const chrono::time_point<_Clock, _Duration>& __t)
{
    using namespace chrono;
    mutex __mut;
    condition_variable __cv;
    unique_lock<mutex> __lk(__mut);
    while (_Clock::now() < __t)
        __cv.wait_until(__lk, __t);
}

void
condition_variable::__do_timed_wait(unique_lock<mutex>& lk,
     chrono::time_point<chrono::system_clock, chrono::nanoseconds> tp) _NOEXCEPT
{
    using namespace chrono;
    if (!lk.owns_lock())
        __throw_system_error(EPERM,
                            "condition_variable::timed wait: mutex not locked");
    nanoseconds d = tp.time_since_epoch();
    if (d > nanoseconds(0x59682F000000E941))
        d = nanoseconds(0x59682F000000E941);
    timespec ts;
    seconds s = duration_cast<seconds>(d);
    typedef decltype(ts.tv_sec) ts_sec;
    _LIBCPP_CONSTEXPR ts_sec ts_sec_max = numeric_limits<ts_sec>::max();
    if (s.count() < ts_sec_max)
    {
        ts.tv_sec = static_cast<ts_sec>(s.count());
        ts.tv_nsec = static_cast<decltype(ts.tv_nsec)>((d - s).count());
    }
    else
    {
        ts.tv_sec = ts_sec_max;
        ts.tv_nsec = giga::num - 1;
    }
    int ec = pthread_cond_timedwait(&__cv_, lk.mutex()->native_handle(), &ts);
    if (ec != 0 && ec != ETIMEDOUT)
        __throw_system_error(ec, "condition_variable timed_wait failed");
}
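The same technique works with the public standard library types. The following is a sketch only; `sleep_until_sketch` is a hypothetical name. It waits on a condition variable that is never notified, and the loop re-checks the clock because wait_until may return early on a spurious wakeup.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical re-implementation of the loop shown above.
template <class Clock, class Duration>
void sleep_until_sketch(const std::chrono::time_point<Clock, Duration>& tp)
{
    std::mutex mut;
    std::condition_variable cv;
    std::unique_lock<std::mutex> lk(mut);
    // Nobody notifies cv, so each wait ends by timeout or spurious wakeup.
    while (Clock::now() < tp)
        cv.wait_until(lk, tp);
}
```

Calling it with `std::chrono::steady_clock::now() + std::chrono::milliseconds(20)` blocks the calling thread until that time point has passed.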

The implementation of std::notify_all_at_thread_exit

First, an example that shows what notify_all_at_thread_exit is actually for:

#include <mutex>
#include <thread>
#include <condition_variable>

std::mutex m;
std::condition_variable cv;

bool ready = false;
ComplexType result;  // some arbitrary type

void thread_func()
{
    std::unique_lock<std::mutex> lk(m);
    // assign a value to result using thread_local data
    result = function_that_uses_thread_locals();
    ready = true;
    std::notify_all_at_thread_exit(cv, std::move(lk));
} // 1. destroy thread_locals, 2. unlock mutex, 3. notify cv

int main()
{
    std::thread t(thread_func);
    t.detach();

    // do other work
    // ...

    // wait for the detached thread
    std::unique_lock<std::mutex> lk(m);
    while(!ready) {
        cv.wait(lk);
    }
    process(result); // result is ready and thread_local destructors have finished
}

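As the example shows, std::notify_all_at_thread_exit registers a (condition_variable, mutex) pair. When the thread exits, after its thread_local objects have been destroyed, the mutex is unlocked and notify_all is called on the condition variable.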
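The example above relies on placeholders (ComplexType, function_that_uses_thread_locals, process). Here is a compilable variant of the same idea, offered as a sketch: the names `g_m`, `g_cv`, `g_ready`, `g_result`, `compute_with_thread_locals`, `worker`, and `wait_for_detached_result` are all hypothetical, and an int stands in for the result type.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <utility>

// Globals, so that they outlive the detached thread.
std::mutex g_m;
std::condition_variable g_cv;
bool g_ready = false;
int g_result = 0;

// Stand-in for function_that_uses_thread_locals() in the example above.
int compute_with_thread_locals()
{
    thread_local int scratch = 21;
    return scratch * 2;
}

void worker()
{
    std::unique_lock<std::mutex> lk(g_m);
    g_result = compute_with_thread_locals();
    g_ready = true;
    // Hands the lock over; unlock and notify_all happen at thread exit.
    std::notify_all_at_thread_exit(g_cv, std::move(lk));
}

int wait_for_detached_result()
{
    std::thread t(worker);
    t.detach();
    std::unique_lock<std::mutex> lk(g_m);
    g_cv.wait(lk, [] { return g_ready; });  // predicate guards against spurious wakeups
    return g_result;
}
```

The waiting side cannot acquire g_m until the worker has exited and its thread_local destructors have run, which is the guarantee that a plain notify_all() inside worker() would not give.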

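Now let's look at the actual implementation.

It is built on thread-specific data (TSD); for details see: http://www.ibm.com/developerworks/cn/linux/thread/posix_threadapi/part2/

Personally, I think "thread-specific data" is a better name than something like "thread-private data", because the data can still be accessed by other threads (for example through a pointer) and is not "exclusive" to one thread.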
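The property that matters here is that a TSD key can carry a destructor, which the pthread runtime runs at thread exit for every thread that stored a non-null value. A minimal sketch of that mechanism, assuming a POSIX system; `g_key`, `g_destroyed`, `destroy_value`, `tsd_worker`, and `run_tsd_demo` are hypothetical names:

```cpp
#include <atomic>
#include <cassert>
#include <pthread.h>

static pthread_key_t g_key;
static std::atomic<int> g_destroyed(0);

// Run by the pthread runtime at thread exit for each non-null value.
extern "C" void destroy_value(void* p)
{
    delete static_cast<int*>(p);
    g_destroyed.fetch_add(1);
}

extern "C" void* tsd_worker(void*)
{
    // Store a per-thread value; nothing in this function frees it.
    pthread_setspecific(g_key, new int(7));
    return nullptr;
}

// Hypothetical driver: returns how many times the destructor ran,
// or -1 if a pthread call failed.
int run_tsd_demo()
{
    if (pthread_key_create(&g_key, &destroy_value) != 0)
        return -1;
    pthread_t t;
    if (pthread_create(&t, nullptr, &tsd_worker, nullptr) != 0)
        return -1;
    if (pthread_join(t, nullptr) != 0)
        return -1;
    pthread_key_delete(g_key);
    return g_destroyed.load();
}
```

After the join, the destructor has run exactly once, even though the worker never freed the value itself.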

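In short, when a std::thread starts, a __thread_struct is stored in thread-local data (see the reset call in __thread_proxy above), and with it a __thread_struct_imp object is created.

Inside __thread_struct_imp, a vector stores pair<condition_variable*, mutex*> entries: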

class  __thread_struct_imp
{
    typedef vector<__assoc_sub_state*,
                          __hidden_allocator<__assoc_sub_state*> > _AsyncStates;
    typedef vector<pair<condition_variable*, mutex*>,
               __hidden_allocator<pair<condition_variable*, mutex*> > > _Notify;

    _AsyncStates async_states_;
    _Notify notify_;

When notify_all_at_thread_exit is called, the condition_variable and the mutex are pushed onto that vector:

void
__thread_struct_imp::notify_all_at_thread_exit(condition_variable* cv, mutex* m)
{
    notify_.push_back(pair<condition_variable*, mutex*>(cv, m));
}

When the thread exits, the __thread_struct_imp is deleted, which means its destructor runs.

The destructor iterates over the vector, unlocks each mutex, and calls notify_all() on each condition_variable:

__thread_struct_imp::~__thread_struct_imp()
{
    for (_Notify::iterator i = notify_.begin(), e = notify_.end();
            i != e; ++i)
    {
        i->second->unlock();
        i->first->notify_all();
    }
    for (_AsyncStates::iterator i = async_states_.begin(), e = async_states_.end();
            i != e; ++i)
    {
        (*i)->__make_ready();
        (*i)->__release_shared();
    }
}
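The registry-plus-destructor idea can be sketched in portable C++ with a thread_local object in place of the TSD key. This is a simplified illustration, not the libc++ code; `ExitNotifier`, `this_thread_notifier`, and `notify_at_exit_sketch` are hypothetical names, and the async-state handling is left out.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Simplified stand-in for __thread_struct_imp.
class ExitNotifier
{
    std::vector<std::pair<std::condition_variable*, std::mutex*> > notify_;

public:
    void add(std::condition_variable* cv, std::mutex* m)
    {
        notify_.push_back(std::make_pair(cv, m));
    }

    // Runs when the owning thread exits.
    ~ExitNotifier()
    {
        for (auto& entry : notify_) {
            entry.second->unlock();      // the mutex was left locked on purpose
            entry.first->notify_all();
        }
    }
};

// One registry per thread, destroyed at that thread's exit.
ExitNotifier& this_thread_notifier()
{
    thread_local ExitNotifier n;
    return n;
}

void notify_at_exit_sketch(std::condition_variable& cv,
                           std::unique_lock<std::mutex> lk)
{
    // release() gives up ownership without unlocking the mutex.
    this_thread_notifier().add(&cv, lk.release());
}
```

A thread that locks a mutex, sets a flag, and calls `notify_at_exit_sketch(cv, std::move(lk))` keeps the mutex locked until it exits; a waiter on `cv` can only see the flag after that point.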

I extracted some of the more detailed wrapper code and put it in a gist: https://gist.github.com/hengyunabc/d48fbebdb9bddcdf05e9

Other notes:

For thread yield, detach, and join, you can refer directly to the man pages:

pthread_yield:

       pthread_yield() causes the calling thread to relinquish the CPU.  The
       thread is placed at the end of the run queue for its static priority
       and another thread is scheduled to run.  For further details, see
       sched_yield(2)

pthread_detach:

       The pthread_detach() function marks the thread identified by thread
       as detached.  When a detached thread terminates, its resources are
       automatically released back to the system without the need for
       another thread to join with the terminated thread.

       Attempting to detach an already detached thread results in
       unspecified behavior.

pthread_join:

       The pthread_join() function waits for the thread specified by thread
       to terminate.  If that thread has already terminated, then
       pthread_join() returns immediately.  The thread specified by thread
       must be joinable.

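Summary:

Personally, I feel that join and detach are of limited use in practice. In the vast majority of cases a thread should simply be detached right after it is created. (One of the two does have to be called: destroying a std::thread that is still joinable calls std::terminate.)

As a synchronization mechanism, join seems less suitable than a mutex or a similar primitive.

References: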

http://en.cppreference.com/w/cpp/thread/notify_all_at_thread_exit

http://man7.org/linux/man-pages/man3/pthread_detach.3.html

http://man7.org/linux/man-pages/man3/pthread_join.3.html

http://stackoverflow.com/questions/19744250/c11-what-happens-to-a-detached-thread-when-main-exits

http://man7.org/linux/man-pages/man3/pthread_yield.3.html

http://man7.org/linux/man-pages/man2/sched_yield.2.html

http://www.ibm.com/developerworks/cn/linux/thread/posix_threadapi/part2/

man pthread_key_create
