A strange segmentation fault is surely caused by a memory leak or a stack overflow

Weird SIGSEGV segmentation fault in std::string::assign() method from libstdc++.so.6


My program recently hit a weird segfault at run time. I'd like to know whether anybody has seen this error before and how it can be fixed. Here is more info:

Basic info:

  • CentOS 5.2, kernel version is 2.6.18
  • g++ (GCC) 4.1.2 20080704 (Red Hat 4.1.2-50)
  • CPU: Intel x86 family
  • libstdc++.so.6.0.8
  • My program starts multiple threads to process data. The segfault occurred in one of the threads.
  • Though it's a multi-threaded program, the segfault seems to occur on a local std::string object. I'll show this in the code snippet later.
  • The program is compiled with -g, -Wall and -fPIC, and without -O2 or any other optimization option.

The core dump info:

Core was generated by `./myprog'.
Program terminated with signal 11, Segmentation fault.
#0  0x06f6d919 in __gnu_cxx::__exchange_and_add(int volatile*, int) () from /usr/lib/libstdc++.so.6
(gdb) bt
#0  0x06f6d919 in __gnu_cxx::__exchange_and_add(int volatile*, int) () from /usr/lib/libstdc++.so.6
#1  0x06f507c3 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::assign(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /usr/lib/libstdc++.so.6
#2  0x06f50834 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::operator=(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /usr/lib/libstdc++.so.6
#3  0x081402fc in Q_gdw::ProcessData (this=0xb2f79f60) at ../../../myprog/src/Q_gdw/Q_gdw.cpp:798
#4  0x08117d3a in DataParser::Parse (this=0x8222720) at ../../../myprog/src/DataParser.cpp:367
#5  0x08119160 in DataParser::run (this=0x8222720) at ../../../myprog/src/DataParser.cpp:338
#6  0x080852ed in Utility::__dispatch (arg=0x8222720) at ../../../common/thread/Thread.cpp:603
#7  0x0052c832 in start_thread () from /lib/libpthread.so.0
#8  0x00ca845e in clone () from /lib/libc.so.6

Note that the crash originates inside basic_string::operator=().

The related code: (I've shown more code than might be needed; please ignore coding-style issues for now.)

int Q_gdw::ProcessData()
{
    char tmpTime[10+1] = {0};
    char A01Time[12+1] = {0};
    std::string tmpTimeStamp;

    // Get the timestamp from TP
    if((m_BackFrameBuff[11] & 0x80) >> 7)
    {
        for (i = 0; i < 12; i++)
        {
            A01Time[i] = (char)A15Result[i];
        }
        tmpTimeStamp = FormatTimeStamp(A01Time, 12);  // Segfault occurs on this line

And here is the prototype of this FormatTimeStamp method:

std::string FormatTimeStamp(const char *time, int len)

I think this kind of string assignment is very common, so I just don't understand why a segfault could occur here.

What I have investigated:

I've searched the web for answers. I looked at here. The reply suggests recompiling the program with the _GLIBCXX_FULLY_DYNAMIC_STRING macro defined. I tried that, but the crash still happens.

I also looked at here. It also suggests recompiling the program with _GLIBCXX_FULLY_DYNAMIC_STRING, but its author seems to be dealing with a different problem from mine, so I don't think that solution works for me.



Updated on 08/15/2011

Hi guys, here is the original code of FormatTimeStamp. I understand the code doesn't look very nice (too many magic numbers, for instance), but let's focus on the crash first.

string Q_gdw::FormatTimeStamp(const char *time, int len)
{
    string timeStamp;
    string tmpstring;

    if (time)  // "time" is guaranteed to be correctly zero-terminated, so don't worry about any overflow here.
        tmpstring = time;

    // Get the current time point.
    int year, month, day, hour, minute, second;
#ifndef _WIN32
    struct timeval timeVal;
    struct tm *p;
    gettimeofday(&timeVal, NULL);
    p = localtime(&(timeVal.tv_sec));
    year = p->tm_year + 1900;
    month = p->tm_mon + 1;
    day = p->tm_mday;
    hour = p->tm_hour;
    minute = p->tm_min;
    second = p->tm_sec;
#else
    SYSTEMTIME sys;
    GetLocalTime(&sys);
    year = sys.wYear;
    month = sys.wMonth;
    day = sys.wDay;
    hour = sys.wHour;
    minute = sys.wMinute;
    second = sys.wSecond;
#endif

    if (0 == len)
    {
        // "time" doesn't specify any time, so we just use the current time
        char tmpTime[30];
        memset(tmpTime, 0, 30);
        sprintf(tmpTime, "%d-%d-%d %d:%d:%d.000", year, month, day, hour, minute, second);
        timeStamp = tmpTime;
    }
    else if (6 == len)
    {
        // The "time" specifies "day-month-year" with each being 2-digit.
        // For example: "150811" means "August 15th, 2011".
        timeStamp = "20";
        timeStamp = timeStamp + tmpstring.substr(4, 2) + "-" + tmpstring.substr(2, 2) + "-" +
                tmpstring.substr(0, 2);
    }
    else if (8 == len)
    {
        // The "time" specifies "minute-hour-day-month" with each being 2-digit.
        // For example: "51151508" means "August 15th, 15:51".
        // As the year is not specified, the current year will be used.
        string strYear;
        stringstream sstream;
        sstream << year;
        sstream >> strYear;
        sstream.clear();

        timeStamp = strYear + "-" + tmpstring.substr(6, 2) + "-" + tmpstring.substr(4, 2) + " " +
                tmpstring.substr(2, 2) + ":" + tmpstring.substr(0, 2) + ":00.000";
    }
    else if (10 == len)
    {
        // The "time" specifies "minute-hour-day-month-year" with each being 2-digit.
        // For example: "5115150811" means "August 15th, 2011, 15:51".
        timeStamp = "20";
        timeStamp = timeStamp + tmpstring.substr(8, 2) + "-" + tmpstring.substr(6, 2) + "-" + tmpstring.substr(4, 2) + " " +
                tmpstring.substr(2, 2) + ":" + tmpstring.substr(0, 2) + ":00.000";
    }
    else if (12 == len)
    {
        // The "time" specifies "second-minute-hour-day-month-year" with each being 2-digit.
        // For example: "305115150811" means "August 15th, 2011, 15:51:30".
        timeStamp = "20";
        timeStamp = timeStamp + tmpstring.substr(10, 2) + "-" + tmpstring.substr(8, 2) + "-" + tmpstring.substr(6, 2) + " " +
                tmpstring.substr(4, 2) + ":" + tmpstring.substr(2, 2) + ":" + tmpstring.substr(0, 2) + ".000";
    }

    return timeStamp;
}
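
As a worked example of the 12 == len branch above, the following minimal sketch applies the same substr() rearrangement to the sample value from the comment. The helper name format_12_digit and its length check are assumptions made for illustration; they are not part of the original code.

```cpp
#include <string>

// Hypothetical helper (not in the original program): rearranges a
// 12-character "ssmmhhDDMMYY" field into "20YY-MM-DD hh:mm:ss.000",
// mirroring the 12 == len branch of FormatTimeStamp().
std::string format_12_digit(const std::string& raw)
{
    // The original code does not validate the length; this sketch does,
    // because substr() throws std::out_of_range for a start past the end.
    if (raw.size() != 12)
        return std::string();

    return "20" + raw.substr(10, 2) + "-" + raw.substr(8, 2) + "-" + raw.substr(6, 2) + " " +
           raw.substr(4, 2) + ":" + raw.substr(2, 2) + ":" + raw.substr(0, 2) + ".000";
}
```

Calling format_12_digit("305115150811") yields "2011-08-15 15:51:30.000", which matches the example in the comment ("August 15th, 2011, 15:51:30").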


Updated on 08/19/2011

This problem has finally been tracked down and fixed. In fact, the FormatTimeStamp() function has nothing to do with the root cause: the segfault is caused by a write overflow of a local char buffer.

The problem can be reproduced with the following simpler program (please ignore the poor variable names for now):

(Compiled with "g++ -Wall -g main.cpp")

#include <cstdio>    // sprintf
#include <cstring>   // memset
#include <iostream>
#include <string>

void overflow_it(char * A15, char * A15Result)
{
    int m;
    int t = 0,i = 0;
    char temp[3];

    for (m = 0; m < 6; m++)
    {
        t = ((*A15 & 0xf0) >> 4) *10 ;
        t += *A15 & 0x0f;
        A15 ++;

        std::cout << "m = " << m << "; t = " << t << "; i = " << i << std::endl;

        memset(temp, 0, sizeof(temp));
        sprintf((char *)temp, "%02d", t);   // The buggy code: temp is not big enough when t is a 3-digit integer.
        A15Result[i++] = temp[0];
        A15Result[i++] = temp[1];
    }
}

int main(int argc, char * argv[])
{
    std::string str;

    {
        char tpTime[6] = {0};
        char A15Result[12] = {0};

        // Initialize tpTime
        for(int i = 0; i < 6; i++)
            tpTime[i] = char(154);  // 154 would result in a 3-digit t in overflow_it().

        overflow_it(tpTime, A15Result);

        str.assign(A15Result);
    }

    std::cout << "str says: " << str << std::endl;

    return 0;
}

Here are two facts to keep in mind before going on: 1). My machine is an Intel x86 machine, so it is little-endian. Therefore, for a variable "m" of type int whose value is, say, 10, its memory layout might look like this:

Starting addr:0xbf89bebc: m(byte#1): 10
               0xbf89bebd: m(byte#2): 0
               0xbf89bebe: m(byte#3): 0
               0xbf89bebf: m(byte#4): 0

2). The program above runs in the main thread. Inside overflow_it(), the layout of the variables on the thread's stack looks like this (only the important variables are shown):

0xbfc609e9 : temp[0]
0xbfc609ea : temp[1]
0xbfc609eb : temp[2]
0xbfc609ec : m(byte#1) <-- Note that m follows temp immediately.  m(byte#1) happens to be the byte temp[3].
0xbfc609ed : m(byte#2)
0xbfc609ee : m(byte#3)
0xbfc609ef : m(byte#4)
0xbfc609f0 : t
...(3 bytes)
0xbfc609f4 : i
...(3 bytes)
...(etc. etc. etc...)
0xbfc60a26 : A15Result  <-- Data would be written to this buffer in overflow_it()
...(11 bytes)
0xbfc60a32 : tpTime
...(5 bytes)
0xbfc60a38 : str    <-- Note that str takes up 4 bytes.  Its starting address is **18 bytes** past A15Result (0xbfc60a38 - 0xbfc60a26).
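
The effect of this layout can be modeled without relying on undefined behavior. The sketch below is a simulation, not the original program: FakeFrame, store_counter, load_counter and count_iterations are hypothetical names, and the counter is encoded little-endian by hand so the result does not depend on the host machine. Bytes 0..2 play the role of temp[3] and bytes 3..6 the role of m.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical model of the frame of overflow_it(): bytes 0..2 stand in for
// temp[3], bytes 3..6 for the int counter m. Everything lives in one array,
// so the "overflow" in this sketch is well-defined.
struct FakeFrame
{
    unsigned char bytes[7];
};

const std::size_t kTempOffset = 0;
const std::size_t kCounterOffset = 3;

// Store the counter in little-endian order (lowest byte first).
void store_counter(FakeFrame& f, unsigned value)
{
    for (int b = 0; b < 4; ++b)
        f.bytes[kCounterOffset + b] = static_cast<unsigned char>((value >> (8 * b)) & 0xffu);
}

unsigned load_counter(const FakeFrame& f)
{
    unsigned value = 0;
    for (int b = 0; b < 4; ++b)
        value |= static_cast<unsigned>(f.bytes[kCounterOffset + b]) << (8 * b);
    return value;
}

// Replays "for (m = 0; m < 6; m++)" while the formatted text (digits plus
// the terminating '\0') is copied over temp and whatever follows it.
// Returns the number of iterations, capped at max_iterations so that the
// sketch always terminates.
int count_iterations(int t, int max_iterations)
{
    FakeFrame f;
    std::memset(f.bytes, 0, sizeof(f.bytes));

    int iterations = 0;
    for (store_counter(f, 0);
         load_counter(f) < 6 && iterations < max_iterations;
         store_counter(f, load_counter(f) + 1))
    {
        char text[16];
        int n = std::snprintf(text, sizeof(text), "%02d", t);  // n excludes the '\0'
        if (n < 0)
            break;

        std::size_t len = static_cast<std::size_t>(n) + 1;     // digits + '\0'
        if (len > sizeof(f.bytes))
            len = sizeof(f.bytes);                             // stay inside the model
        std::memcpy(f.bytes + kTempOffset, text, len);
        ++iterations;
    }
    return iterations;
}
```

With a 2-digit value such as count_iterations(42, 1000) the loop runs 6 times. With a 3-digit value such as count_iterations(100, 1000) the terminator lands on the low byte of the counter on every pass, so the loop only stops at the cap of 1000.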

My analysis:

1). m is the loop counter in overflow_it(). It is incremented by 1 on each iteration and is never supposed to exceed 6, so its value fits entirely in m(byte#1) (remember, little-endian), which happens to be temp[3].

2). In the buggy line: when t is a 3-digit integer, such as 109, the sprintf() call overflows the buffer, because serializing the number 109 to the string "109" requires 4 bytes: '1', '0', '9' and a terminating '\0'. Because temp[] is only 3 bytes long, the final '\0' is written to temp[3], which is exactly m(byte#1), the byte that holds m's value. As a result, m is reset to 0 every time.

3). The programmer's expectation, however, is that the for loop in overflow_it() executes only 6 times, with m incremented by 1 each time. Because m keeps being reset to 0, the loop actually runs far more than 6 times.

4). Now look at the variable i in overflow_it(): on every iteration i is incremented by 2 and A15Result[i] is written. If you compile and run this program, you'll see that i finally reaches 24, which means overflow_it() writes to the bytes from A15Result[0] through A15Result[23]. According to the addresses listed above, the object str starts only 18 bytes past A15Result[0], so overflow_it() has "swept through" str and destroyed its memory layout.

5). I think the correct use of std::string, as it is a non-POD type, depends on the std::string object having a valid internal state. In this program, however, str's internal layout has been overwritten from outside. This should be why the assign() call finally causes a segfault.



Update on 08/26/2011

In my previous update on 08/19/2011, I said that the segfault was caused by a method call on a local std::string object whose memory layout had been corrupted, making it a "destroyed" object. That is not always the whole story. Consider the C++ program below:

// C++
#include <iostream>
#include <string>
class A {
    public:
        void Hello(const std::string& name) {
           std::cout << "hello " << name;
         }
};
int main(int argc, char** argv)
{
    A* pa = NULL; //!!
    pa->Hello("world");
    return 0;
}

The Hello() call succeeds in practice, even if you assign an obviously bad pointer to pa. (Strictly speaking, calling a member function through a null or invalid pointer is undefined behavior, but typical compilers such as GCC behave as described.) The reason is that, in the C++ object model, the non-virtual methods of a class don't reside within the memory layout of the object. The compiler turns the A::Hello() method into something like A_Hello_xxx(A * const this, ...), which is effectively a global function. Thus, as long as nothing dereferences the "this" pointer, things go fine.
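
The lowering described above can be sketched as follows. GreeterA and GreeterA_Hello are made-up names for illustration, and the free function only approximates what a compiler generates; the point is that the object pointer is just an extra argument that this particular method never reads.

```cpp
#include <string>

// Hypothetical illustration; the names are not from the original program.
struct GreeterA
{
    // Non-virtual member function that never touches `this`.
    std::string Hello(const std::string& name) { return "hello " + name; }
};

// Roughly the shape a compiler gives to GreeterA::Hello: an ordinary
// function that receives the object pointer as a hidden first parameter.
std::string GreeterA_Hello(GreeterA* self, const std::string& name)
{
    (void)self;  // never dereferenced, so its value does not matter here
    return "hello " + name;
}
```

Passing a null pointer to GreeterA_Hello is well-defined because the pointer is never dereferenced. The member-call form through a null pointer remains undefined behavior even though it tends to produce the same result.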

This shows that a "bad" object by itself is NOT what raises SIGSEGV. assign() is not a virtual method of std::string, so merely calling it on the "bad" std::string object would not cause the segfault. Some other operation must have finally triggered it.

I noticed that the segfault comes from the __gnu_cxx::__exchange_and_add() function, so I then looked into its source code in this web page:

static inline _Atomic_word
__exchange_and_add(volatile _Atomic_word* __mem, int __val)
{ return __sync_fetch_and_add(__mem, __val); }

__exchange_and_add() in turn calls __sync_fetch_and_add(). According to this web page, __sync_fetch_and_add() is a GCC builtin function that behaves like this:

type __sync_fetch_and_add (type *ptr, type value, ...)
{
    tmp = *ptr;
    *ptr op= value; // Here the "op=" means "+=" as this function is "_and_add".
    return tmp;
}
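
To see this behavior on a valid address, here is a small sketch that assumes a GCC-compatible compiler (the builtin is not standard C++). FetchAddResult and demo_fetch_and_add are hypothetical names used only for this illustration.

```cpp
// Holds what the builtin returned and what ended up in memory afterwards.
struct FetchAddResult
{
    int returned;
    int stored;
};

// Calls the GCC builtin __sync_fetch_and_add on a local variable.
// The builtin both reads and writes through the pointer it is given.
FetchAddResult demo_fetch_and_add(int initial, int delta)
{
    int counter = initial;
    FetchAddResult r;
    r.returned = __sync_fetch_and_add(&counter, delta);  // yields the old value
    r.stored = counter;                                  // old value + delta
    return r;
}
```

demo_fetch_and_add(5, 10) returns the old value 5 and leaves 15 in memory. Both the read and the write go through the pointer, so an invalid pointer faults at exactly this point.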

There it is! The passed-in ptr pointer is dereferenced here. In the 08/19/2011 program, ptr is derived from the "bad" std::string object inside the assign() method: most likely it is the address of the reference count that the string's corrupted internal pointer leads to, rather than the object's own "this" pointer. It is the dereference at this point that actually caused the SIGSEGV segmentation fault.

We could test this with the following program:

#include <bits/atomicity.h>

int main(int argc, char * argv[])
{
    __sync_fetch_and_add((_Atomic_word *)0, 10);    // Would result in a segfault.

    return 0;
}

linux string segmentation-fault sigsegv assign


asked Aug 12 '11 at 9:26 by yaobin; edited Aug 26 '11 at 15:31

Can you show us the relevant parts of FormatTimeStamp? – cnicutar Aug 12 '11 at 9:31

@cnicutar: Hi, I've pasted the code above. – yaobin Aug 15 '11 at 8:29

@yaobin Thanks for your detailed explanation. Can you also include how you solved the problem? – Anh Tuan Jul 16 '15 at 1:52

@AnhTuan: Oh, this is a message I posted years ago. I don't remember exactly how I resolved it. Probably I just gave more space to the buffer used in sprintf(). – yaobin Jul 16 '15 at 4:06

@yaobin: Never mind, I already solved my problem. Based on your hint about memory errors, I re-checked my code and found that I had cast an int variable to long&, which corrupted the adjacent string variable, so the program crashed when I invoked assign on that string. The code worked fine on a 32-bit OS before but crashed on a 64-bit OS. My bad :D – Anh Tuan Jul 16 '15 at 4:26

2 Answers

Accepted answer (2 votes):

There are two likely possibilities:

  • some code before line 798 has corrupted the local tmpTimeStamp object
  • the return value from FormatTimeStamp() was somehow bad.

The _GLIBCXX_FULLY_DYNAMIC_STRING macro is most likely a red herring and has nothing to do with the problem.

If you install the debuginfo package for libstdc++ (I don't know what it's called on CentOS), you'll be able to "see into" that code, and might be able to tell whether the left-hand side (LHS) or the right-hand side (RHS) of the assignment operator caused the problem.

If that's not possible, you'll have to debug this at the assembly level. Going into frame #2 and doing x/4x $ebp should give you the previous ebp, the caller address (0x081402fc), the LHS (should match &tmpTimeStamp in frame #3), and the RHS. Go from there, and good luck!


answered Aug 12 '11 at 15:06 by Employed Russian

@Russian: Thanks for your reply, but after I obtain the addresses of the LHS and RHS objects, how can I examine their internals? Do I need to cast the address to a (std::string *) in order to look inside? – yaobin Aug 15 '11 at 8:17

Yes, print *(std::string *)0x7fff74320 or some such. – Employed Russian Aug 15 '11 at 16:11

@Russian: Thanks for your help. I finally fixed this issue. You can see my update on 08/19/2011 for all the info. – yaobin Aug 19 '11 at 5:44


Answer (2 votes):

I guess there could be some problem inside the FormatTimeStamp function, but without the source code it's hard to say anything. Try running your program under Valgrind. It usually helps with this sort of bug.

