在排查业务 bug 的过程中,看到如下两种输出信息:
TCP 连接正常情况下,进行数据读取
14:00:38 epoll_ctl(26, EPOLL_CTL_MOD, 31, {EPOLLIN, {u32=31, u64=31}}) = 0 14:00:38 epoll_wait(26, {{EPOLLIN, {u32=31, u64=31}}}, 32, 9698) = 1 14:00:38 clock_gettime(CLOCK_MONOTONIC, {152386, 122371397}) = 0 14:00:38 gettimeofday({1445666438, 713982}, NULL) = 0 14:00:38 ioctl(31, FIONREAD, [16]) = 0 14:00:38 readv(31, [{"\0224Vx\0\0\0\3\0\0\0\0\0\0\0\0", 16}], 1) = 16
TCP 连接断开情况下(人为强制断开),进行数据读取
14:00:38 epoll_ctl(26, EPOLL_CTL_ADD, 31, {EPOLLIN, {u32=31, u64=31}}) = 0 14:00:38 epoll_wait(26, {{EPOLLIN, {u32=31, u64=31}}}, 32, 10000) = 1 14:00:43 clock_gettime(CLOCK_MONOTONIC, {152390, 497764210}) = 0 14:00:43 gettimeofday({1445666443, 89440}, NULL) = 0 发现此时 socket 中可读数据为 0 14:00:43 ioctl(31, FIONREAD, [0]) = 0 这里会看到一些乱七八糟的数据(应该是由于 readv 用于保存读取结果的 buffer 没有清空的缘故) 14:00:43 readv(31, [{"[pid]:25002 [UpuWrapper]: [UpuClientMsgCallBack] ClientSeesion:0x94f50f8 recive upu result msg:3, result type:0, nResultCount:0 \n\0\2\[email protected]\0\320W\371\10\2\[email protected]\0\[email protected]\0\320Wdows\",\"status\":\"callidle\",\"nuaddr\":\"172.16.185.135\",\"mtaddr\":\"172.16.72.105\",\"userdomain\":\"0q0y1aphxubof8ycru6fsgsk\",\"devid\":\"10103000000002000000000000000002\",\"data\":\"\"}]\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\10\2\0\0\231\7\2\0\0\0\0\0/localtime\0\0\10\2\0\0\201\7\2\0\0\0\0\0domain\00005\0\0\0 \2\0\0$\0\0\0008\f\320W\0\0\0\0\10\0\0\0\1\0\0\0\300\10\320W\0\0\0\[email protected]\2\0\0,\0\0\0\340\10\320W\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0h\2\0\0,\0\0\0\350\f\320W\0\0\0\0\n\0\0\0h\v\320W\1\0\0\0D\f\320W4\v\320W\n\0\0\0\220\2\0\0004\0\0\0\270\t\320W4\v\320W\344\n\320W\344\n\320W|\n\320W4\n\320WT\t\320WT\t\320W\374\10\320W\374\10\320W\0\0\0\0\201\0\0\[email protected]\0\[email protected]\0\320W\20\0\0\0,\0\0\0\260\f\320W4\n\320W\204\f\320W\250\n\320W\2\0\0\0prototype\0001\0\0\0\0\0I\0\0\0\200\0\320W\200\0\320W0000002000000000000000002\"\0ff\"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\[email protected]\3\0\0004\0\0\0\0\0\0\0c52a-deb2-499c-add5-db70f1f62bff\0\0\0\0\0\0\0\0\31\6\2\[email protected]\0\320Wx\6\320W\20\0\0\0004\0\0\0\10\t\320Wc52a-deb2-499c-add5-db70f1f62bff\0\0\0\0\0\0\0\0\331\5\2\0x\6\320W8\t\320W\0\0\0\0\0\0\0\0\0\0\0\0moid\0\0\0\0)\0\0\0H\v\320Wx\6\320W\20\0\0\0\34\0\0\0\200\10\320W110000423\0\0\0H\0\0\0$\0\0\0(\v\320W4\v\320W"..., 4096}], 1) = 0
两种情况的对比图如下(为了显示方便,右侧图中多余数据已删除)
由图中所示,可以得到如下结论:
- ioctl 能够判定 socket 缓冲区中可读数据的数量,两种情况下 ioctl 调用都返回 0 ,即成功。
- readv 在两种情况都进行了数据读取,TCP 链路正常情况下,readv 返回读取的数据字节数;TCP 链路异常情况下,readv 返回 0 。
libevent 中的对应代码实现如下:
按理说,这个问题到这里应该算结束了,但是不幸的是,我又搜到了下面这个帖子:
=== 我是琅琊榜的分隔线 ===
Linux - ioctl with FIONREAD always 0
问题:
I‘m trying to get to know how many bytes there are readable at my TCP socket. I am calling ioctl with the Flag "FIONREAD" which should actually give me this value. When I call the function I get as return val 0 ( so no Error ) but also my integer argument gets the value 0. That would be no problem but when I call the recv() method I actually read some Bytes out of the socket. What am I doing wrong?
尝试获取 TCP socket 上有多少字节数据可读,调用使用 ioctl 配合 FIONREAD 进行获取。当调用 ioctl 后得到返回值 0 表明没有错误发生,但是保存当前可读字节数量的变量内容同样为 0 。这种情况数据正常行为,但是当代码中实际执行 recv() 调用时,发现能够从当前 socket 上读出若干字节来。哪里有问题呢?
// here some Code: char recBuffer[BUFFERLENGTH] = {0}; int bytesAv = 0; int bytesRead = 0; int flags = 0; if ( ioctl (m_Socket,FIONREAD,&bytesAv) < 0 ) { // Error } if ( bytesAv < 1 ) { // No Data Available } bytesRead = recv(m_Socket,recBuffer,BUFFERLENGTH,flags);
When I call the recv function i acutally read some valid Data ( which I expected )
有人回答如下:
The real answer here is to use select(2) like cnicutar said. Toby, what you aren‘t understanding is that you have a race condition. First you look at the socket and ask how many bytes are there. Then, while your code is processing the "no data here" block, bytes are being received by the hardware & OS asynchronous to your application. So, by the time that the recv() function is called, the answer of "no bytes are available" is no longer true...
问题的真正原因是,上述代码调用存在竞争问题。首先,你通过调用 ioctl 查看 socket 中是否存在可读字节,再你查看的那该时刻,可能恰好确实没有数据可读;而之后,在代码进行“无数据存在”逻辑处理时,可能已经经由硬件和操作系统异步的向应用程序传输数据了。所以,当代码执行到 recv() 函数时,“无数据存在”这个结论可能已经不正确了。
if ( ioctl (m_Socket,FIONREAD,&bytesAv) < 0 ) { // Error } // BYTES MIGHT BE RECEIVED BY HARDWARE/OS HERE! if ( bytesAv < 1 ) // AND HERE! { // No Data Available // BUT BYTES MIGHT BE RECEIVED BY HARDWARE/OS HERE! } // AND MORE BYTES MIGHT BE RECEIVED BY HARDWARE/OS HERE! bytesRead = recv(m_Socket,recBuffer,BUFFERLENGTH,flags); // AND NOW bytesRead IS NOT EQUAL TO 0!
Sure, a small sleep probably fixed your program two years ago, but it also taught you terrible coding practice and you lost out on an opportunity to learn how to use sockets correctly by using select().
通过调用 sleep 进行短暂的休眠可以简单粗暴的修复次问题,但是这绝非正确的编程实践,正确的作为应该是基于 select 进行处理。
Further, as Karoly Horvath said, you can tell recv to not read more bytes than you can store in the buffer that the user passed in. Then your function interface becomes "This fn will return as many bytes as are available on the socket, but not more than [buffer size you passed in]".
更进一步,你可以令 recv() 至多读取指定 buffer 长度的数据,故函数接口行为就变成了“该函数将返回当前 socket 上可读尽可能多的数据,但是不超过传入 buffer 的长度”。
This means that this function doesn‘t need to worry about clearing the buffer any more. The caller can call your function as many times as necessary to clear all of the bytes out of it (or you can provide a separate fn that discards the data wholesale and not tie up that functionality in any specific data gather function). Your function is more flexible by not doing too many things. You can then create a wrapper function that is smart to your data transfer needs of a particular application, and that fn calls the get_data fn and the clear_socket fn as needed for that specific app. Now you are building a library you can carry around from project to project, and maybe job to job if you‘re so lucky as to have an employer that lets you take code with you.
这就意味着,该函数将不再需要担心 buffer 内容清除的问题。调用者可以任意次数调用该函数,因为其自带数据清除效果(或者你可以提供单独的函数进行数据清除动作,而不将此功能绑定到任何特定的数据获取函数上)。你的函数将更加灵活,因为其不用负责过多工作。你可以创建一个更符合你要的封装函数,以针对特定应用中数据传输的需要,而该封装函数内部会调用 get_data 和 clear_socket 函数进行相应处理。通过这种方式,你就可以得到一个可以在各种工程中随意使用的库了。
=== 我是琅琊榜的分隔线 ===
乍一看,似乎帖子中描述的情况和上面 TCP 链路断开时的情况类似,但事实上是不同的,关键在于 readv 的返回值为 0 ,至于 readv 第二个参数 const struct iovec *iov 的内容,实际为 IOV_TYPE vecs[0] 所指向内存地址中的内容,并非从 socket 上读取的内容。 至于 libevent 中使用 ioctl 时是否存在 race condition 问题,其实多虑了,提供代码片段如下: