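AllJoyn: JoinSession() returns a timeout

One AllJoyn problem has been nagging us throughout the project: when a client joins a session, the call to JoinSession() sometimes fails with a timeout.

Note the word "sometimes": at other times it works perfectly well. Nondeterministic problems like this are always maddening.

Someone raised the same problem on the official AllJoyn forum, and the AllJoyn developers said it had already been fixed in an earlier release. Later, other people ran into it again and reported that the asynchronous JoinSessionAsync() had a somewhat higher success rate. In our own tests, however, JoinSessionAsync() behaved no differently from JoinSession().

A while ago the problem was most prominent when running AllJoyn on a DTV, where JoinSession() failed very often. After comparing our code with the AllJoyn samples, we found that our LinkTimeout had been set to 0; changing it to 40 made the problem go away for the time being.

(Why was LinkTimeout set to 0? None of us could remember, so I dug out the earliest code. My initial version used 20, and the version refactored by a colleague also used 20, but after the code made a round trip through HQ it had become 0. The important part is that none of us noticed the change. Why did they change this parameter? We don't know, and we have no way of finding out, because the people responsible for this module have been replaced several times over... Is there anyone unluckier than me?)

In recent testing, though, the problem showed up again. After struggling with it for so long I was nearly out of ideas and did not know how to fix it; I was even tempted to read carefully through AllJoyn's large and complex source code.

Fortunately, I found another way.

1. Enable AllJoyn's internal logging.

Call the following code before initializing AllJoyn: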

QCC_UseOSLogging(true);
QCC_SetLogLevels("ALLJOYN=7;ALL=1");
QCC_SetDebugLevel("ALL", 7);
QCC_SetDebugLevel("ICE", 1);
QCC_SetDebugLevel("IPNS", 1);
QCC_SetDebugLevel("TIMER", 1);
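I won't explain what this code does; a quick look at the relevant AllJoyn source will tell you.

Then run the program and check the AllJoyn output in logcat.

After careful inspection, a clue finally emerged.

Log of a JoinSession() timeout: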

09-06 15:26:26.608: W/TCP(12712): 1.910 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2715 | TCPTransport::Connect(): Interface UP with addresss fe80::50cc:f8ff:feac:498f
09-06 15:26:26.608: W/TCP(12712): 1.910 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2713 | TCPTransport::Connect(): Checking interface wlan0
09-06 15:26:26.608: W/TCP(12712): 1.910 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2715 | TCPTransport::Connect(): Interface UP with addresss 109.123.117.49
09-06 15:26:26.608: W/TCP(12712): 1.910 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2713 | TCPTransport::Connect(): Checking interface wlan0
09-06 15:26:26.608: W/TCP(12712): 1.910 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2715 | TCPTransport::Connect(): Interface UP with addresss fe80::52cc:f8ff:feac:498f
09-06 15:26:26.608: D/NETWORK(12712): 1.910 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:171 | Socket(addrFamily = 2, type = 1, sockfd = <>)
09-06 15:26:26.608: D/NETWORK(12712): 1.910 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:195 | Connect(sockfd = 122, remoteAddr = 109.123.117.48, remotePort = 9955)
09-06 15:26:26.623: I/THREAD(12712): 1.925 DEBUG THREAD lepDisp common/os/posix/Thread.cc:220 | Thread function exited: lepDisp --> 0x0
09-06 15:26:26.623: D/THREAD(12712): 1.925 TRACE THREAD lepDisp common/os/posix/Thread.cc:368 | Thread::Join() [lepDisp - 5aba2b88 : running]
09-06 15:26:26.623: I/THREAD(12712): 1.926 DEBUG THREAD lepDisp common/os/posix/Thread.cc:372 | [lepDisp - 5aba2b88] Joining thread [lepDisp - 5aba2b88]
09-06 15:26:26.623: I/THREAD(12712): 1.926 DEBUG THREAD lepDisp common/os/posix/Thread.cc:435 | Joined thread lepDisp
09-06 15:26:26.638: W/THREAD(12712): 1.943 HL_DBG THREAD lepDisp common/os/posix/Thread.cc:170 | Thread::~Thread() destroying lepDisp - 0
09-06 15:26:26.638: I/THREAD(12712): 1.943 DEBUG THREAD lepDisp common/os/posix/Thread.cc:319 | Thread::Stop() thread is dead [lepDisp]
09-06 15:26:26.638: D/THREAD(12712): 1.943 TRACE THREAD lepDisp common/os/posix/Thread.cc:368 | Thread::Join() [lepDisp - 0 : not running]
09-06 15:26:26.638: I/THREAD(12712): 1.943 DEBUG THREAD lepDisp common/os/posix/Thread.cc:372 | [lepDisp - 5a798120] Joining thread [lepDisp - 0]
09-06 15:26:26.638: I/THREAD(12712): 1.943 DEBUG THREAD lepDisp common/os/posix/Thread.cc:378 | Thread::Join() thread is dead [lepDisp]
09-06 15:26:26.638: W/THREAD(12712): 1.943 HL_DBG THREAD lepDisp common/os/posix/Thread.cc:182 | Thread::~Thread() destroyed lepDisp - 0 -- started:29 running:24 joined:5
09-06 15:26:26.743: D/NETWORK(12712): 2.047 TRACE NETWORK IpNameServiceImpl common/os/posix/Socket.cc:557 | RecvFrom(sockfd = 115, remoteAddr = <invalid IP address>, remotePort = 9956, buf = <>, len = 1454, received = <>)
09-06 15:26:26.743: I/NETWORK(12712): 2.047 DEBUG NETWORK IpNameServiceImpl common/os/posix/Socket.cc:569 | Received 119 bytes, remoteAddr = 109.123.117.48, remotePort = 9956
09-06 15:26:26.743: W/IPNS(12712): 2.047 HL_DBG IPNS IpNameServiceImpl .../IpNameServiceImpl.cc:3599 | IpNameServiceImpl::Run(): Got IPNS message from "109.123.117.48"
09-06 15:26:26.743: I/NS(12712): 2.048 DEBUG NS IpNameServiceImpl ...on/ns/IpNsProtocol.cc:1576 | Header::Deserialize(): IsAt::Deserialize() answer 0
09-06 15:26:26.743: I/NS(12712): 2.048 DEBUG NS IpNameServiceImpl ...mon/ns/IpNsProtocol.cc:686 | IsAt::Deserialize()
09-06 15:26:26.743: I/NS(12712): 2.048 DEBUG NS IpNameServiceImpl ...mon/ns/IpNsProtocol.cc:733 | IsAt::Deserialize(): G flag 1
09-06 15:26:26.743: I/NS(12712): 2.049 DEBUG NS IpNameServiceImpl ...mon/ns/IpNsProtocol.cc:736 | IsAt::Deserialize(): C flag 1
09-06 15:26:26.743: I/NS(12712): 2.049 DEBUG NS IpNameServiceImpl ...mon/ns/IpNsProtocol.cc:739 | IsAt::Deserialize(): T flag 1

...

09-06 15:27:46.413: I/NS(12712): 81.719 DEBUG NS IpNameServiceImpl ...on/ns/IpNsProtocol.cc:1034 | IsAt::Deserialize(): StringData::Deserialize() name 0
09-06 15:27:46.413: I/NS(12712): 81.719 DEBUG NS IpNameServiceImpl ...emon/ns/IpNsProtocol.cc:81 | StringData::Deserialize()
09-06 15:27:46.418: I/NS(12712): 81.720 DEBUG NS IpNameServiceImpl ...mon/ns/IpNsProtocol.cc:108 | StringData::Deserialize(): com.samsung.contextware.share.acbbc6b64c8cdf53448fbaccdd46441931326180362 from buffer
09-06 15:27:46.418: I/TCP(12712): 81.720 DEBUG TCP IpNameServiceImpl ...aemon/TCPTransport.cc:3714 | TCPTransport::FoundCallback::Found(): busAddr = "r4addr=109.123.117.48,r4port=9955"
09-06 15:27:46.418: I/TCP(12712): 81.720 DEBUG TCP IpNameServiceImpl ...aemon/TCPTransport.cc:3792 | TCPTransport::FoundCallback::Found(): newBusAddr = "tcp:r4addr=109.123.117.48,r4port=9955".
09-06 15:27:46.418: I/TCP(12712): 81.720 DEBUG TCP IpNameServiceImpl ...aemon/TCPTransport.cc:3798 | TCPTransport::FoundCallback::Found(): FoundNames(): tcp:r4addr=109.123.117.48,r4port=9955
09-06 15:27:46.418: D/ALLJOYN_OBJ(12712): 81.721 TRACE ALLJOYN_OBJ IpNameServiceImpl .../daemon/AllJoynObj.cc:3495 | AllJoynObj::FoundNames(busAddr = "tcp:r4addr=109.123.117.48,r4port=9955", guid = "7e08ad9db1299e3a505206188e09b165", names = com.samsung.contextware.share.acbbc6b64c8cdf53448fbaccdd46441931326180362, ttl = 120)
09-06 15:27:46.418: D/THREAD(12712): 81.721 TRACE THREAD IpNameServiceImpl common/os/posix/Thread.cc:333 | Thread::Alert() [NameReaper: running]
09-06 15:27:52.183: I/IFCONFIG(12712): 87.489 DEBUG IFCONFIG IpNameServiceImpl ...posix/IfConfigLinux.cc:563 | IfConfig(): The Linux way
09-06 15:27:52.193: D/NETWORK(12712): 87.495 TRACE NETWORK IpNameServiceImpl common/os/posix/Socket.cc:171 | Socket(addrFamily = 2, type = 2, sockfd = <>)
09-06 15:27:52.193: D/NETWORK(12712): 87.496 TRACE NETWORK IpNameServiceImpl common/os/posix/Socket.cc:272 | Bind(sockfd = 115, localAddr = 0.0.0.0, localPort = 9956)
09-06 15:27:52.198: D/NETWORK(12712): 87.500 TRACE NETWORK IpNameServiceImpl common/os/posix/Socket.cc:171 | Socket(addrFamily = 10, type = 2, sockfd = <>)
09-06 15:27:52.198: D/NETWORK(12712): 87.501 TRACE NETWORK IpNameServiceImpl common/os/posix/Socket.cc:272 | Bind(sockfd = 116, localAddr = ::, localPort = 9956)
09-06 15:27:56.683: I/LOCAL_TRANSPORT(12712): 91.986 DEBUG LOCAL_TRANSPORT replyTimer .../src/LocalTransport.cc:841 | Timed out waiting for METHOD_REPLY with serial 5
09-06 15:27:56.683: W/ALLJOYN(12712): 91.987 HL_DBG ALLJOYN replyTimer ...ore/src/Message_Gen.cc:988 | MarshalMessage: 80+0 ERROR[5] org.alljoyn.Bus.Timeout
09-06 15:27:56.683: D/THREAD(12712): 91.987 TRACE THREAD replyTimer common/os/posix/Thread.cc:333 | Thread::Alert() [lepDisp: running]
09-06 15:27:56.688: W/THREAD(12712): 91.991 HL_DBG THREAD lepDisp common/os/posix/Thread.cc:164 | Thread::Thread() created lepDisp - 0 -- started:29 running:24 joined:5
09-06 15:27:56.688: D/THREAD(12712): 91.993 TRACE THREAD lepDisp common/os/posix/Thread.cc:301 | Thread::Start() [lepDisp] pid = 5cbd8918
09-06 15:27:56.688: I/LOCAL_TRANSPORT(12712): 91.993 DEBUG LOCAL_TRANSPORT lepDisp .../src/LocalTransport.cc:465 | Pushing ERROR[5] org.alljoyn.Bus.Timeout into local endpoint
09-06 15:27:56.688: I/LOCAL_TRANSPORT(12712): 91.993 DEBUG LOCAL_TRANSPORT lepDisp .../src/LocalTransport.cc:703 | LocalEndpoint::RemoveReplyHandler for serial=5
09-06 15:27:56.688: I/LOCAL_TRANSPORT(12712): 91.993 DEBUG LOCAL_TRANSPORT lepDisp ...src/LocalTransport.cc:1018 | Matched reply for serial #5
09-06 15:27:56.688: I/THREAD(12712): 91.993 DEBUG THREAD external common/os/posix/Thread.cc:205 | Thread::RunInternal: lepDisp (pid=5cbd8918)
09-06 15:27:56.688: I/THREAD(12712): 91.994 DEBUG THREAD lepDisp common/os/posix/Thread.cc:216 | Starting thread: lepDisp
09-06 15:27:56.693: I/ALLJOYN(12712): 91.994 DEBUG ALLJOYN lepDisp ...e/src/Message_Parse.cc:647 | Unmarshaled
09-06 15:27:56.693: I/ALLJOYN(12712): <message endianness="LITTLE" type="ERROR" version="1" body_len="0" serial="6">
09-06 15:27:56.693: I/ALLJOYN(12712): <header_fields>
09-06 15:27:56.693: I/ALLJOYN(12712): <header field="ERROR_NAME">
09-06 15:27:56.693: I/ALLJOYN(12712): <string>org.alljoyn.Bus.Timeout</string>
09-06 15:27:56.693: I/ALLJOYN(12712): </header>
09-06 15:27:56.693: I/ALLJOYN(12712): <header field="REPLY_SERIAL">
09-06 15:27:56.693: I/ALLJOYN(12712): <uint32>5</uint32>
09-06 15:27:56.693: I/ALLJOYN(12712): </header>
09-06 15:27:56.693: I/ALLJOYN(12712): <header field="SENDER">
09-06 15:27:56.693: I/ALLJOYN(12712): <string>:_Eds_BwS.2</string>
09-06 15:27:56.693: I/ALLJOYN(12712): </header>
09-06 15:27:56.693: I/ALLJOYN(12712): </header_fields>
09-06 15:27:56.693: I/ALLJOYN(12712): </message>
09-06 15:27:56.693: E/ALLJOYN(12712): 91.996 ****** ERROR ALLJOYN lepDisp .../src/BusAttachment.cc:1477 | org.alljoyn.Bus.JoinSession returned ERROR_MESSAGE (error=org.alljoyn.Bus.Timeout): ER_BUS_REPLY_IS_ERROR_MESSAGE
09-06 15:27:56.693: D/THREAD(12712): 91.996 TRACE THREAD lepDisp common/os/posix/Thread.cc:333 | Thread::Alert() [lepDisp: running]
09-06 15:27:56.693: V/JACK(12712): [ContextSharing]AllJoyn::_join_session(): join session:com.samsung.contextware.share.acbbc6b64c8cdf53448fbaccdd46441931326180362 failed.ER_BUS_REPLY_IS_ERROR_MESSAGE

...

09-06 15:29:31.303: I/RENDEZVOUS_SERVER_CONNECTION(12712): 186.606 DEBUG RENDEZVOUS_SERVER_CONNECTION DiscoveryManager ...tion.cc:72 | RendezvousServerConnection::~RendezvousServerConnection()
09-06 15:29:31.303: I/ICE_DISCOVERY_MANAGER(12712): 186.606 DEBUG ICE_DISCOVERY_MANAGER DiscoveryManager ...ryManager.cc:1158 | Run: Server connect return status = ER_UNABLE_TO_CONNECT_TO_RENDEZVOUS_SERVER
09-06 15:29:31.303: D/PROXIMITY_SCAN_ENGINE(12712): 186.607 TRACE PROXIMITY_SCAN_ENGINE DiscoveryManager ...ScanEngine.cc:320 | ProximityScanEngine::StopScan() called
09-06 15:29:31.303: I/PROXIMITY_SCAN_ENGINE(12712): 186.607 DEBUG PROXIMITY_SCAN_ENGINE DiscoveryManager ...ScanEngine.cc:336 | ProximityScanEngine::StopScan() completed
09-06 15:29:31.303: I/ICE_DISCOVERY_MANAGER(12712): 186.607 DEBUG ICE_DISCOVERY_MANAGER DiscoveryManager ...ryManager.cc:3267 | DiscoveryManager::GetWaitTimeOut()
09-06 15:29:31.303: I/ICE_DISCOVERY_MANAGER(12712): 186.607 DEBUG ICE_DISCOVERY_MANAGER DiscoveryManager ...ryManager.cc:3273 | DiscoveryManager::GetWaitTimeOut(): timeout= 0xffffffff tNow = 0x2d8ef
09-06 15:29:31.303: I/ICE_DISCOVERY_MANAGER(12712): 186.607 DEBUG ICE_DISCOVERY_MANAGER DiscoveryManager ...ryManager.cc:3305 | DiscoveryManager::GetWaitTimeOut(): timeout = -1
09-06 15:29:36.218: E/NETWORK(12712): 191.520 ****** ERROR NETWORK JoinS-1 common/os/posix/Socket.cc:214 | Connecting (sockfd = 122) to 109.123.117.48 9955: 110 - Connection timed out: ER_OS_ERROR
09-06 15:29:36.218: E/TCP(12712): 191.520 ****** ERROR TCP JoinS-1 ...aemon/TCPTransport.cc:2760 | TCPTransport::Connect(): Failed: ER_OS_ERROR
09-06 15:29:36.218: I/ALLJOYN(12712): 191.521 DEBUG ALLJOYN JoinS-1 ...core/src/BusEndpoint.cc:45 | Invalidating endpoint type=0

Log of a successful JoinSession():

09-06 17:03:39.428: W/TCP(7411): 1.731 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2715 | TCPTransport::Connect(): Interface UP with addresss fe80::50cc:f8ff:feac:498f
09-06 17:03:39.428: W/TCP(7411): 1.731 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2713 | TCPTransport::Connect(): Checking interface wlan0
09-06 17:03:39.428: W/TCP(7411): 1.732 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2715 | TCPTransport::Connect(): Interface UP with addresss 109.123.117.49
09-06 17:03:39.428: W/TCP(7411): 1.732 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2713 | TCPTransport::Connect(): Checking interface wlan0
09-06 17:03:39.428: W/TCP(7411): 1.732 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2715 | TCPTransport::Connect(): Interface UP with addresss fe80::52cc:f8ff:feac:498f
09-06 17:03:39.428: D/NETWORK(7411): 1.732 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:171 | Socket(addrFamily = 2, type = 1, sockfd = <>)
09-06 17:03:39.428: D/NETWORK(7411): 1.733 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:195 | Connect(sockfd = 120, remoteAddr = 109.123.117.129, remotePort = 9955)
09-06 17:03:39.438: I/THREAD(7411): 1.742 DEBUG THREAD lepDisp common/os/posix/Thread.cc:220 | Thread function exited: lepDisp --> 0x0
09-06 17:03:39.438: D/THREAD(7411): 1.742 TRACE THREAD lepDisp common/os/posix/Thread.cc:368 | Thread::Join() [lepDisp - 5cbd2f38 : running]
09-06 17:03:39.438: I/THREAD(7411): 1.743 DEBUG THREAD lepDisp common/os/posix/Thread.cc:372 | [lepDisp - 5cbd2f38] Joining thread [lepDisp - 5cbd2f38]
09-06 17:03:39.438: I/THREAD(7411): 1.743 DEBUG THREAD lepDisp common/os/posix/Thread.cc:435 | Joined thread lepDisp
09-06 17:03:39.448: W/THREAD(7411): 1.754 HL_DBG THREAD lepDisp common/os/posix/Thread.cc:170 | Thread::~Thread() destroying lepDisp - 0
09-06 17:03:39.448: I/THREAD(7411): 1.754 DEBUG THREAD lepDisp common/os/posix/Thread.cc:319 | Thread::Stop() thread is dead [lepDisp]
09-06 17:03:39.448: D/THREAD(7411): 1.754 TRACE THREAD lepDisp common/os/posix/Thread.cc:368 | Thread::Join() [lepDisp - 0 : not running]
09-06 17:03:39.448: I/THREAD(7411): 1.754 DEBUG THREAD lepDisp common/os/posix/Thread.cc:372 | [lepDisp - 5aba3450] Joining thread [lepDisp - 0]
09-06 17:03:39.448: I/THREAD(7411): 1.754 DEBUG THREAD lepDisp common/os/posix/Thread.cc:378 | Thread::Join() thread is dead [lepDisp]
09-06 17:03:39.448: W/THREAD(7411): 1.754 HL_DBG THREAD lepDisp common/os/posix/Thread.cc:182 | Thread::~Thread() destroyed lepDisp - 0 -- started:28 running:24 joined:4
09-06 17:03:39.498: D/NETWORK(7411): 1.802 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:467 | Send(sockfd = 120, *buf = <>, len = 1, sent = <>)
09-06 17:03:39.498: W/THREAD(7411): 1.802 HL_DBG THREAD JoinS-1 common/os/posix/Thread.cc:164 | Thread::Thread() created auth - 0 -- started:28 running:24 joined:4
09-06 17:03:39.498: I/ALLJOYN(7411): 1.802 DEBUG ALLJOYN JoinS-1 ...re/src/EndpointAuth.cc:370 | EndpointAuth::Establish authMechanisms="ANONYMOUS"
09-06 17:03:39.498: I/ALLJOYN_AUTH(7411): 1.802 DEBUG ALLJOYN_AUTH JoinS-1 ...core/src/SASLEngine.cc:699 | SASL Responder mechanisms ANONYMOUS
09-06 17:03:39.498: I/ALLJOYN_AUTH(7411): 1.803 DEBUG ALLJOYN_AUTH JoinS-1 ...core/src/SASLEngine.cc:275 | Responder starting auth conversation ANONYMOUS
09-06 17:03:39.498: I/ALLJOYN_AUTH(7411): 1.803 DEBUG ALLJOYN_AUTH JoinS-1 ...core/src/SASLEngine.cc:294 | Current authSet ANONYMOUS
09-06 17:03:39.498: I/ALLJOYN_AUTH(7411): 1.803 DEBUG ALLJOYN_AUTH JoinS-1 ...core/src/SASLEngine.cc:220 | Initialized authMechanism ANONYMOUS
09-06 17:03:39.498: I/ALLJOYN_AUTH(7411): 1.803 DEBUG ALLJOYN_AUTH JoinS-1 ...core/src/SASLEngine.cc:224 | New Responder state WAIT_FOR_DATA
09-06 17:03:39.498: I/ALLJOYN_AUTH(7411): 1.803 DEBUG ALLJOYN_AUTH JoinS-1 ...core/src/SASLEngine.cc:452 | Responder sending AUTH ANONYMOUS
09-06 17:03:39.498: D/NETWORK(7411): 1.803 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:467 | Send(sockfd = 120, *buf = <>, len = 16, sent = <>)
09-06 17:03:39.498: I/ALLJOYN(7411): 1.803 DEBUG ALLJOYN JoinS-1 ...re/src/EndpointAuth.cc:433 | Sent AUTH ANONYMOUS
09-06 17:03:39.498: D/NETWORK(7411): 1.803 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:527 | Recv(sockfd = 120, buf = <>, len = 1, received = <>)
09-06 17:03:39.568: D/NETWORK(7411): 1.869 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:527 | Recv(sockfd = 120, buf = <>, len = 1, received = <>)
09-06 17:03:39.568: D/NETWORK(7411): 1.870 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:527 | Recv(sockfd = 120, buf = <>, len = 1, received = <>)
09-06 17:03:39.568: D/NETWORK(7411): 1.870 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:527 | Recv(sockfd = 120, buf = <>, len = 1, received = <>)

The relevant code is as follows:

QStatus TCPTransport::Connect(const char* connectSpec, const SessionOpts& opts, BusEndpoint& newEp)
{
    QCC_DbgHLPrintf(("TCPTransport::Connect(): %s", connectSpec));

    QStatus status;
    bool isConnected = false;

    ...

    SocketFd sockFd = -1;
    status = Socket(QCC_AF_INET, QCC_SOCK_STREAM, sockFd);
    if (status == ER_OK) {
        /* Turn off Nagle */
        status = SetNagle(sockFd, false);
    }

    if (status == ER_OK) {
        /*
         * We got a socket, now tell TCP to connect to the remote address and
         * port.
         */
        status = qcc::Connect(sockFd, ipAddr, port);
        if (status == ER_OK) {
            /*
             * We now have a TCP connection established, but DBus (the wire
             * protocol which we are using) requires that every connection,
             * irrespective of transport, start with a single zero byte. This
             * is so that the Unix-domain socket transport used by DBus can pass
             * SCM_RIGHTS out-of-band when that byte is sent.
             */
            uint8_t nul = 0;
            size_t sent;

            status = Send(sockFd, &nul, 1, sent);
            if (status != ER_OK) {
                QCC_LogError(status, ("TCPTransport::Connect(): Failed to send initial NUL byte"));
            }
            isConnected = true;
        } else {
            QCC_LogError(status, ("TCPTransport::Connect(): Failed"));
        }
    } else {
        QCC_LogError(status, ("TCPTransport::Connect(): qcc::Socket() failed"));
    }

}

QStatus Connect(SocketFd sockfd, const IPAddress& remoteAddr, uint16_t remotePort)
{
    QStatus status = ER_OK;
    int ret;
    struct sockaddr_storage addr;
    socklen_t addrLen = sizeof(addr);

    QCC_DbgTrace(("Connect(sockfd = %d, remoteAddr = %s, remotePort = %hu)",
                  sockfd, remoteAddr.ToString().c_str(), remotePort));

    status = MakeSockAddr(remoteAddr, remotePort, &addr, addrLen);
    if (status != ER_OK) {
        return status;
    }

    ret = connect(static_cast<int>(sockfd), reinterpret_cast<struct sockaddr*>(&addr), addrLen);
    if (ret == -1) {
        if ((errno == EINPROGRESS) || (errno == EALREADY)) {
            status = ER_WOULDBLOCK;
        } else if (errno == EISCONN) {
            status = ER_OK;
        } else if (errno == ECONNREFUSED) {
            status = ER_CONN_REFUSED;
        } else {
            status = ER_OS_ERROR;
            QCC_LogError(status, ("Connecting (sockfd = %u) to %s %d: %d - %s", sockfd,
                                  remoteAddr.ToString().c_str(), remotePort,
                                  errno, strerror(errno)));
        }
    } else {
        int flags = fcntl(sockfd, F_GETFL, 0);
        ret = fcntl(sockfd, F_SETFL, flags | O_NONBLOCK);

        if (ret == -1) {
            status = ER_OS_ERROR;
            QCC_LogError(status, ("Connect fcntl (sockfd = %u) to O_NONBLOCK: %d - %s", sockfd, errno, strerror(errno)));
            /* Higher level code is responsible for closing the socket */
        }
    }

    return status;
}

Based on the logs, I put together a simple side-by-side comparison of the two flows (time in seconds, JoinSession() timeout case versus JoinSession() success case):

Time 0 s
  Timeout case: connection starts
    Connect(sockfd = 122, remoteAddr = 109.123.117.48, remotePort = 9955)
  Success case: connection starts
    Connect(sockfd = 120, remoteAddr = 109.123.117.129, remotePort = 9955)

Time 0 s
  Success case: connection established, data is sent
    Send(sockfd = 120, *buf = <>, len = 1, sent = <>)

Time 0 s
  Success case: data is received
    Recv(sockfd = 120, buf = <>, len = 1, received = <>)

Time 90 s
  Timeout case: JoinSession() timeout
    LocalTransport.cc:841 | Timed out waiting for METHOD_REPLY with serial 5
    BusAttachment.cc:1477 | org.alljoyn.Bus.JoinSession returned ERROR_MESSAGE (error=org.alljoyn.Bus.Timeout): ER_BUS_REPLY_IS_ERROR_MESSAGE

Time 190 s
  Timeout case: tcp connect() timeout
    Connecting (sockfd = 122) to 109.123.117.48 9955: 110 - Connection timed out: ER_OS_ERROR
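
The 90 s and 190 s marks in the comparison can be recomputed from the relative timestamp that AllJoyn prints right after the logcat tag (1.910, 91.986 and 191.520 in the timeout log). The sketch below is illustrative only: RelativeSeconds and SynBackoffTotal are hypothetical helpers, not AllJoyn APIs. The second helper checks an assumption that the log itself does not confirm, namely that the roughly 189.6 s wait is the kernel retransmitting an unanswered SYN with a 3 s initial timeout doubled over 5 retries, a schedule that older Linux kernels are commonly described as using.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns the AllJoyn relative timestamp (seconds since
// start) from one logcat line, i.e. the first token after the "): " that
// closes the logcat tag. Returns -1.0 when the line carries no timestamp
// (for example the XML message dump lines).
double RelativeSeconds(const std::string& line) {
    std::string::size_type pos = line.find("): ");
    if (pos == std::string::npos) {
        return -1.0;
    }
    std::istringstream in(line.substr(pos + 3));
    double seconds = 0.0;
    in >> seconds;
    return in.fail() ? -1.0 : seconds;
}

// Hypothetical helper: total time a blocking connect() would wait if every SYN
// went unanswered, assuming the timeout starts at initialRtoSeconds and
// doubles after each of the given number of retries.
int SynBackoffTotal(int initialRtoSeconds, int retries) {
    int total = 0;
    int rto = initialRtoSeconds;
    for (int attempt = 0; attempt <= retries; ++attempt) {
        total += rto;  // wait for this attempt before retrying or giving up
        rto *= 2;      // exponential backoff
    }
    return total;
}
```

Applied to the Connect() line, the "Timed out waiting for METHOD_REPLY" line and the "Connection timed out" line of the timeout log, the timestamps differ by about 90.1 s and 189.6 s. SynBackoffTotal(3, 5) evaluates to 3 + 6 + 12 + 24 + 48 + 96 = 189, which is at least consistent with the SYN packets never being answered.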

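The comparison above shows that the root cause of the JoinSession() timeout is that the TCP connect() system call blocks until it times out, that is, the following line: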

ret = connect(static_cast<int>(sockfd), reinterpret_cast<struct sockaddr*>(&addr), addrLen);
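
For context, a blocking connect() returns only when the kernel gives up, which is why the JoinS-1 thread was still stuck long after the 90 s method-reply timeout had fired. A common POSIX pattern for bounding that wait is a non-blocking connect() followed by poll() and an SO_ERROR check. The sketch below is a minimal standalone illustration of that pattern; ConnectWithTimeout and FailAndClose are hypothetical helpers, not AllJoyn functions and not how TCPTransport is implemented, and they would not fix the underlying connectivity problem.

```cpp
#include <arpa/inet.h>
#include <cerrno>
#include <cstring>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <stdint.h>
#include <sys/socket.h>
#include <unistd.h>

// Closes fd and reports failure with the given errno value.
static int FailAndClose(int fd, int err) {
    close(fd);
    errno = err;
    return -1;
}

// Hypothetical helper: opens a TCP connection to ipv4:port and gives up after
// timeoutMs milliseconds. Returns a connected, non-blocking socket, or -1 with
// errno set (ETIMEDOUT when the deadline passes).
int ConnectWithTimeout(const char* ipv4, uint16_t port, int timeoutMs) {
    sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ipv4, &addr.sin_addr) != 1) {
        errno = EINVAL;
        return -1;
    }

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1) {
        return -1;
    }

    // Go non-blocking before connect() so the call returns immediately.
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) {
        return FailAndClose(fd, errno);
    }

    if (connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) == 0) {
        return fd;  // connected at once, which can happen on loopback
    }
    if (errno != EINPROGRESS) {
        return FailAndClose(fd, errno);
    }

    // Wait until the socket becomes writable or the deadline passes.
    // Simplification: an interrupted poll() restarts with the full timeout.
    struct pollfd pfd;
    pfd.fd = fd;
    pfd.events = POLLOUT;
    pfd.revents = 0;
    int ready;
    do {
        ready = poll(&pfd, 1, timeoutMs);
    } while (ready == -1 && errno == EINTR);
    if (ready == 0) {
        return FailAndClose(fd, ETIMEDOUT);
    }
    if (ready == -1) {
        return FailAndClose(fd, errno);
    }

    // Writable does not mean connected: fetch the deferred connect() result.
    int soErr = 0;
    socklen_t len = sizeof(soErr);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &soErr, &len) == -1) {
        soErr = errno;
    }
    if (soErr != 0) {
        return FailAndClose(fd, soErr);
    }
    return fd;
}
```

A call such as ConnectWithTimeout("109.123.117.48", 9955, 5000) could serve as a quick reachability probe before JoinSession(), failing within about 5 s rather than minutes when the peer does not answer.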

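So the trouble spot is found, but why does connect() time out? From general TCP programming experience, the most likely cause is a problem with the network or on the server side.

What kind of problem? I don't know yet.

I have just installed tcpdump on the Android device; a packet capture next week should make it clear!

Time: 2024-10-12 00:19:44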
