1. Find the exception in the main log, for example:
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: FATAL EXCEPTION: main
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: Process: com.android.bluetooth, PID: 5023
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: java.lang.RuntimeException: Unable to start receiver com.android.bluetooth.opp.BluetoothOppHandoverReceiver: java.lang.RuntimeException: Adding window failed
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: at android.app.ActivityThread.handleReceiver(ActivityThread.java:2913)
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: at android.app.ActivityThread.access$1700(ActivityThread.java:177)
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1611)
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: at android.os.Handler.dispatchMessage(Handler.java:111)
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: at android.os.Looper.loop(Looper.java:194)
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: at android.app.ActivityThread.main(ActivityThread.java:5733)
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: at java.lang.reflect.Method.invoke(Native Method)
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: at java.lang.reflect.Method.invoke(Method.java:372)
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:959)
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:754)
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: Caused by: java.lang.RuntimeException: Adding window failed
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: at android.view.ViewRootImpl.setView(ViewRootImpl.java:668)
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: at android.view.WindowManagerGlobal.addView(WindowManagerGlobal.java:289)
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: at android.view.WindowManagerImpl.addView(WindowManagerImpl.java:85)
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: at android.app.Dialog.show(Dialog.java:311)
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: at android.app.AlertDialog.show(AlertDialog.java:1127)
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: at com.android.bluetooth.opp.MzBluetoothTurnOffPromptDialog.showDialog(MzBluetoothTurnOffPromptDialog.java:64)
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: at com.android.bluetooth.opp.BluetoothOppHandoverReceiver.onReceive(BluetoothOppHandoverReceiver.java:251)
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: at android.app.ActivityThread.handleReceiver(ActivityThread.java:2906)
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: ... 9 more
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: Caused by: android.os.TransactionTooLargeException
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: at android.os.BinderProxy.transactNative(Native Method)
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: at android.os.BinderProxy.transact(Binder.java:504)
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: at android.view.IWindowSession$Stub$Proxy.addToDisplay(IWindowSession.java:768)
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: at android.view.ViewRootImpl.setView(ViewRootImpl.java:657)
    08-20 11:05:19.754 5023 5023 E AndroidRuntime: ... 16 more
The trace shows that the crash was caused by a TransactionTooLargeException thrown during a Binder transaction. To narrow the problem down further, we need the kernel log.
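As a rough aid for this step, the sketch below pulls the crashing PID and the chain of exceptions out of a logcat dump, so the innermost "Caused by" is easy to spot. It assumes the threadtime layout seen in the log above (date, time, PID, TID, level, tag); the names `LINE_RE` and `crash_summary` are made up for this illustration.

```python
import re

# Assumed layout: "MM-DD HH:MM:SS.mmm PID TID LEVEL TAG: message"
LINE_RE = re.compile(
    r"^\s*(\d\d-\d\d) (\d\d:\d\d:\d\d\.\d+)\s+(\d+)\s+(\d+)\s+([VDIWEF])\s+([^:]+):\s?(.*)$"
)


def crash_summary(lines):
    """Return (pid, exception_chain) from AndroidRuntime log lines.

    exception_chain lists the top-level exception first and the
    innermost "Caused by" last.
    """
    pid = None
    chain = []
    for raw in lines:
        match = LINE_RE.match(raw)
        if not match or match.group(6).strip() != "AndroidRuntime":
            continue
        message = match.group(7).strip()
        if message.startswith("Process:"):
            pid_match = re.search(r"PID:\s*(\d+)", message)
            if pid_match:
                pid = int(pid_match.group(1))
        elif message.startswith("Caused by:"):
            chain.append(message[len("Caused by:"):].strip())
        elif not chain and re.match(r"[\w.$]+(Exception|Error)\b", message):
            # First exception line after "FATAL EXCEPTION"
            chain.append(message)
    return pid, chain
```

The last element of the returned chain is the exception to investigate, and the PID is the keyword to search for in the kernel log.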
2. Search the kernel log for the keywords "5023" and "binder", for example:
    Line 18110: <6>[72111.734216]<0> (2)[1546:Binder_9]binder: 812:1546 to 5023 failed due to 5023 no unused fd available(5023:droid.bluetooth fd leak?), -24
    Line 18112: <6>[72111.734239]<0> (2)[1546:Binder_9]binder: send failed reply for transaction 5183251 to 5023:5023
    Line 18114: <3>[72111.734447]<0> (1)[5023:droid.bluetooth]binder: read put err 29201 to user 00000000f4f2f708, thread error 29201:29185
    Line 26440: <7>[72131.351513]<1>-(2)[1634:Binder_C][1634:Binder_C] sig 9 to [5023:droid.bluetooth] stat=x
This shows that the error occurred while the Binder reply was being delivered: the file descriptor could not be installed.
The corresponding code, in the driver file binder.c:
    target_fd = task_get_unused_fd_flags(target_proc, O_CLOEXEC);
    if (target_fd < 0) {
        fput(file);
    #ifdef MTK_BINDER_DEBUG
        binder_user_error("%d:%d to %d failed due to %d no unused fd available(%d:%s fd leak?), %d\n",
                          proc->pid, thread->pid, target_proc->pid, target_proc->pid,
                          target_proc->pid,
                          target_proc->tsk ? target_proc->tsk->comm : "", target_fd);
    #endif
        return_error = BR_FAILED_REPLY;
        goto err_get_unused_fd_failed;
    }
    task_fd_install(target_proc, target_fd, file);
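The message format in the driver snippet above is regular enough to pick apart with a script. The following is a minimal sketch, assuming the kernel log contains lines in exactly that MTK debug format; `FD_FAIL_RE` and `fd_install_failures` are hypothetical names. It drops repeated search hits and turns the trailing negative number into an errno name.

```python
import errno
import re

# Mirrors the binder_user_error format string quoted above.
FD_FAIL_RE = re.compile(
    r"binder: (\d+):(\d+) to (\d+) failed due to \d+ no unused fd available"
    r"\((\d+):([^ ]+) fd leak\?\s*\), (-?\d+)"
)


def fd_install_failures(kernel_lines, pid):
    """Return one record per distinct fd-install failure targeting `pid`."""
    seen = set()
    records = []
    for line in kernel_lines:
        match = FD_FAIL_RE.search(line)
        if not match or int(match.group(3)) != pid:
            continue
        key = match.groups()
        if key in seen:  # the same hit often appears several times
            continue
        seen.add(key)
        code = -int(match.group(6))  # the driver logs a negative errno
        records.append({
            "from_pid": int(match.group(1)),
            "from_tid": int(match.group(2)),
            "to_pid": int(match.group(3)),
            "comm": match.group(5),
            "errno": errno.errorcode.get(code, str(code)),
        })
    return records
```

For the log in this article, the single record names PID 812 as the sender of the reply and reports the error as EMFILE.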
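3. The log above shows that the Binder driver hit error 24 while installing a file descriptor for the Bluetooth process. This error (EMFILE, logged as -24) means the calling process already has too many open file descriptors and cannot create another one. The descriptor carried in this Binder transaction therefore could not be installed in the caller, and the transaction failed. The upper layer does not distinguish between the error types returned by the Binder driver and simply throws a TransactionTooLargeException. So when a TransactionTooLargeException is reported, the kernel log is usually needed to pin down the real cause. Many people assume that TransactionTooLargeException means the data in that Binder transaction was too large, but that is not always the case.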
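To make error 24 concrete, here is a small sketch that reproduces it in an ordinary Linux process, unrelated to Binder: it lowers the soft limit on open descriptors, opens files until `open()` fails, and reports the errno. The function name `open_until_failure` and the limit of 64 are arbitrary choices for this illustration.

```python
import errno
import os
import resource


def open_until_failure(limit=64):
    """Lower the soft RLIMIT_NOFILE, then open files until open() fails.

    Returns (errno_value, number_of_files_opened). The original limit is
    restored and every opened descriptor is closed before returning.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        limit = min(limit, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (limit, hard))
    opened = []
    try:
        while True:
            opened.append(os.open(os.devnull, os.O_RDONLY))
    except OSError as exc:
        return exc.errno, len(opened)
    finally:
        for fd in opened:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))


if __name__ == "__main__":
    code, count = open_until_failure()
    print(errno.errorcode[code], os.strerror(code), "after", count, "opens")
```

A process that leaks descriptors reaches the same state on its own, at which point any operation that needs a new descriptor fails, including receiving one over Binder.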
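4. Why is a file descriptor passed during addToDisplay? An application adds a window to WindowManagerService by calling the addToDisplay interface. WindowManagerService creates a socket for every newly added window, which yields two file descriptors: the read end and the write end of the socket. The write end is handed to InputDispatcher, and the read end is returned to the application process in the reply to this Binder transaction.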
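The two descriptors per window can be illustrated with a plain socket pair. This is not WindowManagerService code, only a sketch of the underlying mechanism on a Unix-like system; `demo_socket_pair` is a hypothetical name, and the payload is a stand-in for an input event.

```python
import socket


def demo_socket_pair():
    """Create a connected socket pair and pass one message across it.

    Returns ((writer_fd, reader_fd), received_bytes).
    """
    writer, reader = socket.socketpair()
    try:
        fds = (writer.fileno(), reader.fileno())
        writer.sendall(b"input event")  # the side that would feed events
        data = reader.recv(64)          # the side the application would read
        return fds, data
    finally:
        writer.close()
        reader.close()
```

Each call consumes two descriptors until both ends are closed, which is why the receiving process needs a free descriptor slot for its end.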
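5. The next step is to track down the descriptor leak. Run a stress test and, every so often, run the two commands below (with pid replaced by the ID of the process under test) to see which descriptors keep growing.
    cd /proc/pid/fd/
    ls -l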
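Comparing listings by eye gets tedious over a long stress test. The sketch below assumes two saved outputs of `ls -l` taken at different times, with each entry shown as `... N -> target`; it groups the targets into coarse types and reports which types grew. The names `classify`, `count_by_type` and `growth` are made up for this example.

```python
from collections import Counter


def classify(target):
    """Reduce a /proc/<pid>/fd link target to a coarse type."""
    for prefix in ("socket:", "pipe:", "anon_inode:"):
        if target.startswith(prefix):
            return prefix.rstrip(":")
    return target  # regular files and devices keep their full path


def count_by_type(ls_output):
    """Count descriptors by type from the text of one `ls -l` listing."""
    counts = Counter()
    for line in ls_output.splitlines():
        if " -> " not in line:
            continue  # header lines such as "total 0"
        target = line.split(" -> ", 1)[1].strip()
        counts[classify(target)] += 1
    return counts


def growth(before, after):
    """Return (type, increase) pairs, largest increase first."""
    return (after - before).most_common()
```

Calling `growth(count_by_type(first), count_by_type(second))` on two listings shows at a glance whether sockets, pipes or a particular file account for the increase.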
6. In the end, the cause turned out to be a socket leak that occurs when Bluetooth transfers a file smaller than 3 MB.