call_rcu may cause a kernel panic when a kernel module exits

http://paulmck.livejournal.com/7314.html

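Paul, the author of RCU, discusses this problem on his blog. He states explicitly that the module-exit path must call rcu_barrier() to wait until every callback queued with call_rcu() has finished running, and only then unload the module. This prevents the case where the module is unloaded quickly and a pending callback then hits an invalid (null) pointer, causing a kernel panic.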

RCU and unloadable modules

  • Jun. 8th, 2009 at 1:38 PM

The rcu_barrier() function was described some time back in an article on Linux Weekly News. This rcu_barrier() function solves the problem where a given module invokes call_rcu() using a function in that module, but the module is removed before the corresponding grace period elapses, or at least before the callback can be invoked. This results in an attempt to call a function whose code has been removed from the Linux kernel. Oops!!!

Since the above article was written, rcu_barrier_bh() and rcu_barrier_sched() have been accepted into the Linux kernel, for use with call_rcu_bh() and call_rcu_sched(), respectively. These functions have seen relatively little use, which is no surprise, given that they are quite specialized. However, Jesper Dangaard recently discovered that they need to be used a bit more heavily. This led to the question of exactly when they needed to be used, to which I responded as follows:

Unless there is some other mechanism to ensure that all the RCU callbacks have been invoked before the module exit, there needs to be code in the module-exit function that does the following:

  1. Prevents any new RCU callbacks from being posted. In other words, make sure that no future call_rcu() invocations happen from this module unless those call_rcu() invocations touch only functions and data that outlive this module.
  2. Invokes rcu_barrier().
  3. Of course, if the module uses call_rcu_sched() instead of call_rcu(), then it should invoke rcu_barrier_sched() instead of rcu_barrier(). Similarly, if it uses call_rcu_bh() instead of call_rcu(), then it should invoke rcu_barrier_bh() instead of rcu_barrier(). If the module uses more than one of call_rcu(), call_rcu_sched(), and call_rcu_bh(), then it must invoke more than one of rcu_barrier(), rcu_barrier_sched(), and rcu_barrier_bh().
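The ordering in the steps above can be sketched in plain userspace C. This is only a toy model under loud assumptions: toy_call_rcu() and toy_rcu_barrier() are hypothetical single-threaded stand-ins that mimic the shape of the kernel's call_rcu() and rcu_barrier(), and struct item, item_retire(), item_free_cb(), and toy_module_exit() are illustrative names that do not come from the original post. Real RCU involves grace periods and multiple CPUs, none of which is modeled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for the kernel's struct rcu_head (assumption, not kernel code). */
struct rcu_head {
    struct rcu_head *next;
    void (*func)(struct rcu_head *head);
};

/* Callbacks that have been queued but not yet invoked. */
static struct rcu_head *pending;

/* Hypothetical stand-in for call_rcu(): queue the callback, do not run it yet. */
static void toy_call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *))
{
    head->func = func;
    head->next = pending;
    pending = head;
}

/* Hypothetical stand-in for rcu_barrier(): return only after every queued
 * callback has been invoked. */
static void toy_rcu_barrier(void)
{
    while (pending) {
        struct rcu_head *head = pending;

        pending = head->next;   /* unlink first: the callback frees 'head' */
        head->func(head);
    }
}

/* ---- "module" code: everything below would live in the unloadable module ---- */

struct item {
    int value;
    struct rcu_head rcu;
};

static int stopping;   /* set once module exit has begun */
static int freed;      /* number of callbacks that have run */

/* The RCU callback: this function's code disappears when the module does. */
static void item_free_cb(struct rcu_head *head)
{
    struct item *it = (struct item *)((char *)head - offsetof(struct item, rcu));

    free(it);
    freed++;
}

/* Post a callback unless the module is shutting down. On -1 the caller
 * still owns 'it' and must free it some other way. */
static int item_retire(struct item *it)
{
    if (stopping)
        return -1;
    toy_call_rcu(&it->rcu, item_free_cb);
    return 0;
}

static void toy_module_exit(void)
{
    stopping = 1;        /* step 1: no new callbacks can be posted */
    toy_rcu_barrier();   /* step 2: wait for all callbacks already posted */
    /* Only at this point would it be safe for the module's code to go away. */
}
```

In a real module the same two steps would appear in the function registered with module_exit(), using the actual rcu_barrier() variant that matches the call_rcu() variant the module used.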

What other mechanism could be used? I cannot think of one that is safe. For example, a module that tried to count the number of RCU callbacks in flight would be vulnerable to races as follows:

  1. CPU 0: RCU callback decrements the counter.
  2. CPU 1: module-exit function notices that the counter is zero, so removes the module.
  3. CPU 0: attempts to execute the code returning from the RCU callback, and dies horribly due to that code no longer being in memory.

If there was an easy solution (or even a hard solution) to this problem, then I do not believe that Nikita Danilov would have asked Dipankar Sarma for rcu_barrier(). Therefore, I do not expect anyone to be able to come up with an alternative to rcu_barrier() and friends. Always happy to learn something by being proven wrong, of course!!!

So unless someone can show me some other safe mechanism, every unloadable module that uses call_rcu(), call_rcu_sched(), or call_rcu_bh() must use rcu_barrier(), rcu_barrier_sched(), and/or rcu_barrier_bh() in its module-exit function.

So if you have a module that uses one of the call_rcu() functions, please use the corresponding rcu_barrier() function in the module-exit code!

Update: Peter Zijlstra rightly points out that the issue is not whether your module invokes call_rcu(), but rather whether the corresponding RCU callback invokes a function that is in a module. So, if there is a call_rcu(), call_rcu_sched(), or call_rcu_bh() anywhere in the kernel whose RCU callback either directly or indirectly invokes a function in your module, then your module's exit function needs to invoke rcu_barrier(), rcu_barrier_sched(), and/or rcu_barrier_bh(). Thanks to Peter for pointing this out!

Time: 2024-10-18 08:17:07
