在linux系统中,单处理器也是多线程处理信号、事件等。这就需要一个核心算法来进行进程调度。这个算法就是CFS(Completely Fair Scheduler)。在 LInux Kernel Development 一书中用一句话总结CFS进程调度:
运行rbtree树中最左边叶子节点所代表的那个进程。
在一个自平衡二叉搜索树红黑树rbtree的树节点中,存储了下一个应该运行进程的数据。在这里我们看到了二叉搜索树的完美运用。具体可参见Introduction to Algorithms Page 174~182。
而进程调度的主要入口函数就是schedule()。它定义在文件kernel/sched.c中。
我们先看一个在等待队列中进行进程调度的例子:
DEFINE_WAIT(wait); //申明等待队列
add_wait_queue(q,&wait); //把我们用的q队列加入到wait等待队列中
while(!condition){ //当等待事件没有来临时
prepare_to_wait(&q,&wait,TASK_INTERRUPTIBLE);
//将q从TASK_RUNNING或者其他状态置为TASK_INTERRUPTIBLE不可运行的休眠状态。
//同时接受信号&&事件来唤醒它
if(signal_pending(current)) //如果有来自从处理器的信号
{ processingsignal();}//处理信号
schedule(); //调用红黑树中的下一个进程
}
finish_wait(&q,&wait); //将进程设置为TASK_RUNNING并移出等待队列.
其实我们可以这么理解这段代码。现在有一个任务要等待事件到来才能运行,怎么实现呢?就是阻塞加查询。但是这样会使得这段代码独占整个操作系统。为了解决这个问题,就在阻塞查询之中加入了队列和进程调度schedule(),从而不耽误其它线程的执行。
再来看一看schedule()函数的结构:
schedule()函数结构
asmlinkage void __sched schedule(void) ///定义通过堆栈传值
{
struct task_struct *prev, *next;
unsigned long *switch_count;
struct rq *rq;
int cpu;
/*At the end of this function, it will check if need_resched() return
true, if that indeed happen, then goto here.*/
need_resched:
/*current process won‘t be preempted after call preemept_disable()*/
preempt_disable(); //不让优先占有当前进程
cpu = smp_processor_id();
rq = cpu_rq(cpu);
/* rcu_sched_qs ? */
rcu_sched_qs(cpu);
/* prev point to current task_struct */
prev = rq->curr;
/* get current task_struct‘s context switch count */
switch_count = &prev->nivcsw;
/* kernel_flag is "the big kernel lock".
* This spinlock is taken and released recursively by lock_kernel()
* and unlock_kernel(). It is transparently dropped and reacquired
* over schedule(). It is used to protect legacy code that hasn‘t
* been migrated to a proper locking design yet.
* In task_struct, there is a member lock_depth, which is inited -1,
* indicates that the current task have no kernel lock.
* When lock_depth >=0 indicate that it own kernel lock.
* During context switching, it is not permitted that the task
* switched away remain own kernel lock , so in scedule(),it
* call release_kernel_lock(), release kernel lock.
*/
release_kernel_lock(prev);
need_resched_nonpreemptible:
schedule_debug(prev);
if (sched_feat(HRTICK))
hrtick_clear(rq);
/* occupy current rq‘s lock */
raw_spin_lock_irq(&rq->lock); //占有rq自旋锁
/* update rq‘s clock,this function will call sched_clock_cpu() */
update_rq_clock(rq);
/* clear bit in task_struct‘s thread_struct‘s flag TIF_NEED_RESCHED.
* In case that it will be rescheduled, because it prepare to give
* up cpu.
*/
clear_tsk_need_resched(prev);
if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
if (unlikely(signal_pending_state(prev->state, prev)))
prev->state = TASK_RUNNING;
else
deactivate_task(rq, prev, 1);
switch_count = &prev->nvcsw;
}
/* For none-SMP, pre_schedule is NULL */
pre_schedule(rq, prev);
if (unlikely(!rq->nr_running))
idle_balance(cpu, rq);
put_prev_task(rq, prev);
next = pick_next_task(rq);
if (likely(prev != next)) {
sched_info_switch(prev, next);
perf_event_task_sched_out(prev, next);
rq->nr_switches++;
rq->curr = next;
++*switch_count;
context_switch(rq, prev, next); /* unlocks the rq */
/*
* the context switch might have flipped the stack from under
* us, hence refresh the local variables.
*/
cpu = smp_processor_id();
rq = cpu_rq(cpu);
} else
raw_spin_unlock_irq(&rq->lock);//current task still occupy cpu
post_schedule(rq);
if (unlikely(reacquire_kernel_lock(current) < 0)) {
prev = rq->curr;
switch_count = &prev->nivcsw;
goto need_resched_nonpreemptible;
}
preempt_enable_no_resched();
if (need_resched())
goto need_resched;
}
EXPORT_SYMBOL(schedule);
schedule()函数的目的在于用另一个进程替换当前正在运行的进程。因此,这个函数的主要结果就是设置一个名为next的变量,以便它指向所选中的 代替current的进程的描述符。如果在系统中没有可运行进程的优先级大于current的优先级,那么,结果是next与current一致,没有进程切换发生。
References
[1].UNDERSTANDING THE LINUX KERNEL. Page 276
[2].Linux Kernel Development. Page 52
[3].http://hi.baidu.com/zengzhaonong/item/20d9e8207b04cb8f6e2cc323