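In a sense, start_kernel plays the role that main plays in an ordinary executable. By the time the system enters this function, a minimal amount of initialization has already been done; going further back leads into a lot of hardware-specific and language-level detail. What happens here is higher-level initialization, mostly written in C. I have long wanted to understand the kernel's initialization flow so as to gain a deeper understanding of the Linux kernel as a whole. When analyzing a program, the habit is to look for main first, so let's start from start_kernel.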
The function is defined in init/main.c:
    asmlinkage void __init start_kernel(void)
    {
        char * command_line;
        extern const struct kernel_param __start___param[], __stop___param[];

        /*
         * Need to run as early as possible, to initialize the
         * lockdep hash:
         */
        lockdep_init();
        smp_setup_processor_id();
        debug_objects_early_init();
Let's go through them one by one:
    void lockdep_init(void)
    {
        int i;

        /*
         * Some architectures have their own start_kernel()
         * code which calls lockdep_init(), while we also
         * call lockdep_init() from the start_kernel() itself,
         * and we want to initialize the hashes only once:
         */
        if (lockdep_initialized)
            return;

        for (i = 0; i < CLASSHASH_SIZE; i++)
            INIT_LIST_HEAD(classhash_table + i);

        for (i = 0; i < CHAINHASH_SIZE; i++)
            INIT_LIST_HEAD(chainhash_table + i);

        lockdep_initialized = 1;
    }
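The comment is clear: some architectures have their own start_kernel() code that also calls lockdep_init(), so the lockdep_initialized flag ensures that the hash tables are initialized only once, no matter how many times the function is called.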
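The "initialize only once" pattern can be tried out in userspace. The following is a minimal sketch, not kernel code: struct list_node, DEMO_HASH_SIZE, demo_init and the other demo_ names are hypothetical stand-ins for struct list_head, CLASSHASH_SIZE and lockdep_init.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct list_head. */
struct list_node {
    struct list_node *next, *prev;
};

#define DEMO_HASH_SIZE 8    /* assumed size, chosen only for the demo */

static struct list_node demo_table[DEMO_HASH_SIZE];
static int demo_initialized;    /* plays the role of lockdep_initialized */
static int demo_init_runs;      /* counts how often the real work ran */

/* An empty circular list is a head that points back at itself. */
static void list_init(struct list_node *head)
{
    head->next = head;
    head->prev = head;
}

/* Safe to call repeatedly: only the first call touches the buckets. */
static void demo_init(void)
{
    size_t i;

    if (demo_initialized)
        return;

    for (i = 0; i < DEMO_HASH_SIZE; i++)
        list_init(&demo_table[i]);

    demo_init_runs++;
    demo_initialized = 1;
}

/* Report whether bucket i is still an empty list. */
static int bucket_is_empty(size_t i)
{
    assert(i < DEMO_HASH_SIZE);
    return demo_table[i].next == &demo_table[i];
}
```

Calling demo_init() twice leaves demo_init_runs at 1, which is the property the kernel comment asks for when both the architecture code and start_kernel call lockdep_init().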
What are these hash tables for?
    /*
     * We keep a global list of all lock classes. The list only grows,
     * never shrinks. The list is only accessed with the lockdep
     * spinlock lock held.
     */
    LIST_HEAD(all_lock_classes);

    /*
     * The lockdep classes are in a hash-table as well, for fast lookup:
     */
    #define CLASSHASH_BITS      (MAX_LOCKDEP_KEYS_BITS - 1)
    #define CLASSHASH_SIZE      (1UL << CLASSHASH_BITS)
    #define __classhashfn(key)  hash_long((unsigned long)key, CLASSHASH_BITS)
    #define classhashentry(key) (classhash_table + __classhashfn((key)))

    static struct list_head classhash_table[CLASSHASH_SIZE];

    /*
     * We put the lock dependency chains into a hash-table as well, to cache
     * their existence:
     */
    #define CHAINHASH_BITS        (MAX_LOCKDEP_CHAINS_BITS-1)
    #define CHAINHASH_SIZE        (1UL << CHAINHASH_BITS)
    #define __chainhashfn(chain)  hash_long(chain, CHAINHASH_BITS)
    #define chainhashentry(chain) (chainhash_table + __chainhashfn((chain)))

    static struct list_head chainhash_table[CHAINHASH_SIZE];

    /*
     * The hash key of the lock dependency chains is a hash itself too:
     * it's a hash of all locks taken up to that lock, including that lock.
     * It's a 64-bit hash, because it's important for the keys to be
     * unique.
     */
    #define iterate_chain_key(key1, key2) \
        (((key1) << MAX_LOCKDEP_KEYS_BITS) ^ \
        ((key1) >> (64-MAX_LOCKDEP_KEYS_BITS)) ^ \
        (key2))
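The main reason for pasting this code is its comments. all_lock_classes is a global list of every lock class, classhash_table indexes the same lock classes for fast lookup, and chainhash_table caches the lock dependency chains that have already been seen. As I understand it, this sets up the data structures of the lock dependency checker (lockdep) rather than the locks themselves. I won't dig any deeper here.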
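To see what iterate_chain_key does, here is a small userspace sketch of the same rotate-and-XOR step. DEMO_KEYS_BITS is an assumed width standing in for MAX_LOCKDEP_KEYS_BITS (its real value depends on the kernel version), and demo_chain_key is a hypothetical helper that folds a sequence of lock class ids into one key.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_KEYS_BITS 13   /* assumed stand-in for MAX_LOCKDEP_KEYS_BITS */

/* Rotate the running key left by DEMO_KEYS_BITS, then mix in the new id. */
static uint64_t demo_iterate_chain_key(uint64_t key1, uint64_t key2)
{
    return (key1 << DEMO_KEYS_BITS) ^
           (key1 >> (64 - DEMO_KEYS_BITS)) ^
           key2;
}

/* Fold every lock class id taken so far, in order, into a single key. */
static uint64_t demo_chain_key(const uint64_t *class_ids, size_t n)
{
    uint64_t key = 0;
    size_t i;

    assert(class_ids != NULL || n == 0);

    for (i = 0; i < n; i++)
        key = demo_iterate_chain_key(key, class_ids[i]);

    return key;
}
```

With this width, taking class 1 and then class 2 gives the key 8194, while taking them in the opposite order gives 16385, so the key depends on the order in which the locks were acquired as well as on which locks they were.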
Moving on:
    void __init smp_setup_processor_id(void)
    {
        int i;
        u32 mpidr = is_smp() ? read_cpuid_mpidr() & MPIDR_HWID_BITMASK : 0;
        u32 cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);

        cpu_logical_map(0) = cpu;
        for (i = 1; i < nr_cpu_ids; ++i)
            cpu_logical_map(i) = i == cpu ? 0 : i;

        /*
         * clear __my_cpu_offset on boot CPU to avoid hang caused by
         * using percpu variable early, for example, lockdep will
         * access percpu variable inside lock_release
         */
        set_my_cpu_offset(0);

        printk(KERN_INFO "Booting Linux on physical CPU 0x%x\n", mpidr);
    }
It first checks whether this is a multiprocessor platform:
    /*
     * Return true if we are running on a SMP platform
     */
    static inline bool is_smp(void)
    {
    #ifndef CONFIG_SMP
        return false;
    #elif defined(CONFIG_SMP_ON_UP)
        extern unsigned int smp_on_up;
        return !!smp_on_up;
    #else
        return true;
    #endif
    }
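The kernel configuration determines whether this is treated as a multiprocessor platform. If it is, the hardware ID is read from MPIDR, otherwise it is taken to be 0, and affinity level 0 of that value identifies the boot CPU. The boot CPU becomes logical CPU 0 in cpu_logical_map, and the remaining entries are filled in around it. The function then calls set_my_cpu_offset(0); according to the comment, this clears the boot CPU's per-CPU offset so that early users of per-CPU variables, such as lockdep, do not hang. We usually work with a single CPU, in which case mpidr is 0 and the map is simply the identity mapping, so I'll leave it at that for now.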
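The mapping loop is easier to follow with concrete numbers. Below is a userspace sketch of the same logic; DEMO_NR_CPUS, demo_logical_map and demo_setup_logical_map are hypothetical names, and the boot CPU number is passed in directly instead of being read from MPIDR.

```c
#include <assert.h>

#define DEMO_NR_CPUS 4  /* assumed CPU count, chosen only for the demo */

/* demo_logical_map[logical] holds the physical CPU number. */
static unsigned int demo_logical_map[DEMO_NR_CPUS];

/*
 * Make the boot CPU logical CPU 0. Every other logical index maps to
 * itself, except the boot CPU's own index, which takes physical CPU 0.
 */
static void demo_setup_logical_map(unsigned int boot_cpu)
{
    unsigned int i;

    assert(boot_cpu < DEMO_NR_CPUS);

    demo_logical_map[0] = boot_cpu;
    for (i = 1; i < DEMO_NR_CPUS; ++i)
        demo_logical_map[i] = (i == boot_cpu) ? 0 : i;
}
```

If the system boots on physical CPU 2, the map becomes {2, 1, 0, 3}: physical CPUs 0 and 2 swap places and the rest stay put. If it boots on physical CPU 0, the map is the identity {0, 1, 2, 3}.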
    /*
     * Called during early boot to initialize the hash buckets and link
     * the static object pool objects into the poll list. After this call
     * the object tracker is fully operational.
     */
    void __init debug_objects_early_init(void)
    {
        int i;

        for (i = 0; i < ODEBUG_HASH_SIZE; i++)
            raw_spin_lock_init(&obj_hash[i].lock);

        for (i = 0; i < ODEBUG_POOL_SIZE; i++)
            hlist_add_head(&obj_static_pool[i].node, &obj_pool);
    }
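This initializes the hash buckets, that is, the spinlock of each entry in obj_hash, and links every element of the obj_static_pool array onto the obj_pool list.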
    struct debug_bucket {
        struct hlist_head list;
        raw_spinlock_t lock;
    };
Each debug_bucket consists of a list head and a lock.
    /**
     * struct debug_obj - representaion of an tracked object
     * @node:   hlist node to link the object into the tracker list
     * @state:  tracked object state
     * @astate: current active state
     * @object: pointer to the real object
     * @descr:  pointer to an object type specific debug description structure
     */
    struct debug_obj {
        struct hlist_node node;
        enum debug_obj_state state;
        unsigned int astate;
        void *object;
        struct debug_obj_descr *descr;
    };
The elements of obj_static_pool are of type struct debug_obj. The comments document every member clearly, so I won't repeat them.
    /**
     * struct debug_obj_descr - object type specific debug description structure
     *
     * @name:              name of the object typee
     * @debug_hint:        function returning address, which have associated
     *                     kernel symbol, to allow identify the object
     * @fixup_init:        fixup function, which is called when the init check
     *                     fails
     * @fixup_activate:    fixup function, which is called when the activate check
     *                     fails
     * @fixup_destroy:     fixup function, which is called when the destroy check
     *                     fails
     * @fixup_free:        fixup function, which is called when the free check
     *                     fails
     * @fixup_assert_init: fixup function, which is called when the assert_init
     *                     check fails
     */
    struct debug_obj_descr {
        const char *name;
        void *(*debug_hint) (void *addr);
        int (*fixup_init) (void *addr, enum debug_obj_state state);
        int (*fixup_activate) (void *addr, enum debug_obj_state state);
        int (*fixup_destroy) (void *addr, enum debug_obj_state state);
        int (*fixup_free) (void *addr, enum debug_obj_state state);
        int (*fixup_assert_init)(void *addr, enum debug_obj_state state);
    };
The debug_obj_descr structure that debug_obj points to is likewise clearly defined and commented.
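The pool setup in debug_objects_early_init, where a statically allocated array is threaded onto a list so that objects can be handed out before any allocator is available, can be sketched in userspace as follows. This is only an illustration under simplifying assumptions: struct demo_obj uses a plain singly linked list instead of an hlist, there is no locking, and all demo_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_POOL_SIZE 4    /* assumed size, standing in for ODEBUG_POOL_SIZE */

/* Simplified stand-in for struct debug_obj. */
struct demo_obj {
    struct demo_obj *next;  /* link into the free list */
    void *object;           /* the real object being tracked */
};

static struct demo_obj demo_static_pool[DEMO_POOL_SIZE];
static struct demo_obj *demo_pool_head;     /* head of the free list */
static size_t demo_pool_free;               /* number of free entries */

/* Push every array element onto the head of the free list. */
static void demo_pool_init(void)
{
    size_t i;

    demo_pool_head = NULL;
    demo_pool_free = 0;

    for (i = 0; i < DEMO_POOL_SIZE; i++) {
        demo_static_pool[i].object = NULL;
        demo_static_pool[i].next = demo_pool_head;
        demo_pool_head = &demo_static_pool[i];
        demo_pool_free++;
    }
}

/* Take one entry off the free list, or return NULL if it is empty. */
static struct demo_obj *demo_pool_get(void *object)
{
    struct demo_obj *obj = demo_pool_head;

    assert(demo_pool_free <= DEMO_POOL_SIZE);

    if (!obj)
        return NULL;

    demo_pool_head = obj->next;
    obj->next = NULL;
    obj->object = object;
    demo_pool_free--;
    return obj;
}
```

Because each element is added at the head, the last array element is the first one handed out. Once all DEMO_POOL_SIZE entries are taken, demo_pool_get returns NULL.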
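So the first three functions in start_kernel initialize the lock dependency hash tables, the boot CPU's logical mapping, and the debug object tracker, preparing for later lock-protected access to resources and for debugging.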
Continuing with start_kernel:
        /*
         * Set up the the initial canary ASAP:
         */
        boot_init_stack_canary();

        cgroup_init_early();
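A stack canary is a guard value placed on the stack to defend against stack-overflow attacks: if a buffer overflow overwrites it, the corruption can be detected.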
    /*
     * Initialize the stackprotector canary value.
     *
     * NOTE: this must only be called from functions that never return,
     * and it must always be inlined.
     */
    static __always_inline void boot_init_stack_canary(void)
    {
        unsigned long canary;

        /* Try to get a semi random initial value. */
        get_random_bytes(&canary, sizeof(canary));
        canary ^= LINUX_VERSION_CODE;

        current->stack_canary = canary;
        __stack_chk_guard = current->stack_canary;
    }
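I did not dig into the details, but the code and its comment show that this function initializes the canary value: it takes some random bytes, XORs them with LINUX_VERSION_CODE, and stores the result in both current->stack_canary and __stack_chk_guard.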
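How a canary detects an overflow can be shown with a hand-rolled userspace model. This is a sketch of the idea only, not of how the compiler or the kernel implements stack protection: struct demo_frame imitates a stack frame with a buffer followed by the guard value, DEMO_VERSION_CODE is an assumed constant standing in for LINUX_VERSION_CODE, and all demo_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_VERSION_CODE 0x030a00UL    /* assumed stand-in value */

static unsigned long demo_guard;    /* plays the role of __stack_chk_guard */

/* Model of a frame: a local buffer with the canary placed right after it. */
struct demo_frame {
    char buf[8];
    unsigned long canary;
};

/* Derive the guard from a seed, mirroring the XOR in the kernel code. */
static void demo_init_canary(unsigned long random_seed)
{
    demo_guard = random_seed ^ DEMO_VERSION_CODE;
}

/* "Function entry": clear the buffer and plant the canary. */
static void demo_frame_enter(struct demo_frame *f)
{
    memset(f->buf, 0, sizeof(f->buf));
    f->canary = demo_guard;
}

/* A copy that ignores the size of buf, so it may spill into the canary. */
static void demo_unchecked_copy(struct demo_frame *f, const void *src, size_t len)
{
    unsigned char *base = (unsigned char *)f;

    /* Stay inside the struct so the demo itself remains well defined. */
    assert(len <= sizeof(*f));
    memcpy(base + offsetof(struct demo_frame, buf), src, len);
}

/* "Function exit": the frame is intact only if the canary is unchanged. */
static int demo_frame_intact(const struct demo_frame *f)
{
    return f->canary == demo_guard;
}
```

Copying 8 bytes leaves the canary untouched, so demo_frame_intact reports 1. Copying sizeof(struct demo_frame) bytes overwrites it, and the check reports 0. An attacker who cannot predict the guard value cannot restore it, which is why the kernel seeds it from random bytes.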
    /**
     * cgroup_init_early - cgroup initialization at system boot
     *
     * Initialize cgroups at system boot, and initialize any
     * subsystems that request early init.
     */
    int __init cgroup_init_early(void)
    {
        struct cgroup_subsys *ss;
        int i;

        atomic_set(&init_css_set.refcount, 1);
        INIT_LIST_HEAD(&init_css_set.cgrp_links);
        INIT_LIST_HEAD(&init_css_set.tasks);
        INIT_HLIST_NODE(&init_css_set.hlist);
        css_set_count = 1;
        init_cgroup_root(&cgroup_dummy_root);
        cgroup_root_count = 1;
        RCU_INIT_POINTER(init_task.cgroups, &init_css_set);

        init_cgrp_cset_link.cset = &init_css_set;
        init_cgrp_cset_link.cgrp = cgroup_dummy_top;
        list_add(&init_cgrp_cset_link.cset_link, &cgroup_dummy_top->cset_links);
        list_add(&init_cgrp_cset_link.cgrp_link, &init_css_set.cgrp_links);

        /* at bootup time, we don't worry about modular subsystems */
        for_each_builtin_subsys(ss, i) {
            BUG_ON(!ss->name);
            BUG_ON(strlen(ss->name) > MAX_CGROUP_TYPE_NAMELEN);
            BUG_ON(!ss->css_alloc);
            BUG_ON(!ss->css_free);
            if (ss->subsys_id != i) {
                printk(KERN_ERR "cgroup: Subsys %s id == %d\n",
                       ss->name, ss->subsys_id);
                BUG();
            }

            if (ss->early_init)
                cgroup_init_subsys(ss);
        }
        return 0;
    }
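This function initializes cgroups; cgroup stands for control group. I did not study the code in depth, but the kernel documentation explains what cgroups are for: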
“Control Groups provide a mechanism for aggregating/partitioning sets of tasks, and all their future children, into hierarchical groups with specialized behaviour.”
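In other words, cgroups provide a mechanism for partitioning processes, together with all of their future child processes, into hierarchical groups.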
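The two ideas in that sentence, groups arranged in a hierarchy and children landing in their parent's group, can be modelled in a few lines of userspace C. This is a toy model only and has nothing to do with the kernel's actual cgroup data structures; struct demo_group, struct demo_task, demo_fork and demo_group_is_within are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* A group knows its parent; the root group has parent == NULL. */
struct demo_group {
    const char *name;
    struct demo_group *parent;
};

/* A task belongs to exactly one group in this toy model. */
struct demo_task {
    int pid;
    struct demo_group *group;
};

/* A newly created child starts out in the same group as its parent. */
static struct demo_task demo_fork(const struct demo_task *parent, int child_pid)
{
    struct demo_task child;

    assert(parent != NULL);

    child.pid = child_pid;
    child.group = parent->group;
    return child;
}

/* Return 1 if g is ancestor itself or lies somewhere below it. */
static int demo_group_is_within(const struct demo_group *g,
                                const struct demo_group *ancestor)
{
    for (; g != NULL; g = g->parent)
        if (g == ancestor)
            return 1;
    return 0;
}
```

A task placed in a group called batch under the root group passes that membership on to every task forked from it, and any behaviour attached to the root group can be found from batch by walking the parent pointers.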
start_kernel() Analysis (Part 1)