Getting into Linux Kernel Development 【Share】

Ref:

https://www.cyphar.com/blog/post/getting-into-linux-kernel-development

I don't know about you, but I've always thought of operating system kernel programming as a mysterious and esoteric skill. Its importance cannot be overstated, of course, but I've always felt that kernel programming and regular programming were two very separate skills.

I've recently had some patches merged into the Linux kernel, and thought it might be interesting to describe what I learnt from this brief dip into kernel development and how someone who is new to it might go about getting into this space.

A Bit of Backstory

Over the past year or so, I've been fascinated by reading kernel code. It all went over my head, of course, but I thought it was interesting to see how the syscalls we use every day are implemented. I'd been pining for a good enough excuse to dip my toe into this weird and wonderful world about which I knew nothing.

As luck would have it, I stumbled upon a feature request in libcontainer (now part of runC, and so the bug report has been deleted), and thought to myself, "Well, it's now or never". The bug report stated that "it would be nice if we could limit the number of PIDs in a cgroup", and I thought it would be a fairly easy project to do: "I'm quite a dab hand at C, how hard could it be?" The bug report linked to an old patchset (circa 2011), which would obviously need quite a bit of work to be brought up to date with the current state of the kernel. It turned out that it needed a complete rewrite, because of how much the internal APIs had changed in that time (and because some of the hooks it depended on had been removed for being incredibly racy).

So, with all that in mind, I was ready to start writing a new version of the same patchset.

Advice for Getting Started

The easiest way to get started (from what I've heard) is to find a driver that doesn't work properly, or to look for an easy-looking bug report. I wouldn't recommend getting into the Linux kernel the way I did: I was thrown into the deep end very quickly, with no documentation in sight.

It's also a good idea to find someone on IRC (##kernel on freenode), or just to email maintainers and ask them about the problem you want to solve and how to go about solving it (or even to ask whether they have any bugs that you might be able to fix). Maintainers are people too, so don't spam them if they don't reply within 15 minutes of your first email.

Ever-Changing APIs

One of the things that Linux prides itself on is the fact that it has a completely stable ABI towards user space. This is an interesting contrast to the fact that there is no stable internal API inside the kernel. This, of course, makes perfect sense in the context of the kernel, but it does make reviving old patchsets nigh impossible without a complete rewrite.

That said, doing a complete rewrite of an old patchset is actually quite a good thing. The old patches give you some pointers as to where in the kernel your changes need to be made (without having to do a manual traceback from a syscall or grep the source tree), but they don't spoon-feed you either. You still need to figure out which kernel APIs to use and which locking semantics to obey.

Unfortunately, because there is no stable API, documentation about things like locking semantics and current APIs is basically non-existent. If you're okay with asking people about the best way to do something, then you'll be set (Google won't really help you here). If you aren't good at asking questions, you might be able to take advantage of some of the debugging tools Linux has available. Things like lockdep and PROVE_RCU are very useful for making sure you're following valid locking semantics. But ultimately you'll need to ask someone a question at some point, so you might as well start getting used to emailing around and asking people questions.
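To make "valid locking semantics" a little more concrete: one of the things lockdep checks is that locks are always taken in a consistent order. The Python sketch below is a toy, single-threaded model of that one idea. It is not how lockdep is implemented, and LockOrderChecker and its methods are names made up for this illustration.

```python
class LockOrderChecker:
    """Toy model of lock-order validation (hypothetical, not kernel code)."""

    def __init__(self):
        # after[a] is the set of locks that were taken while `a` was held.
        self.after = {}
        # Locks currently held, in acquisition order.
        self.held = []

    def _reachable(self, start, goal):
        # Depth-first search over the recorded "held before" edges.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.after.get(node, ()))
        return False

    def acquire(self, name):
        for held in self.held:
            # If `held` was ever taken after `name`, taking `name` while
            # holding `held` inverts that order and could deadlock.
            if self._reachable(name, held):
                raise RuntimeError(
                    f"lock order inversion: {name} taken while holding {held}"
                )
        for held in self.held:
            self.after.setdefault(held, set()).add(name)
        self.held.append(name)

    def release(self, name):
        self.held.remove(name)


checker = LockOrderChecker()

# First code path: takes lock "a", then lock "b".
checker.acquire("a")
checker.acquire("b")
checker.release("b")
checker.release("a")

# Second code path: takes them the other way round, which gets flagged.
checker.acquire("b")
try:
    checker.acquire("a")
    inversion_detected = False
except RuntimeError as err:
    inversion_detected = True
    print(err)
```

The point of the model is that the second path is reported even though nothing actually deadlocked in this run, which is what makes this kind of checker useful for races you would otherwise only hit rarely.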

Coding Style

Make sure you follow the Linux kernel coding style. Most maintainers will not even look at the contents of your patch if the coding style is not followed. Sure, you might not agree with some of the points (the 8-column tabs annoy me every once in a while, and the 80-character limit is really annoying), but it's their choice what coding style they use for their projects. Sometimes you've got to live and let live.
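As a rough illustration of the two rules mentioned above, here is a small Python sketch that flags lines wider than 80 columns (counting a tab as 8 columns) and lines indented with spaces instead of tabs. The kernel tree ships its own, far more thorough checker in scripts/checkpatch.pl, so treat this only as a picture of what the rules mean. The function name check_line and the message strings are invented for this example.

```python
TAB_WIDTH = 8      # the kernel style treats a tab as 8 columns
MAX_COLUMNS = 80   # the line-length limit mentioned in the text


def check_line(line):
    """Return a list of style complaints for one line of C source."""
    problems = []
    text = line.rstrip("\n")
    # Measure the width with tabs expanded to 8-column tab stops.
    if len(text.expandtabs(TAB_WIDTH)) > MAX_COLUMNS:
        problems.append("line over 80 columns")
    if text != text.rstrip():
        problems.append("trailing whitespace")
    # Leading whitespace only: eight spaces there should have been a tab.
    indent = text[: len(text) - len(text.lstrip())]
    if " " * TAB_WIDTH in indent:
        problems.append("spaces used for indentation")
    return problems


sample = [
    "static int counter;\n",
    "\tif (counter > limit)\n",
    "        return -EAGAIN;\n",   # indented with eight spaces
]
for number, line in enumerate(sample, start=1):
    for problem in check_line(line):
        print(f"line {number}: {problem}")
```

Running the real checkpatch.pl on your patches before sending them is still the thing to do; a sketch like this only covers the most visible rules.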

Iterative Development

Once you've got the first version of your patch working and have tested it on your machine (and hopefully some more machines), you're ready to send it out to the mailing list. There are scripts in the kernel tree (scripts/get_maintainer.pl) that tell you who you should Cc: your patches to, and I'd recommend using them. For sending patchsets, I prefer to just use git send-email: it's quick and dirty and works pretty well.
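The sending step can be sketched in Python as follows. This is only an illustration under some assumptions: the sample text stands in for the output of the recipient-lookup script and is assumed to consist of one address per line followed by a parenthesised role; all names and addresses are placeholders; and the helper functions are made up for this example. The sketch builds a git send-email command line using its --to and --cc options, but does not run it.

```python
def parse_recipients(maintainer_output):
    """Extract addresses from lines shaped like 'Name <addr> (role)'."""
    recipients = []
    for line in maintainer_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Drop the trailing "(role)" annotation and keep the address part.
        address = line.split(" (", 1)[0].strip()
        if address not in recipients:
            recipients.append(address)
    return recipients


def send_email_command(patch_files, to, cc):
    """Build (but do not run) a git send-email argument list."""
    argv = ["git", "send-email", f"--to={to}"]
    argv += [f"--cc={address}" for address in cc]
    return argv + list(patch_files)


# Placeholder output; real output would come from the kernel's script.
sample_output = """\
Some Maintainer <maintainer@example.org> (maintainer:SOME SUBSYSTEM)
some-list@example.org (open list:SOME SUBSYSTEM)
other-list@example.org (open list)
"""

people = parse_recipients(sample_output)
command = send_email_command(
    ["0000-cover-letter.patch", "0001-example.patch"],
    to=people[0],
    cc=people[1:],
)
print(" ".join(command))
```

Building the command as a list rather than a single string means it could later be passed to subprocess.run without any shell quoting problems, should you want to automate the step.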

A very important thing to do when you start sending patchsets is to explain why your patch is so important, how it benefits users, why your method of solving the problem is the best, why the problem is a real problem that needs to be solved in the kernel, and so on. If you don't make an argument for your patch to be merged, it won't be merged (you're really the only one who is completely behind your patch).

The first thing that'll happen when you send the first version of your patchset to the mailing list is that it will be rejected outright. It's very uncommon for a patchset to be merged immediately. Maybe you could've done something better, maybe you didn't grab some locks in the right order, maybe you missed a race condition, and so on. It's important not to be discouraged when your patch gets rejected. It happens to everyone; just take the feedback you got and move on. Sometimes maintainers will reject a patch for very minor things (bad formatting or other such non-functional issues). It's important to take all of their comments on board, since they probably know more about the code you're changing than you do (unless you really disagree with them, in which case you can escalate the issue to other maintainers or people higher up the food chain).

After the first round of nits has been fixed, you send out the patches again with your changes pointed out in the cover letter. Then you get some more feedback, you update your code or discuss it, and the cycle continues. Depending on how complicated the issue is, it may take up to 10 versions to get one that is good enough to be merged into the kernel. At that point, it's out of your hands. Your patch will be merged into a maintainer's tree, and then that maintainer's tree will be merged into Linus' tree at some point in the future (probably the next -rc1).

Why so Complicated?

Kernel code is widely considered to be very complicated. There is a whole bunch of weird APIs that you call out to, with global variables and macros thrown around everywhere. Trying to figure out which code gets executed for a given syscall is quite complicated. It gets even more complicated when you consider that the Linux kernel is basically one of the largest multi-threaded programs in existence. A lot of attention has to be paid to potential race conditions, and quite a lot of this is undocumented.

For me, the easiest way to understand code is to just read it. Dive in feet first: read a section of code, and then read the code of all of the functions it calls out to. Repeat until you've read all of the code that you can. By doing this for enough sections of the kernel, you start getting an idea of how the kernel functions, what kind of APIs to use where, and so on.

Linux's complexity doesn't necessarily come from the fact that it is a kernel (really, kernel space just has a few different rules from user space, and you can get used to that idea quite quickly), but rather from the fact that it is an extraordinarily large project with thousands of contributors every release.

All's Well That Ends Well

I got an email this morning from Tejun Heo (the maintainer of control groups in the kernel) saying that my patches to add the pids cgroup have been merged into his tree. It took me several long months to get here, but that's mainly because of my lack of experience with kernel development. A lot of this stuff is still new to me, and I'm still learning more about the kernel every day by just reading code and trying to fix bugs.

So, all's well that ends well. Don't be afraid of diving into kernel development feet first. Books won't help you (they're all out of date), but reading the code will. As an old man once told me, "Read the source, Luke!"

Time: 2024-10-27 05:49:57
