https://lwn.net/Articles/499190/
https://github.com/andreoli/fulltrace
Prominent features in Linux 3.5
1.1. ext4 metadata checksums
Modern filesystems such as ZFS and Btrfs have proved that ensuring the integrity of the filesystem using checksums is a valuable feature. Ext4 has added the ability to store checksums of various metadata fields. Every time a metadata field is read, the checksum of the read data is compared with the stored checksums, if they are different it means that the medata is corrupted (note that this feature doesn‘t cover data, only the internal metadata structures, and it doesn‘t have "self-healing" capabilities). The amount of code added to implement this feature is: 1659 insertions(+), 162 deletions(-).
Any ext4 filesystem can be upgraded to use checksums using the "tune2fs -O metadata_csum" command, or "mkfs -O metadata_csum" at creation time. Once this feature is enabled in a filesystem, older kernels with no checksum support will only be able to mount it in read-only mode.
As far as performance impact goes, it shouldn‘t be noticeable for common desktop and server workloads. A mail server ffsb simulation show nearly no change. On a test doing only file creation and deletion and extent tree modifications, a performance drop of about 20 percent was measured. However, it‘s a workload very heavily oriented towards metadata, in most real-world workloads metadata is usually a small fraction of total IO, so unless your workload is metadata-oriented, the cost of enabling this feature should be negligible.
Recommended LWN article: "Improving ext4: bigalloc, inline data, and metadata checksums"
Implementation details: Ext4 Metadata checksums
Code: (commit 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24
1.2. Uprobes: userspace probes
Uprobes, the user-space counterpart of kprobes, enables to place performance probes in any memory address of a user application, and collect debugging and performance information non-disruptively, which can be used to find performance problems. These probes can be placed dynamically in a running process, there is no need to restart the program or modify the binaries. The probes are usually managed with a instrumentation application, such as perf probe, systemtap or LTTng.
A sample usage of uprobes with perf could be to profile libc‘s malloc() calls:
- $ perf probe -x /lib64/libc.so.6 malloc -> Added new event: probe_libc:malloc (on 0x7eac0)
A probe has been created. Now, let‘s record the global usage of malloc across all the system during 1 second:
- $ perf record -e probe_libc:malloc -agR sleep 1
Now you can watch the results with the TUI interface doing "$ perf report", or watch a plain text output without the call graph info in the stdio output with "$ perf report -g flat --stdio"
If you don‘t know which function you want to probe, you can get a list of probe-able funcions in libraries and executables using the -F parameter, for example: "$ perf probe -F -x /lib64/libc.so.6" or "$ perf probe -F -x /bin/zsh". You can use multiple probes as well and mix them with kprobes and regular PMU events or kernel tracepoints.
The uprobes code is one of the longest standing out-of-the-tree patches. It originates from SystemTap and has been included for years in Fedora and RHEL kernels.
Recommended LWN article: Uprobes in 3.5
Code: (commit 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)