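Linux kernel suspend states

The Linux kernel supports several sleep states, saving power by putting different parts of the system into low-power modes. There are currently four: suspend to idle, power-on standby (Standby), suspend to RAM (STR), and suspend to disk (Hibernate), corresponding to ACPI states S0, S1, S3, and S4 respectively.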

Suspend to idle is implemented purely in software and keeps the CPUs in a deep idle state as much as possible.

Power-on standby puts devices into low-power mode and takes all non-boot CPUs offline.

Suspend to RAM goes a step further: it powers off all CPUs and puts RAM into self-refresh mode.

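Suspend to disk saves the most power by shutting down as much of the system as possible, including RAM. The contents of RAM are written to disk and read back into RAM on resume.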

Below, STR stands for Suspend to RAM and STI stands for Suspend to Idle.

For details, see: http://www.linaro.org/blog/suspend-to-idle/
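To see which of the four states a given system actually offers, you can read /sys/power/state. The sketch below maps the tokens the kernel lists there to the names used in this post. The function names are illustrative only, and on kernels that also expose /sys/power/mem_sleep the "mem" token may be mapped to a different state, so treat the table as a rough guide.

```python
# Map the tokens found in /sys/power/state to the sleep states described above.
STATE_NAMES = {
    "freeze": "suspend to idle (STI)",
    "standby": "power-on standby",
    "mem": "suspend to RAM (STR)",
    "disk": "suspend to disk (hibernate)",
}


def describe_states(state_text):
    """Return (token, description) pairs for each token in state_text."""
    return [(tok, STATE_NAMES.get(tok, "unknown")) for tok in state_text.split()]


def read_supported_states(path="/sys/power/state"):
    """Read the kernel's list of supported sleep states; empty list if unreadable."""
    try:
        with open(path) as f:
            return describe_states(f.read())
    except OSError:
        return []
```

Calling read_supported_states() on the target and printing the pairs shows at a glance whether, for example, both freeze and mem are available before you start a test run.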

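Differences between STR and STI

How do you put HiKey into STR/STI and wake it up?

The wakeup source can be a GPIO, or the RTC, which wakes the system after a set delay.

Check whether /sys/class/rtc/rtc0/wakealarm exists; if it does not, enable CONFIG_RTC_DRV_PL031.

The value written to wakealarm (with the + prefix) is the number of seconds after which the system resumes and exits suspend.

Writing mem to /sys/power/state starts the suspend flow.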

adb root && adb remount
adb shell "echo +10 > /sys/class/rtc/rtc0/wakealarm && echo mem > /sys/power/state"
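The same two writes can be scripted when running directly on the target. The following is a minimal sketch, not part of any existing tool: the function name and the sysfs_root parameter are assumptions added here so the logic can be tried against a scratch directory before it is pointed at the real /sys.

```python
import os


def suspend_with_rtc_wake(seconds, mode="mem", sysfs_root="/sys"):
    """Arm the RTC wake alarm, then write the sleep state.

    sysfs_root is a hypothetical parameter for dry runs; on a real
    device it stays at /sys and the call needs root.
    """
    wakealarm = os.path.join(sysfs_root, "class/rtc/rtc0/wakealarm")
    state = os.path.join(sysfs_root, "power/state")

    # Mirrors the manual check: no wakealarm file means no usable RTC driver.
    if not os.path.exists(wakealarm):
        raise FileNotFoundError(
            f"{wakealarm} is missing; is the RTC driver "
            "(CONFIG_RTC_DRV_PL031 on HiKey) enabled?"
        )

    # "+N" means N seconds from now, as in the echo +10 command.
    with open(wakealarm, "w") as f:
        f.write(f"+{int(seconds)}\n")

    # On real sysfs this write does not return until the system has resumed.
    with open(state, "w") as f:
        f.write(mode + "\n")
```

Passing mode="freeze" instead of "mem" would exercise STI rather than STR with the same wakeup path.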

Below is the log from HiKey:

[ 1667.963901] PM: suspend entry 1970-01-01 00:29:07.693811637 UTC
[ 1667.969940] PM: Syncing filesystems ... done.
[ 1667.982169] Freezing user space processes ...
[ 1667.988694] dwc2 f72c0000.usb: dwc2_hsotg_ep_stop_xfr: timeout DOEPCTL.EPDisable
[ 1667.996398] dwc2 f72c0000.usb: GINNakEff triggered
[ 1668.005715] (elapsed 0.019 seconds) done.
[ 1668.009858] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[ 1668.019035] Suspending console(s) (use no_console_suspend to debug)
[ 1668.053839] PM: suspend of devices complete after 27.658 msecs
[ 1668.056277] PM: late suspend of devices complete after 2.415 msecs
[ 1668.057494] PM: noirq suspend of devices complete after 1.207 msecs
[ 1668.057500] Disabling non-boot CPUs ...
[ 1668.058575] CPU7: update max cpu_capacity 1024
[ 1668.074889] CPU1: shutdown
[ 1668.090447] psci: Retrying again to check for CPU kill
[ 1668.090452] psci: CPU1 killed.
[ 1668.110472] CPU0: update max cpu_capacity 1024
[ 1668.122990] CPU2: shutdown
[ 1668.138453] psci: Retrying again to check for CPU kill
[ 1668.138459] psci: CPU2 killed.
[ 1668.158467] CPU7: update max cpu_capacity 1024
[ 1668.170909] CPU3: shutdown
[ 1668.186453] psci: Retrying again to check for CPU kill
[ 1668.186458] psci: CPU3 killed.
[ 1668.202446] CPU7: update max cpu_capacity 1024
[ 1668.214925] CPU4: shutdown
[ 1668.230450] psci: Retrying again to check for CPU kill
[ 1668.230456] psci: CPU4 killed.
[ 1668.246443] CPU7: update max cpu_capacity 1024
[ 1668.254736] CPU5: shutdown
[ 1668.270454] psci: Retrying again to check for CPU kill
[ 1668.270459] psci: CPU5 killed.
[ 1668.286440] CPU7: update max cpu_capacity 1024
[ 1668.298735] CPU6: shutdown
[ 1668.314450] psci: Retrying again to check for CPU kill
[ 1668.314456] psci: CPU6 killed.
[ 1668.346706] CPU7: shutdown
[ 1668.362444] psci: Retrying again to check for CPU kill
[ 1668.362451] psci: CPU7 killed.
<<<<<< wakealarm fires here
[ 1668.375740] Enabling non-boot CPUs ...
[ 1668.395368] Detected VIPT I-cache on CPU1
[ 1668.395428] CPU1: update cpu_capacity 1024
[ 1668.395433] CPU1: Booted secondary processor [410fd033]
[ 1668.395922] cache: parent cpu1 should not be sleeping
[ 1668.396127] CPU1 is up
[ 1668.398426] CPU0: update max cpu_capacity 1024
[ 1668.415481] Detected VIPT I-cache on CPU2
[ 1668.415525] CPU2: update cpu_capacity 1024
[ 1668.415528] CPU2: Booted secondary processor [410fd033]
[ 1668.416037] cache: parent cpu2 should not be sleeping
[ 1668.416231] CPU2 is up
[ 1668.418426] CPU0: update max cpu_capacity 1024
[ 1668.435687] Detected VIPT I-cache on CPU3
[ 1668.435731] CPU3: update cpu_capacity 1024
[ 1668.435734] CPU3: Booted secondary processor [410fd033]
[ 1668.436358] cache: parent cpu3 should not be sleeping
[ 1668.436561] CPU3 is up
[ 1668.438431] CPU0: update max cpu_capacity 1024
[ 1668.456000] Detected VIPT I-cache on CPU4
[ 1668.456072] CPU4: update cpu_capacity 1024
[ 1668.456075] CPU4: Booted secondary processor [410fd033]
[ 1668.456935] cache: parent cpu4 should not be sleeping
[ 1668.457133] CPU4 is up
[ 1668.458436] CPU0: update max cpu_capacity 1024
[ 1668.476115] Detected VIPT I-cache on CPU5
[ 1668.476148] CPU5: update cpu_capacity 1024
[ 1668.476152] CPU5: Booted secondary processor [410fd033]
[ 1668.477156] cache: parent cpu5 should not be sleeping
[ 1668.477352] CPU5 is up
[ 1668.478437] CPU0: update max cpu_capacity 1024
[ 1668.496340] Detected VIPT I-cache on CPU6
[ 1668.496373] CPU6: update cpu_capacity 1024
[ 1668.496377] CPU6: Booted secondary processor [410fd033]
[ 1668.497588] cache: parent cpu6 should not be sleeping
[ 1668.497792] CPU6 is up
[ 1668.498444] CPU0: update max cpu_capacity 1024
[ 1668.516578] Detected VIPT I-cache on CPU7
[ 1668.516611] CPU7: update cpu_capacity 1024
[ 1668.516615] CPU7: Booted secondary processor [410fd033]
[ 1668.518090] cache: parent cpu7 should not be sleeping
[ 1668.518290] CPU7 is up
[ 1668.518397] CPU7: update max cpu_capacity 1024
[ 1668.519227] PM: noirq resume of devices complete after 0.745 msecs
[ 1668.520384] PM: early resume of devices complete after 0.862 msecs
[ 1668.542885] mmc_host mmc0: Bus speed (slot 0) = 24800000Hz (slot req 400000Hz, actual 400000HZ div = 31)
[ 1668.618289] mmc_host mmc0: Bus speed (slot 0) = 51756522Hz (slot req 52000000Hz, actual 51756522HZ div = 0)
[ 1668.714903] mmc_host mmc2: Bus speed (slot 0) = 24800000Hz (slot req 400000Hz, actual 400000HZ div = 31)
[ 1668.764121] mmc_host mmc2: Bus speed (slot 0) = 24800000Hz (slot req 25000000Hz, actual 24800000HZ div = 0)
[ 1668.765001] PM: resume of devices complete after 244.605 msecs
[ 1669.132522] Restarting tasks ...
[ 1669.136471] ueventd: fixup /sys/devices/system/cpu/cpu1/cpufreq/scaling_max_freq 1000 1000 664
[ 1669.137761] done.
[ 1669.148351] ueventd: fixup /sys/devices/system/cpu/cpu1/cpufreq/scaling_min_freq 1000 1000 664
[ 1669.157242] PM: suspend exit 1970-01-01 00:29:17.630614166 UTC
[ 1669.172643] ueventd: fixup /sys/devices/system/cpu/cpu2/cpufreq/scaling_max_freq 1000 1000 664
[ 1669.181547] ueventd: fixup /sys/devices/system/cpu/cpu2/cpufreq/scaling_min_freq 1000 1000 664
[ 1669.205449] ueventd: fixup /sys/devices/system/cpu/cpu3/cpufreq/scaling_max_freq 1000 1000 664
[ 1669.214348] ueventd: fixup /sys/devices/system/cpu/cpu3/cpufreq/scaling_min_freq 1000 1000 664
[ 1669.238311] ueventd: fixup /sys/devices/system/cpu/cpu4/cpufreq/scaling_max_freq 1000 1000 664
[ 1669.247221] ueventd: fixup /sys/devices/system/cpu/cpu4/cpufreq/scaling_min_freq 1000 1000 664
[ 1669.271362] ueventd: fixup /sys/devices/system/cpu/cpu5/cpufreq/scaling_max_freq 1000 1000 664
[ 1669.280255] ueventd: fixup /sys/devices/system/cpu/cpu5/cpufreq/scaling_min_freq 1000 1000 664
[ 1669.304141] ueventd: fixup /sys/devices/system/cpu/cpu6/cpufreq/scaling_max_freq 1000 1000 664
[ 1669.313161] ueventd: fixup /sys/devices/system/cpu/cpu6/cpufreq/scaling_min_freq 1000 1000 664
[ 1669.337039] ueventd: fixup /sys/devices/system/cpu/cpu7/cpufreq/scaling_max_freq 1000 1000 664
[ 1669.345936] ueventd: fixup /sys/devices/system/cpu/cpu7/cpufreq/scaling_min_freq 1000 1000 664
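A quick way to see where the time goes in a log like this is to pull out the "PM: ... of devices complete after ... msecs" lines. The sketch below is only an illustration with a made-up function name; the sample lines are copied from the HiKey log above.

```python
import re

# Matches e.g. "PM: late suspend of devices complete after 2.415 msecs".
PHASE_RE = re.compile(r"PM: (.+?) of devices complete after ([\d.]+) msecs")


def device_phase_times(log_text):
    """Return {phase name: milliseconds} for each device phase line in the log."""
    return {m.group(1): float(m.group(2)) for m in PHASE_RE.finditer(log_text)}


sample = """\
[ 1668.053839] PM: suspend of devices complete after 27.658 msecs
[ 1668.056277] PM: late suspend of devices complete after 2.415 msecs
[ 1668.057494] PM: noirq suspend of devices complete after 1.207 msecs
[ 1668.519227] PM: noirq resume of devices complete after 0.745 msecs
[ 1668.520384] PM: early resume of devices complete after 0.862 msecs
[ 1668.765001] PM: resume of devices complete after 244.605 msecs
"""

times = device_phase_times(sample)
# Print the slowest phase first.
for phase, ms in sorted(times.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{phase:15s} {ms:9.3f} ms")
```

For this log the resume phase, at 244.605 ms, is by far the largest; the mmc_host lines logged just before it completes suggest where much of that time is spent.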

Tools for analyzing suspend/resume latency

analyze_suspend.py v3.0

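This tool lives in the kernel's scripts directory and helps kernel and OS developers optimize suspend/resume time.

Once the required kernel options are enabled, the tool runs a suspend and captures dmesg and ftrace data until resume completes.

The data is shown on a timeline for each device, together with detailed call graphs of the devices or subsystems that take the most suspend/resume time.

Each run creates a timestamped subdirectory containing the html, dmesg, and raw ftrace files.

Here is a quick look at the tool's options: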

Options:
[general]
    -h          Print this help text
    -v          Print the current tool version
    -verbose    Print extra information during execution and analysis
    -status     Test to see if the system is enabled to run this tool
    -modes      List available suspend modes (those the current system supports)
    -m mode     Mode to initiate for suspend ['freeze', 'mem', 'disk'] (default: mem); selects which suspend mode to enter
    -rtcwake t  Use rtcwake to autoresume after <t> seconds (default: disabled); wakes via the RTC, the argument is the delay
[advanced]
    -f          Use ftrace to create device callgraphs (default: disabled); builds the call graphs from ftrace data
    -filter "d1 d2 ..." Filter out all but this list of dev names
    -x2         Run two suspend/resumes back to back (default: disabled)
    -x2delay t Minimum millisecond delay <t> between the two test runs (default: 0 ms)
    -postres t Time after resume completion to wait for post-resume events (default: 0 S)
    -multi n d Execute <n> consecutive tests at <d> seconds intervals. The outputs will
                be created in a new subdirectory with a summary page.
[utilities]
    -fpdt       Print out the contents of the ACPI Firmware Performance Data Table
    -usbtopo    Print out the current USB topology with power info
    -usbauto    Enable autosuspend for all connected USB devices
[android testing]
    -adb binary Use the given adb binary to run the test on an android device.
                The device should already be connected and with root access.
                Commands will be executed on the device using "adb shell"
                Note: the argument is the path to adb; the tool then runs the test on the Android device and pulls the results back. Make sure adb has root access first.
[re-analyze data from previous runs]
    -ftrace ftracefile Create HTML output using ftrace input
    -dmesg dmesgfile    Create HTML output using dmesg (not needed for kernel >= 3.15)
    -summary directory Create a summary of all test in this dir

With the usage understood, you can start testing.

Android

1. Test STR on Android: 10 suspend/resume cycles, 5 seconds apart. -f is not implemented on the Android platform.

./analyze_suspend.py -adb /usr/bin/adb -rtcwake 5 -multi 10 5 -m mem

2. Test STI on Android: 10 suspend/resume cycles, 5 seconds apart.

./analyze_suspend.py -adb /usr/bin/adb -rtcwake 5 -multi 10 5 -m freeze

Known issue: analyze_suspend.py does not support rtcwake on Android. A fixed version is available here:

https://github.com/arnoldlu/common-use/blob/master/tools/analyze_suspend.py

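The HiKey test results show that neither data set is very stable. The average suspend and resume times for mem are both relatively high.

Ubuntu

The tool offers more on Ubuntu.

With callgraph support, it is much easier to see how much suspend/resume time each device or subsystem takes.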

sudo ./analyze_suspend.py -rtcwake 5 -multi 10 5 -f -m mem
sudo ./analyze_suspend.py -rtcwake 5 -multi 10 5 -f -m freeze
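When switching between modes and platforms it is easy to mistype these long command lines. The helper below is a small sketch that assembles the argument list using only the options shown in the help text above; the function name and its defaults are assumptions made for illustration.

```python
def build_command(mode="mem", rtcwake=5, runs=10, interval=5,
                  callgraph=False, adb=None, tool="./analyze_suspend.py"):
    """Assemble an analyze_suspend.py argument list.

    adb is the path to the adb binary for Android runs, or None for a
    local run. callgraph adds -f, which is not implemented on Android.
    """
    cmd = [tool]
    if adb:
        cmd += ["-adb", adb]
    cmd += ["-rtcwake", str(rtcwake), "-multi", str(runs), str(interval)]
    if callgraph:
        if adb:
            raise ValueError("-f is not implemented on Android")
        cmd.append("-f")
    cmd += ["-m", mode]
    return cmd
```

For a local Ubuntu run, the list can be prefixed with "sudo" and handed to subprocess.run; build_command(mode="freeze", callgraph=True) reproduces the second command above.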

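Comparing the two suspend modes shows that freeze takes less time than mem. This matches expectations, although there is no power data to go with it.

The timing data for mem mode, however, is not stable enough.

The rest of this post focuses on how to analyze results with this tool.

Overview of the tool's output page

At the top are Kernel Suspend Time and Kernel Resume Time, which show at a glance whether there is a regression or an improvement.

Below that are some zoom buttons.

Next is the timeline chart. Matched against the color legend, it clearly shows the distribution of suspend prepare, suspend, suspend late, suspend irq, suspend machine, resume machine, resume irq, resume early, resume, and resume complete.

At the bottom are the detailed function call graphs for each module and subsystem, with their start times and durations.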