LWN: Linaro proposes a thermal pressure patch set

Technology · Published 2019-04-29 11:04:15


Among the Linux kernel patches currently under review is a set posted on April 16 by Thara Gopinath of Linaro that introduces the concept of thermal pressure. Here is what the cover letter says.

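The thermal governors currently in the Linux kernel handle CPU overheating by capping the CPU's maximum frequency, which also caps the CPU's maximum compute capacity. The task scheduler, however, is not notified, so it does not know that the maximum frequency has been lowered. Thara's patch set aims to fix this: when temperatures run high, tasks should be distributed across the CPUs more sensibly, which in turn gives better performance.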

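Thermal pressure is the reduction in a CPU's compute capacity under thermal capping, measured relative to its maximum capacity. Instantaneous thermal pressure is hard to record and is sometimes recorded incorrectly, because the moment the capacity is capped and the moment the scheduler records it do not coincide. The patch set therefore keeps a weighted average in a per-CPU variable to track how thermal pressure changes over time.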
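A minimal sketch of that definition, not code from the patch set. It assumes capacity scales linearly with the capped frequency and uses the kernel's capacity scale of 1024; the function names and the example frequencies are hypothetical.

```python
SCHED_CAPACITY_SCALE = 1024  # capacity of a full-speed CPU in the kernel's fixed-point units


def capped_capacity(max_capacity: int, capped_freq_khz: int, max_freq_khz: int) -> int:
    """Capacity left once the maximum frequency is capped (assumes linear scaling)."""
    return max_capacity * capped_freq_khz // max_freq_khz


def thermal_pressure(max_capacity: int, capped_freq_khz: int, max_freq_khz: int) -> int:
    """Capacity lost to thermal capping, in the same units as max_capacity."""
    return max_capacity - capped_capacity(max_capacity, capped_freq_khz, max_freq_khz)


# Hypothetical big core: 2.4 GHz maximum, capped to 1.8 GHz by the thermal governor.
lost = thermal_pressure(SCHED_CAPACITY_SCALE, 1_800_000, 2_400_000)
print(lost, lost / SCHED_CAPACITY_SCALE)  # capacity units lost, then the same value as a fraction
```

Dividing the result by the maximum capacity gives the pressure as a fraction, which is the form used in the definition above.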

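The weight reflects how long the CPU has run at the capped maximum frequency. Because thermal pressure is recorded as an average, it has to decay periodically over time, so that older samples have less influence on the current value. The patch set defines a configurable decay period.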
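The idea of a periodically decayed average can be sketched as follows. This is an illustration only and not the algorithm in kernel/sched/thermal.c: it assumes the average moves halfway toward the latest sample once per elapsed decay period, and the class and method names are hypothetical.

```python
class DecayedThermalPressure:
    """Toy per-CPU average in which older samples lose half their weight each period."""

    def __init__(self, decay_period_ms: int):
        self.decay_period_ms = decay_period_ms
        self.average = 0.0
        self.last_update_ms = 0

    def update(self, now_ms: int, instantaneous: float) -> float:
        # Only whole decay periods that have elapsed since the last update count.
        periods = (now_ms - self.last_update_ms) // self.decay_period_ms
        for _ in range(periods):
            # Halve the weight of the history and blend in the latest sample.
            self.average = (self.average + instantaneous) / 2
        self.last_update_ms += periods * self.decay_period_ms
        return self.average


tp = DecayedThermalPressure(decay_period_ms=500)
print(tp.update(500, 256))   # first capped period
print(tp.update(1000, 256))  # still capped, average keeps rising
print(tp.update(1500, 0))    # cap lifted, average starts to decay
```

A shorter decay period makes the average follow capping events more closely, while a longer one smooths out brief spikes; that trade-off is what the configurable period exposes.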

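Testing was done on a hikey960 running a mainline kernel with a Debian system. aobench (a benchmark of real-world floating-point performance), dhrystone, and hackbench were all run under high-temperature conditions. During the tests, CPU cooling was disabled for the little cores of the big.LITTLE system, so the temperature rise came mainly from the big cores and CPU cooling capped only the big cores' frequency; the scheduler would then place more tasks on the little cores. Finally, the patch set was also validated on a db410c running v5.1-rc4.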

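Several ways of capturing thermal pressure were tried during development. The first converts the thermally capped maximum frequency into a capacity and has the scheduler use this instantaneous value to update cpu_capacity. It appears in the results below as "Instantaneous thermal pressure".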

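The second approach relies on PELT, the algorithm already used to compute rt and dl load and utilization, and uses PELT to compute the average thermal pressure as well. It appears in the tables below as "Thermal Pressure Averaging using PELT fmwk".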

The third approach decays and averages thermal pressure over a decay period. The decay period is configurable; it appears in the tables as "Thermal Pressure Averaging non-PELT Algo. Decay : XXXms".

The results show the third approach improving performance by 3-5% over the current default behavior.

Hackbench: (1 group, 30000 loops, 10 runs)

                                                           Result         Standard Deviation
                                                           (Time Secs)    (% of mean)

No Thermal Pressure                                        10.21          7.99%
Instantaneous thermal pressure                             10.16          5.36%
Thermal Pressure Averaging using PELT fmwk                 9.88           3.94%
Thermal Pressure Averaging non-PELT Algo. Decay : 500 ms   9.94           4.59%
Thermal Pressure Averaging non-PELT Algo. Decay : 250 ms   7.52           5.42%
Thermal Pressure Averaging non-PELT Algo. Decay : 125 ms   9.87           3.94%

Aobench: Size 2048 * 2048

                                                           Result         Standard Deviation
                                                           (Time Secs)    (% of mean)

No Thermal Pressure                                        141.58         15.85%
Instantaneous thermal pressure                             141.63         15.03%
Thermal Pressure Averaging using PELT fmwk                 134.48         13.16%
Thermal Pressure Averaging non-PELT Algo. Decay : 500 ms   133.62         13.00%
Thermal Pressure Averaging non-PELT Algo. Decay : 250 ms   137.22         15.30%
Thermal Pressure Averaging non-PELT Algo. Decay : 125 ms   137.55         13.26%

dhrystone was also run 10 times, with each run spawning 20 threads of 500 MLOOPS. The results cover only the total dhrystone run time, not per-processor performance:

Dhrystone Run Time

                                                           Result         Standard Deviation
                                                           (Time Secs)    (% of mean)

No Thermal Pressure                                        1.14           10.04%
Instantaneous thermal pressure                             1.15           9%
Thermal Pressure Averaging using PELT fmwk                 1.19           11.60%
Thermal Pressure Averaging non-PELT Algo. Decay : 500 ms   1.09           7.51%
Thermal Pressure Averaging non-PELT Algo. Decay : 250 ms   1.012          7.02%
Thermal Pressure Averaging non-PELT Algo. Decay : 125 ms   1.12           9.02%
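The improvement figure can be checked against the tables above. The sketch below computes the relative reduction in run time for the 500 ms decay rows, using only numbers copied from those tables; the helper name is hypothetical.

```python
def improvement_pct(baseline_secs: float, result_secs: float) -> float:
    """Relative reduction in run time versus the baseline, in percent."""
    return (baseline_secs - result_secs) / baseline_secs * 100


# (No Thermal Pressure, non-PELT averaging with a 500 ms decay), taken from the tables above.
runs = {
    "hackbench": (10.21, 9.94),
    "aobench": (141.58, 133.62),
    "dhrystone": (1.14, 1.09),
}

for name, (baseline, result) in runs.items():
    print(f"{name}: {improvement_pct(baseline, result):.1f}% less run time")
```

The gain differs by benchmark and by decay period, so the quoted range is best read as a rough summary rather than a per-row guarantee.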

The changes from V1 to V2 and the files touched are listed below.

V1->V2: Removed using Pelt framework for thermal pressure accumulation
and averaging. Instead implemented a weighted average algorithm.

Thara Gopinath (3):
  Calculate Thermal Pressure
  sched/fair: update cpu_capcity to reflect thermal pressure
  thermal/cpu-cooling: Update thermal pressure in case of a maximum
    frequency capping

 drivers/thermal/cpu_cooling.c |   4 +
 include/linux/sched/thermal.h |  11 +++
 kernel/sched/Makefile         |   2 +-
 kernel/sched/fair.c           |   4 +
 kernel/sched/thermal.c        | 220 ++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 240 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/sched/thermal.h
 create mode 100644 kernel/sched/thermal.c
