lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 11 Jan 2021 17:31:26 -0800
From:   Danny Lin <danny@...ag0n.dev>
To:     unlisted-recipients:; (no To-header on input)
Cc:     Danny Lin <danny@...ag0n.dev>, Andy Gross <agross@...nel.org>,
        Bjorn Andersson <bjorn.andersson@...aro.org>,
        Rob Herring <robh+dt@...nel.org>,
        linux-arm-msm@...r.kernel.org, devicetree@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: [PATCH] arm64: dts: qcom: sdm660: Fix CPU capacities

sdm660 has a big.LITTLE 4+4 CPU setup with CPUs 0-3 being little cores
and CPUs 4-7 being big cores. The big cores have higher IPC, so they
should have the higher capacity-dmips-mhz, not the other way around as
the device tree currently describes it. Fix the incorrect CPU map to
improve EAS scheduling behavior.

While we're at it, let's replace the old DMIPS/MHz values with new
measurements that reflect the exact IPC of the CPUs as reported by
CoreMark.

Performance measurements were made using my freqbench [1]
benchmark coordinator, which isolates, offlines, and disables the timer
tick on test CPUs to maximize accuracy. It uses EEMBC CoreMark [2] as
the workload and measures power usage using the PM660 PMIC's fuel
gauge.

Normalized DMIPS/MHz capacity scale values for each CPU were calculated
from CoreMarks/MHz (CoreMark iterations per second per MHz), which
serves the same purpose. For each CPU, the final capacity-dmips-mhz
value is the C/MHz value of its maximum frequency normalized to
SCHED_CAPACITY_SCALE (1024) for the fastest CPU in the system.

A Xiaomi Redmi Note 7 device running a downstream Qualcomm 4.4 kernel
was used for benchmarking to ensure proper frequency scaling and other
low-level controls.

Raw benchmark results can be found in the freqbench repository [3].
Below is a human-readable summary:

Frequency domains: cpu1 cpu4
Offline CPUs: cpu1 cpu2 cpu3 cpu4 cpu5 cpu6 cpu7
Baseline power usage: 1130 mW

===== CPU 1 =====
Frequencies: 633 902 1113 1401 1536 1747 1843

 633:  2058     3.2 C/MHz     48 mW    5.9 J   42.6 I/mJ   121.5 s
 902:  2930     3.2 C/MHz     72 mW    6.2 J   40.6 I/mJ    85.3 s
1113:  3616     3.2 C/MHz     79 mW    5.4 J   46.0 I/mJ    69.1 s
1401:  4551     3.2 C/MHz    125 mW    6.9 J   36.3 I/mJ    54.9 s
1536:  4988     3.2 C/MHz    134 mW    6.7 J   37.1 I/mJ    50.1 s
1747:  5674     3.2 C/MHz    179 mW    7.9 J   31.7 I/mJ    44.1 s
1843:  5986     3.2 C/MHz    228 mW    9.5 J   26.3 I/mJ    41.8 s

===== CPU 4 =====
Frequencies: 1113 1401 1747 1958 2150 2208

1113:  5825     5.2 C/MHz    220 mW    9.4 J   26.5 I/mJ    42.9 s
1401:  7324     5.2 C/MHz    317 mW   10.8 J   23.1 I/mJ    34.1 s
1747:  9135     5.2 C/MHz    474 mW   13.0 J   19.2 I/mJ    27.4 s
1958: 10247     5.2 C/MHz    578 mW   14.1 J   17.7 I/mJ    24.4 s
2150: 11246     5.2 C/MHz    694 mW   15.4 J   16.2 I/mJ    22.2 s
2208: 11551     5.2 C/MHz    736 mW   15.9 J   15.7 I/mJ    21.7 s

[1] https://github.com/kdrag0n/freqbench
[2] https://www.eembc.org/coremark/
[3] https://github.com/kdrag0n/freqbench/tree/master/results/sdm660/main

Signed-off-by: Danny Lin <danny@...ag0n.dev>
---
 arch/arm64/boot/dts/qcom/sdm660.dtsi | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/sdm660.dtsi b/arch/arm64/boot/dts/qcom/sdm660.dtsi
index 4abbdd03d1e7..ca985c5429db 100644
--- a/arch/arm64/boot/dts/qcom/sdm660.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm660.dtsi
@@ -40,7 +40,7 @@ CPU0: cpu@100 {
 			compatible = "qcom,kryo260";
 			reg = <0x0 0x100>;
 			enable-method = "psci";
-			capacity-dmips-mhz = <1024>;
+			capacity-dmips-mhz = <636>;
 			next-level-cache = <&L2_1>;
 			L2_1: l2-cache {
 				compatible = "cache";
@@ -59,7 +59,7 @@ CPU1: cpu@101 {
 			compatible = "qcom,kryo260";
 			reg = <0x0 0x101>;
 			enable-method = "psci";
-			capacity-dmips-mhz = <1024>;
+			capacity-dmips-mhz = <636>;
 			next-level-cache = <&L2_1>;
 			L1_I_101: l1-icache {
 				compatible = "cache";
@@ -74,7 +74,7 @@ CPU2: cpu@102 {
 			compatible = "qcom,kryo260";
 			reg = <0x0 0x102>;
 			enable-method = "psci";
-			capacity-dmips-mhz = <1024>;
+			capacity-dmips-mhz = <636>;
 			next-level-cache = <&L2_1>;
 			L1_I_102: l1-icache {
 				compatible = "cache";
@@ -89,7 +89,7 @@ CPU3: cpu@103 {
 			compatible = "qcom,kryo260";
 			reg = <0x0 0x103>;
 			enable-method = "psci";
-			capacity-dmips-mhz = <1024>;
+			capacity-dmips-mhz = <636>;
 			next-level-cache = <&L2_1>;
 			L1_I_103: l1-icache {
 				compatible = "cache";
@@ -104,7 +104,7 @@ CPU4: cpu@0 {
 			compatible = "qcom,kryo260";
 			reg = <0x0 0x0>;
 			enable-method = "psci";
-			capacity-dmips-mhz = <640>;
+			capacity-dmips-mhz = <1024>;
 			next-level-cache = <&L2_0>;
 			L2_0: l2-cache {
 				compatible = "cache";
@@ -123,7 +123,7 @@ CPU5: cpu@1 {
 			compatible = "qcom,kryo260";
 			reg = <0x0 0x1>;
 			enable-method = "psci";
-			capacity-dmips-mhz = <640>;
+			capacity-dmips-mhz = <1024>;
 			next-level-cache = <&L2_0>;
 			L1_I_1: l1-icache {
 				compatible = "cache";
@@ -138,7 +138,7 @@ CPU6: cpu@2 {
 			compatible = "qcom,kryo260";
 			reg = <0x0 0x2>;
 			enable-method = "psci";
-			capacity-dmips-mhz = <640>;
+			capacity-dmips-mhz = <1024>;
 			next-level-cache = <&L2_0>;
 			L1_I_2: l1-icache {
 				compatible = "cache";
@@ -153,7 +153,7 @@ CPU7: cpu@3 {
 			compatible = "qcom,kryo260";
 			reg = <0x0 0x3>;
 			enable-method = "psci";
-			capacity-dmips-mhz = <640>;
+			capacity-dmips-mhz = <1024>;
 			next-level-cache = <&L2_0>;
 			L1_I_3: l1-icache {
 				compatible = "cache";
-- 
2.29.2

Powered by blists - more mailing lists