lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20220531105137.110050-5-krzysztof.kozlowski@linaro.org>
Date:   Tue, 31 May 2022 12:51:37 +0200
From:   Krzysztof Kozlowski <krzysztof.kozlowski@...aro.org>
To:     Krzysztof Kozlowski <krzysztof.kozlowski@...aro.org>,
        Andy Gross <agross@...nel.org>,
        Bjorn Andersson <bjorn.andersson@...aro.org>,
        Georgi Djakov <djakov@...nel.org>,
        Rob Herring <robh+dt@...nel.org>,
        Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>, linux-arm-msm@...r.kernel.org,
        linux-pm@...r.kernel.org, devicetree@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org
Cc:     Thara Gopinath <thara.gopinath@...aro.org>
Subject: [PATCH v3 4/4] arm64: dts: qcom: sdm845: Add CPU BWMON

Add device node for CPU-memory BWMON device (bandwidth monitoring) on
SDM845 measuring bandwidth between CPU (gladiator_noc) and Last Level
Cache (memnoc).  Usage of this BWMON allows to remove fixed bandwidth
votes from cpufreq (CPU nodes) thus achieve high memory throughput even
with lower CPU frequencies.

Performance impact (SDM845-MTP RB3 board, linux next-20220422):
1. No noticeable impact when running with schedutil or performance
   governors.

2. When comparing to customized kernel with synced interconnects and
   without bandwidth votes from CPU freq, the sysbench memory tests
   show significant improvement with bwmon for blocksizes past the L3
   cache.  The results for such superficial comparison:

sysbench memory test, results in MB/s (higher is better)
 bs kB |  type |    V  | V+no bw votes | bwmon | benefit %
     1 | W/seq | 14795 |          4816 |  4985 |      3.5%
    64 | W/seq | 41987 |         10334 | 10433 |      1.0%
  4096 | W/seq | 29768 |          8728 | 32007 |    266.7%
 65536 | W/seq | 17711 |          4846 | 18399 |    279.6%
262144 | W/seq | 16112 |          4538 | 17429 |    284.1%
    64 | R/seq | 61202 |         67092 | 66804 |     -0.4%
  4096 | R/seq | 23871 |          5458 | 24307 |    345.4%
 65536 | R/seq | 18554 |          4240 | 18685 |    340.7%
262144 | R/seq | 17524 |          4207 | 17774 |    322.4%
    64 | W/rnd |  2663 |          1098 |  1119 |      1.9%
 65536 | W/rnd |   600 |           316 |   610 |     92.7%
    64 | R/rnd |  4915 |          4784 |  4594 |     -4.0%
 65536 | R/rnd |   664 |           281 |   678 |    140.7%

Legend:
bs kB: block size in KB (small block size means only L1-3 caches are
      used
type: R - read, W - write, seq - sequential, rnd - random
V: vanilla (next-20220422)
V + no bw votes: vanilla without bandwidth votes from CPU freq
bwmon: bwmon without bandwidth votes from CPU freq
benefit %: difference between vanilla without bandwidth votes and bwmon
           (higher is better)

Co-developed-by: Thara Gopinath <thara.gopinath@...aro.org>
Signed-off-by: Thara Gopinath <thara.gopinath@...aro.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@...aro.org>
---
 arch/arm64/boot/dts/qcom/sdm845.dtsi | 54 ++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index 83e8b63f0910..adffb9c70566 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -2026,6 +2026,60 @@ llcc: system-cache-controller@...0000 {
 			interrupts = <GIC_SPI 582 IRQ_TYPE_LEVEL_HIGH>;
 		};
 
+		pmu@...6400 {
+			compatible = "qcom,sdm845-cpu-bwmon";
+			reg = <0 0x01436400 0 0x600>;
+
+			interrupts = <GIC_SPI 581 IRQ_TYPE_LEVEL_HIGH>;
+
+			interconnects = <&gladiator_noc MASTER_APPSS_PROC 3 &mem_noc SLAVE_EBI1 3>,
+					<&osm_l3 MASTER_OSM_L3_APPS &osm_l3 SLAVE_OSM_L3>;
+			interconnect-names = "ddr", "l3c";
+
+			operating-points-v2 = <&cpu_bwmon_opp_table>;
+
+			cpu_bwmon_opp_table: opp-table {
+				compatible = "operating-points-v2";
+
+				/*
+				 * The interconnect paths bandwidths taken from
+				 * cpu4_opp_table bandwidth.
+				 * They also match different tables from
+				 * msm-4.9 downstream kernel:
+				 *  - the gladiator_noc-mem_noc from bandwidth
+				 *    table of qcom,llccbw (property qcom,bw-tbl);
+				 *    bus width: 4 bytes;
+				 *  - the OSM L3 from bandwidth table of
+				 *    qcom,cpu4-l3lat-mon (qcom,core-dev-table);
+				 *    bus width: 16 bytes;
+				 */
+				opp-0 {
+					opp-peak-kBps = <800000 4800000>;
+				};
+				opp-1 {
+					opp-peak-kBps = <1804000 9216000>;
+				};
+				opp-2 {
+					opp-peak-kBps = <2188000 11980800>;
+				};
+				opp-3 {
+					opp-peak-kBps = <3072000 15052800>;
+				};
+				opp-4 {
+					opp-peak-kBps = <4068000 19353600>;
+				};
+				opp-5 {
+					opp-peak-kBps = <5412000 20889600>;
+				};
+				opp-6 {
+					opp-peak-kBps = <6220000 22425600>;
+				};
+				opp-7 {
+					opp-peak-kBps = <7216000 25497600>;
+				};
+			};
+		};
+
 		pcie0: pci@...0000 {
 			compatible = "qcom,pcie-sdm845";
 			reg = <0 0x01c00000 0 0x2000>,
-- 
2.34.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ