lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date: Tue, 28 May 2024 18:45:16 +0200
From: Dragan Simic <dsimic@...jaro.org>
To: linux-sunxi@...ts.linux.dev
Cc: wens@...e.org,
	jernej.skrabec@...il.com,
	samuel@...lland.org,
	linux-arm-kernel@...ts.infradead.org,
	devicetree@...r.kernel.org,
	robh+dt@...nel.org,
	krzk+dt@...nel.org,
	conor+dt@...nel.org,
	linux-kernel@...r.kernel.org,
	Andre Przywara <andre.przywara@....com>
Subject: [PATCH v2] arm64: dts: allwinner: Add cache information to the SoC dtsi for H616

Add missing cache information to the Allwinner H616 SoC dtsi, to allow
the userspace, which includes lscpu(1) that uses the virtual files provided
by the kernel under the /sys/devices/system/cpu directory, to display the
proper H616 cache information.

Adding the cache information to the H616 SoC dtsi also makes the following
warning message in the kernel log go away:

  cacheinfo: Unable to detect cache hierarchy for CPU 0

Rather conspicuously, almost no cache-related information is available in
the publicly available Allwinner H616 datasheet (version 1.0) and H616 user
manual (version 1.0).  Thus, the cache parameters for the H616 SoC dtsi were
obtained and derived by hand from the cache size and layout specifications
found in the following technical reference manual, and from the cache size
and die revision hints available from the following community-provided data
and memory subsystem benchmarks:

  - ARM Cortex-A53 revision r0p4 TRM, version J
  - Summary of the two available H616 die revisions and their differences
    in cache sizes observed from the CSSIDR_EL1 register readouts, provided
    by Andre Przywara [1][2]
  - Tinymembench benchmark results of the H616-based OrangePi Zero 2 SBC,
    provided by Thomas Kaiser [3]

For future reference, here's a brief summary of the available documentation
and the community-provided data and memory subsystem benchmarks:

  - All caches employ the 64-byte cache line length
  - Each Cortex-A53 core has 32 KB of L1 2-way, set-associative instruction
    cache and 32 KB of L1 4-way, set-associative data cache
  - The size of the L2 cache depends on the actual H616 die revision (there
    are two die revisions), so the entire SoC can have either 256 KB or 1 MB
    of unified L2 16-way, set-associative cache [1]

Also for future reference, here's the relevant excerpt from the community-
provided H616 memory subsystem benchmark, [3] which confirms that 32 KB and
256 KB are the L1 data and L2 cache sizes, respectively:

    block size : single random read / dual random read
          1024 :    0.0 ns          /     0.0 ns
          2048 :    0.0 ns          /     0.0 ns
          4096 :    0.0 ns          /     0.0 ns
          8192 :    0.0 ns          /     0.0 ns
         16384 :    0.0 ns          /     0.0 ns
         32768 :    0.0 ns          /     0.0 ns
         65536 :    4.3 ns          /     7.3 ns
        131072 :    6.6 ns          /    10.5 ns
        262144 :    9.8 ns          /    15.2 ns
        524288 :   91.8 ns          /   142.9 ns
       1048576 :  138.6 ns          /   188.3 ns
       2097152 :  163.0 ns          /   204.8 ns
       4194304 :  178.8 ns          /   213.5 ns
       8388608 :  187.1 ns          /   217.9 ns
      16777216 :  192.2 ns          /   220.9 ns
      33554432 :  196.5 ns          /   224.0 ns
      67108864 :  215.7 ns          /   259.5 ns

The changes introduced to the H616 SoC dtsi by this patch specify 256 KB as
the L2 cache size.  As outlined by Andre Przywara, [2] a follow-up TF-A patch
will perform runtime adjustment of the device tree data, making the correct
L2 cache size of 1 MB present in the device tree for the boards based on the
revision of H616 that actually provides 1 MB of L2 cache.

[1] https://lore.kernel.org/linux-sunxi/20240430114627.0cfcd14a@donnerap.manchester.arm.com/
[2] https://lore.kernel.org/linux-sunxi/20240501103059.10a8f7de@donnerap.manchester.arm.com/
[3] https://raw.githubusercontent.com/ThomasKaiser/sbc-bench/master/results/4knM.txt

Suggested-by: Andre Przywara <andre.przywara@....com>
Helped-by: Andre Przywara <andre.przywara@....com>
Reviewed-by: Andre Przywara <andre.przywara@....com>
Signed-off-by: Dragan Simic <dsimic@...jaro.org>
---

Notes:
    Link to v1: https://lore.kernel.org/linux-sunxi/9d52e6d338a059618d894abb0764015043330c2b.1714727227.git.dsimic@manjaro.org/T/#u
    
    Changes in v2:
      - Collected one Reviewed-by tag
      - Rebased the patch to 6.10-rc1, as requested by Chen-Yu, [4] with
        no functional changes introduced
    
    [4] https://lore.kernel.org/linux-sunxi/CAGb2v67_4MHEZned0X1sFxisySahemHYo6sjn9sttQY+RO=VQw@mail.gmail.com/

 .../arm64/boot/dts/allwinner/sun50i-h616.dtsi | 37 +++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
index 921d5f61d8d6..6595e0406b6d 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
+++ b/arch/arm64/boot/dts/allwinner/sun50i-h616.dtsi
@@ -27,33 +27,70 @@ cpu0: cpu@0 {
 			enable-method = "psci";
 			clocks = <&ccu CLK_CPUX>;
 			#cooling-cells = <2>;
+			i-cache-size = <0x8000>;
+			i-cache-line-size = <64>;
+			i-cache-sets = <256>;
+			d-cache-size = <0x8000>;
+			d-cache-line-size = <64>;
+			d-cache-sets = <128>;
+			next-level-cache = <&l2_cache>;
 		};
 
 		cpu1: cpu@1 {
 			compatible = "arm,cortex-a53";
 			device_type = "cpu";
 			reg = <1>;
 			enable-method = "psci";
 			clocks = <&ccu CLK_CPUX>;
 			#cooling-cells = <2>;
+			i-cache-size = <0x8000>;
+			i-cache-line-size = <64>;
+			i-cache-sets = <256>;
+			d-cache-size = <0x8000>;
+			d-cache-line-size = <64>;
+			d-cache-sets = <128>;
+			next-level-cache = <&l2_cache>;
 		};
 
 		cpu2: cpu@2 {
 			compatible = "arm,cortex-a53";
 			device_type = "cpu";
 			reg = <2>;
 			enable-method = "psci";
 			clocks = <&ccu CLK_CPUX>;
 			#cooling-cells = <2>;
+			i-cache-size = <0x8000>;
+			i-cache-line-size = <64>;
+			i-cache-sets = <256>;
+			d-cache-size = <0x8000>;
+			d-cache-line-size = <64>;
+			d-cache-sets = <128>;
+			next-level-cache = <&l2_cache>;
 		};
 
 		cpu3: cpu@3 {
 			compatible = "arm,cortex-a53";
 			device_type = "cpu";
 			reg = <3>;
 			enable-method = "psci";
 			clocks = <&ccu CLK_CPUX>;
 			#cooling-cells = <2>;
+			i-cache-size = <0x8000>;
+			i-cache-line-size = <64>;
+			i-cache-sets = <256>;
+			d-cache-size = <0x8000>;
+			d-cache-line-size = <64>;
+			d-cache-sets = <128>;
+			next-level-cache = <&l2_cache>;
+		};
+
+		l2_cache: l2-cache {
+			compatible = "cache";
+			cache-level = <2>;
+			cache-unified;
+			cache-size = <0x40000>;
+			cache-line-size = <64>;
+			cache-sets = <256>;
 		};
 	};
 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ