Message-Id: <20230106062946.19983-1-guohui@uniontech.com>
Date:   Fri,  6 Jan 2023 14:29:46 +0800
From:   Guo Hui <guohui@...ontech.com>
To:     sboyd@...nel.org
Cc:     tglx@...utronix.de, jstultz@...gle.com, wangxiaohua@...ontech.com,
        linux-kernel@...r.kernel.org, Guo Hui <guohui@...ontech.com>
Subject: [PATCH] timekeeping: add padding in timekeeper for Unixbench pipe

When the LLC cache line size is 128 bytes, as on Kunpeng 920,
the seq and xtime_sec members of the structure tk_core fall
entirely within the same LLC cache line. xtime_sec is the data
protected by the seq lock in ktime_get_coarse_real_ts64(),
so having seq and xtime_sec in the same LLC cache line
causes a false sharing problem.
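
For readers unfamiliar with the pattern, the following is a minimal
user-space sketch of a sequence-counter read of the kind described
above. It is not the kernel implementation: the model_* names are
hypothetical, C11 atomics stand in for the kernel primitives, and a
single writer is assumed.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical model: one sequence counter guarding one value. */
struct model_seq_time {
	atomic_uint seq;            /* odd while a write is in progress */
	_Atomic uint64_t xtime_sec; /* data guarded by seq */
};

void model_init(struct model_seq_time *t)
{
	atomic_init(&t->seq, 0);
	atomic_init(&t->xtime_sec, 0);
}

/* Single writer assumed: make seq odd, update the data, make seq even. */
void model_write(struct model_seq_time *t, uint64_t sec)
{
	unsigned int s = atomic_load_explicit(&t->seq, memory_order_relaxed);

	atomic_store_explicit(&t->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&t->xtime_sec, sec, memory_order_relaxed);
	atomic_store_explicit(&t->seq, s + 2, memory_order_release);
}

/* Reader: retry if a write was in progress or completed meanwhile. */
uint64_t model_read(struct model_seq_time *t)
{
	unsigned int s0, s1;
	uint64_t sec;

	do {
		s0 = atomic_load_explicit(&t->seq, memory_order_acquire);
		sec = atomic_load_explicit(&t->xtime_sec, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		s1 = atomic_load_explicit(&t->seq, memory_order_relaxed);
	} while ((s0 & 1) || s0 != s1);

	return sec;
}
```

Each read touches seq twice and the data once, which is why the
cache-line placement of these two members is relevant to read-side
performance.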

This patch adds padding before xtime_sec in the structure timekeeper.
It is based on the comment on the structure tk_read_base: "This
struct has size 56 byte on 64 bit. Together with a seqcount
it occupies a single 64byte cache line." Accordingly,
seq and the structure tk_read_base
should be placed in the same 64-byte cache line.
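
The layout argument above can be checked with a small user-space
model. This is only a sketch under stated assumptions: the model_*
names are hypothetical, the seqcount is modeled as a plain 32-bit
counter (no lockdep fields), the structure is assumed to start on a
cache-line boundary, and a typical 64-bit ABI with 8-byte alignment
for uint64_t is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct tk_read_base; only its 56-byte size matters. */
struct model_read_base {
	uint64_t words[7];
};
static_assert(sizeof(struct model_read_base) == 56,
	      "model of tk_read_base must be 56 bytes");

/* Model of the layout before the patch. */
struct model_core_before {
	uint32_t seq;                    /* stand-in for the seqcount */
	struct model_read_base tkr_mono;
	struct model_read_base tkr_raw;
	uint64_t xtime_sec;
};

/* Model of the layout after the patch: one u64 of padding added. */
struct model_core_after {
	uint32_t seq;
	struct model_read_base tkr_mono;
	struct model_read_base tkr_raw;
	uint64_t padding;
	uint64_t xtime_sec;
};

/* Nonzero if byte offsets a and b fall in the same cache line. */
int same_cache_line(size_t a, size_t b, size_t line_size)
{
	return a / line_size == b / line_size;
}
```

Under these assumptions, xtime_sec sits at offset 120 before the
change and at offset 128 after it, so with 128-byte lines it moves
out of the line that holds seq, while seq and tkr_mono still share
the first 64 bytes.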

The performance data of Unixbench pipe on Kunpeng 920 is as follows:

With LSE instructions enabled:
seq and xtime_sec are in the same LLC cache line:
System Benchmarks Partial Index              BASELINE       RESULT    INDEX
Pipe Throughput                               12440.0   14800574.4  11897.6
Pipe-based Context Switching                   4000.0    4357419.0  10893.5
                                                                   ========
System Benchmarks Index Score (Partial Only)                        11384.5

seq and xtime_sec are not in the same LLC cache line:
System Benchmarks Partial Index              BASELINE       RESULT    INDEX
Pipe Throughput                               12440.0   16546306.6  13300.9
Pipe-based Context Switching                   4000.0    5654281.8  14135.7
                                                                   ========
System Benchmarks Index Score (Partial Only)                        13711.9

With LSE instructions enabled,
Pipe Throughput increases by 11.79%,
and Pipe-based Context Switching increases by 29.76%.

With LSE instructions disabled:
seq and xtime_sec are in the same LLC cache line:
System Benchmarks Partial Index              BASELINE       RESULT    INDEX
Pipe Throughput                               12440.0   36375286.5  29240.6
Pipe-based Context Switching                   4000.0   11994739.7  29986.8
                                                                   ========
System Benchmarks Index Score (Partial Only)                        29611.4

seq and xtime_sec are not in the same LLC cache line:
System Benchmarks Partial Index              BASELINE       RESULT    INDEX
Pipe Throughput                               12440.0   44887148.8  36082.9
Pipe-based Context Switching                   4000.0   13666392.0  34166.0
                                                                   ========
System Benchmarks Index Score (Partial Only)                        35111.4

With LSE instructions disabled,
Pipe Throughput increases by 23.40%,
and Pipe-based Context Switching increases by 13.94%.
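
The arithmetic behind the tables can be reproduced with the sketch
below. The function names are hypothetical; the formulas are inferred
from the figures above, where each INDEX equals RESULT / BASELINE * 10
and each percentage is the relative change in RESULT.

```c
/* Per-test index as it appears in the tables: RESULT / BASELINE * 10. */
double model_index(double baseline, double result)
{
	return result / baseline * 10.0;
}

/* Relative improvement of the RESULT column, in percent. */
double model_percent_gain(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

For example, model_percent_gain(36375286.5, 44887148.8) gives the
Pipe Throughput gain reported for the LSE-disabled case.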

Signed-off-by: Guo Hui <guohui@...ontech.com>
---
 include/linux/timekeeper_internal.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
index 84ff2844d..d363cd1f3 100644
--- a/include/linux/timekeeper_internal.h
+++ b/include/linux/timekeeper_internal.h
@@ -92,6 +92,7 @@ struct tk_read_base {
 struct timekeeper {
 	struct tk_read_base	tkr_mono;
 	struct tk_read_base	tkr_raw;
+	u64			padding;
 	u64			xtime_sec;
 	unsigned long		ktime_sec;
 	struct timespec64	wall_to_monotonic;
-- 
2.20.1
