Message-Id: <20230106062946.19983-1-guohui@uniontech.com>
Date: Fri, 6 Jan 2023 14:29:46 +0800
From: Guo Hui <guohui@...ontech.com>
To: sboyd@...nel.org
Cc: tglx@...utronix.de, jstultz@...gle.com, wangxiaohua@...ontech.com,
linux-kernel@...r.kernel.org, Guo Hui <guohui@...ontech.com>
Subject: [PATCH] timekeeping: add padding in timekeeper for Unixbench pipe
When the LLC cache line size is 128 bytes, as on Kunpeng 920,
the seq field and the xtime_sec field of the structure tk_core
fall entirely within the same LLC cache line.
xtime_sec is the data protected by the seq lock
in the function ktime_get_coarse_real_ts64(),
so having seq and xtime_sec in the same LLC cache line
causes a false sharing problem.
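
To illustrate why the reader touches both seq and xtime_sec on every call, here is a minimal user-space sketch of a sequence-counter read loop in the style of ktime_get_coarse_real_ts64(). It is not the kernel implementation: coarse_clock, clock_write() and clock_read() are hypothetical names, and C11 sequentially consistent atomics stand in for the kernel's seqcount primitives and barriers.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for tk_core: a sequence counter next to its data. */
struct coarse_clock {
	atomic_uint seq;             /* even: stable, odd: update in progress */
	_Atomic int64_t xtime_sec;   /* data guarded by seq */
};

/* Writer: bump seq to odd, update the data, bump seq back to even. */
static inline void clock_write(struct coarse_clock *c, int64_t sec)
{
	atomic_fetch_add(&c->seq, 1);
	atomic_store(&c->xtime_sec, sec);
	atomic_fetch_add(&c->seq, 1);
}

/* Reader: retry while a write is in progress or seq changed underneath us. */
static inline int64_t clock_read(struct coarse_clock *c)
{
	unsigned int start;
	int64_t sec;

	do {
		start = atomic_load(&c->seq);
		sec = atomic_load(&c->xtime_sec);
	} while ((start & 1u) || atomic_load(&c->seq) != start);

	return sec;
}
```

Each read loads seq, then xtime_sec, then seq again, so the placement of these two fields relative to cache line boundaries directly affects the read path.
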
Add padding before xtime_sec in the structure timekeeper.
This follows the comment on the structure tk_read_base: "This
struct has size 56 byte on 64 bit. Together with a seqcount
it occupies a single 64byte cache line." Accordingly,
seq and the structure tk_read_base
should share a single 64-byte cache line, and with the padding
xtime_sec no longer shares a 128-byte LLC cache line with seq.
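
The layout argument can be checked with a small user-space sketch. The structs below are mocks, not the kernel definitions: tk_read_base_mock only reproduces the 56-byte size quoted in the comment, seq is modelled as 8 bytes (a seqcount plus alignment padding, which is an assumption about the build configuration), and the struct is assumed to start on a cache line boundary. cache_line_index() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock of struct tk_read_base: only its 56-byte size matters here. */
struct tk_read_base_mock {
	uint64_t words[7];
};
static_assert(sizeof(struct tk_read_base_mock) == 56, "mock must be 56 bytes");

/* Mock of tk_core before the patch (assumed layout). */
struct tk_core_before {
	uint64_t seq;                        /* seqcount plus padding (assumed) */
	struct tk_read_base_mock tkr_mono;
	struct tk_read_base_mock tkr_raw;
	uint64_t xtime_sec;
};

/* Mock of tk_core after the patch: one u64 of padding before xtime_sec. */
struct tk_core_after {
	uint64_t seq;
	struct tk_read_base_mock tkr_mono;
	struct tk_read_base_mock tkr_raw;
	uint64_t padding;
	uint64_t xtime_sec;
};

/* Index of the cache line holding a byte offset in a line-aligned struct. */
static inline size_t cache_line_index(size_t offset, size_t line_size)
{
	return offset / line_size;
}
```

Under these assumptions, offsetof(struct tk_core_before, xtime_sec) is 120, which is line 0 for a 128-byte line (shared with seq) but line 1 for a 64-byte line. In tk_core_after the offset becomes 128, the start of the next 128-byte line.
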
The performance data of Unixbench pipe on Kunpeng 920 is as follows:
With LSE instructions enabled:
seq and xtime are in the same LLC cache line:
System Benchmarks Partial Index               BASELINE       RESULT     INDEX
Pipe Throughput                                12440.0   14800574.4   11897.6
Pipe-based Context Switching                    4000.0    4357419.0   10893.5
                                                                     ========
System Benchmarks Index Score (Partial Only)                          11384.5
seq and xtime are not in the same LLC cache line:
System Benchmarks Partial Index               BASELINE       RESULT     INDEX
Pipe Throughput                                12440.0   16546306.6   13300.9
Pipe-based Context Switching                    4000.0    5654281.8   14135.7
                                                                     ========
System Benchmarks Index Score (Partial Only)                          13711.9
With LSE instructions enabled,
Pipe Throughput improves by 11.79%,
and Pipe-based Context Switching improves by 29.76%.
With LSE instructions disabled:
seq and xtime are in the same LLC cache line:
System Benchmarks Partial Index               BASELINE       RESULT     INDEX
Pipe Throughput                                12440.0   36375286.5   29240.6
Pipe-based Context Switching                    4000.0   11994739.7   29986.8
                                                                     ========
System Benchmarks Index Score (Partial Only)                          29611.4
seq and xtime are not in the same LLC cache line:
System Benchmarks Partial Index               BASELINE       RESULT     INDEX
Pipe Throughput                                12440.0   44887148.8   36082.9
Pipe-based Context Switching                    4000.0   13666392.0   34166.0
                                                                     ========
System Benchmarks Index Score (Partial Only)                          35111.4
With LSE instructions disabled,
Pipe Throughput improves by 23.40%,
and Pipe-based Context Switching improves by 13.94%.
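
The percentages quoted above can be reproduced from the INDEX columns with the short sketch below. percent_gain() and print_gain() are hypothetical helper names used only for this illustration.

```c
#include <stdio.h>

/* Relative improvement of `after` over `before`, in percent. */
static inline double percent_gain(double before, double after)
{
	return (after - before) / before * 100.0;
}

/* Print one labelled result line. */
static inline void print_gain(const char *label, double before, double after)
{
	printf("%-30s %6.2f%%\n", label, percent_gain(before, after));
}
```

For example, print_gain("Pipe Throughput (LSE on)", 11897.6, 13300.9) compares the two Pipe Throughput index values measured with LSE enabled.
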
Signed-off-by: Guo Hui <guohui@...ontech.com>
---
include/linux/timekeeper_internal.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
index 84ff2844d..d363cd1f3 100644
--- a/include/linux/timekeeper_internal.h
+++ b/include/linux/timekeeper_internal.h
@@ -92,6 +92,7 @@ struct tk_read_base {
 struct timekeeper {
 	struct tk_read_base	tkr_mono;
 	struct tk_read_base	tkr_raw;
+	u64			padding;
 	u64			xtime_sec;
 	unsigned long		ktime_sec;
 	struct timespec64	wall_to_monotonic;
--
2.20.1