Message-Id: <20230628105624.150352-1-lipeng.zhu@intel.com>
Date: Wed, 28 Jun 2023 18:56:25 +0800
From: "Zhu, Lipeng" <lipeng.zhu@...el.com>
To: akpm@...ux-foundation.org, viro@...iv.linux.org.uk,
brauner@...nel.org
Cc: linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, pan.deng@...el.com, yu.ma@...el.com,
tianyou.li@...el.com, tim.c.chen@...ux.intel.com,
"Zhu, Lipeng" <lipeng.zhu@...el.com>
Subject: [PATCH] fs/address_space: add alignment padding for i_mmap and i_mmap_rwsem to mitigate false sharing

When running UnixBench/Shell Scripts, we observed heavy false sharing
between accesses to i_mmap and accesses to i_mmap_rwsem.

UnixBench/Shell Scripts is a typical load/execute command test scenario:
i_mmap is accessed frequently to insert nodes into and remove nodes from the
vma_interval_tree, while i_mmap_rwsem is frequently loaded. Unfortunately,
the two fields are in the same cache line.

This patch places i_mmap and i_mmap_rwsem in separate cache lines to avoid
this false sharing.
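
The layout change can be illustrated with a small userspace sketch. It is
not the kernel's struct address_space: the types, the filler array, and the
64-byte line size below are assumptions chosen only so that the tree root
and the lock end up sharing a line in the "before" layout. C11 alignas
stands in for ____cacheline_aligned_in_smp, which on SMP builds expands to
an aligned attribute using SMP_CACHE_BYTES.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define LINE_SIZE 64	/* assumed cache line size, not read from hardware */

/* Hypothetical stand-ins for rb_root_cached and rw_semaphore. */
struct tree_root { uint64_t rb_node; uint64_t rb_leftmost; };
struct lock { uint64_t count; uint64_t owner; uint64_t rest[3]; };

/* Before: the lock directly follows the tree root. */
struct before_layout {
	uint64_t filler[7];	/* arbitrary filler, illustration only */
	struct tree_root i_mmap;
	struct lock i_mmap_rwsem;
	uint64_t nrpages;
};

/* After: the lock is forced to start on a cache line boundary. */
struct after_layout {
	uint64_t filler[7];
	struct tree_root i_mmap;
	alignas(LINE_SIZE) struct lock i_mmap_rwsem;
	uint64_t nrpages;
};

static_assert(offsetof(struct after_layout, i_mmap_rwsem) % LINE_SIZE == 0,
	      "lock must start on a cache line boundary");

size_t line_index(size_t byte_offset)
{
	return byte_offset / LINE_SIZE;
}

/* 1 if the last byte of i_mmap and the first byte of the lock share a line. */
int before_shares_line(void)
{
	size_t tree_last = offsetof(struct before_layout, i_mmap) +
			   sizeof(struct tree_root) - 1;
	size_t lock_first = offsetof(struct before_layout, i_mmap_rwsem);

	return line_index(tree_last) == line_index(lock_first);
}

int after_shares_line(void)
{
	size_t tree_last = offsetof(struct after_layout, i_mmap) +
			   sizeof(struct tree_root) - 1;
	size_t lock_first = offsetof(struct after_layout, i_mmap_rwsem);

	return line_index(tree_last) == line_index(lock_first);
}

/* Print both layouts; call this from main() to inspect the offsets. */
void report(void)
{
	printf("before: lock at %zu, shares line with tree: %d, size %zu\n",
	       offsetof(struct before_layout, i_mmap_rwsem),
	       before_shares_line(), sizeof(struct before_layout));
	printf("after:  lock at %zu, shares line with tree: %d, size %zu\n",
	       offsetof(struct after_layout, i_mmap_rwsem),
	       after_shares_line(), sizeof(struct after_layout));
}
```

In this sketch the alignment also grows the structure (the padding before
the lock and the rounded-up total size), which is the usual cost of this
kind of fix. The real size impact on struct address_space depends on the
kernel configuration and is not measured here.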

With this patch, on a 2-socket Intel Sapphire Rapids platform (112 cores/224
threads), based on kernel v6.4-rc4, the 224-parallel score improves by ~2.5%
for the UnixBench/Shell Scripts case. The perf c2c tool shows that the false
sharing is resolved as expected: the symbol vma_interval_tree_remove no
longer appears in cache line 0 after this change.

Baseline:
=================================================
Shared Cache Line Distribution Pareto
=================================================
-------------------------------------------------------------
0 13642 19392 9012 63 0xff1ddd3f0c8a3b00
-------------------------------------------------------------
9.22% 7.37% 0.00% 0.00% 0x0 0 1 0xffffffffab344052 518 334 354 5490 160 [k] vma_interval_tree_remove [kernel.kallsyms] vma_interval_tree_remove+18 0 1
0.71% 0.73% 0.00% 0.00% 0x8 0 1 0xffffffffabb9a21f 574 338 458 1991 160 [k] rwsem_down_write_slowpath [kernel.kallsyms] rwsem_down_write_slowpath+655 0 1
0.52% 0.71% 5.34% 6.35% 0x8 0 1 0xffffffffabb9a236 1080 597 390 4848 160 [k] rwsem_down_write_slowpath [kernel.kallsyms] rwsem_down_write_slowpath+678 0 1
0.56% 0.47% 26.39% 6.35% 0x8 0 1 0xffffffffabb9a5ec 1327 1037 587 8537 160 [k] down_write [kernel.kallsyms] down_write+28 0 1
0.11% 0.08% 15.72% 1.59% 0x8 0 1 0xffffffffab17082b 1618 1077 735 7303 160 [k] up_write [kernel.kallsyms] up_write+27 0 1
0.01% 0.02% 0.08% 0.00% 0x8 0 1 0xffffffffabb9a27d 1594 593 512 53 43 [k] rwsem_down_write_slowpath [kernel.kallsyms] rwsem_down_write_slowpath+749 0 1
0.00% 0.01% 0.00% 0.00% 0x8 0 1 0xffffffffabb9a0c4 0 323 518 97 74 [k] rwsem_down_write_slowpath [kernel.kallsyms] rwsem_down_write_slowpath+308 0 1
44.74% 49.78% 0.00% 0.00% 0x10 0 1 0xffffffffab170995 609 344 430 26841 160 [k] rwsem_spin_on_owner [kernel.kallsyms] rwsem_spin_on_owner+53 0 1
26.62% 22.39% 0.00% 0.00% 0x10 0 1 0xffffffffab170965 514 347 437 13364 160 [k] rwsem_spin_on_owner [kernel.kallsyms] rwsem_spin_on_owner+5 0 1
With this change:
-------------------------------------------------------------
0 12726 18554 9039 49 0xff157a0f25b90c40
-------------------------------------------------------------
0.90% 0.72% 0.00% 0.00% 0x0 1 1 0xffffffffa5f9a21f 532 353 461 2200 160 [k] rwsem_down_write_slowpath [kernel.kallsyms] rwsem_down_write_slowpath+655 0 1
0.53% 0.70% 5.16% 6.12% 0x0 1 1 0xffffffffa5f9a236 1196 670 403 4774 160 [k] rwsem_down_write_slowpath [kernel.kallsyms] rwsem_down_write_slowpath+678 0 1
0.68% 0.51% 25.91% 6.12% 0x0 1 1 0xffffffffa5f9a5ec 1049 807 540 8552 160 [k] down_write [kernel.kallsyms] down_write+28 0 1
0.09% 0.06% 16.50% 2.04% 0x0 1 1 0xffffffffa557082b 1693 1351 758 7317 160 [k] up_write [kernel.kallsyms] up_write+27 0 1
0.01% 0.00% 0.00% 0.00% 0x0 1 1 0xffffffffa5f9a0c4 543 0 491 89 68 [k] rwsem_down_write_slowpath [kernel.kallsyms] rwsem_down_write_slowpath+308 0 1
0.00% 0.01% 0.02% 0.00% 0x0 1 1 0xffffffffa5f9a27d 0 597 742 45 40 [k] rwsem_down_write_slowpath [kernel.kallsyms] rwsem_down_write_slowpath+749 0 1
49.29% 53.01% 0.00% 0.00% 0x8 1 1 0xffffffffa5570995 580 310 413 27106 160 [k] rwsem_spin_on_owner [kernel.kallsyms] rwsem_spin_on_owner+53 0 1
28.60% 24.12% 0.00% 0.00% 0x8 1 1 0xffffffffa5570965 490 321 419 13244 160 [k] rwsem_spin_on_owner [kernel.kallsyms] rwsem_spin_on_owner+5 0 1
Reviewed-by: Tim Chen <tim.c.chen@...ux.intel.com>
Signed-off-by: Lipeng Zhu <lipeng.zhu@...el.com>
---
include/linux/fs.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index c85916e9f7db..d3dd8dcc9b8b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -434,7 +434,7 @@ struct address_space {
 	atomic_t nr_thps;
 #endif
 	struct rb_root_cached i_mmap;
-	struct rw_semaphore i_mmap_rwsem;
+	struct rw_semaphore i_mmap_rwsem ____cacheline_aligned_in_smp;
 	unsigned long nrpages;
 	pgoff_t writeback_index;
 	const struct address_space_operations *a_ops;
--
2.39.1