Message-Id: <20250914121109.36403-1-wangyufei@vivo.com>
Date: Sun, 14 Sep 2025 20:11:07 +0800
From: wangyufei <wangyufei@...o.com>
To: viro@...iv.linux.org.uk,
brauner@...nel.org,
jack@...e.cz,
cem@...nel.org
Cc: kundan.kumar@...sung.com,
anuj20.g@...sung.com,
hch@....de,
bernd@...ernd.com,
djwong@...nel.org,
david@...morbit.com,
linux-kernel@...r.kernel.org,
linux-xfs@...r.kernel.org,
linux-fsdevel@...r.kernel.org,
opensource.kernel@...o.com,
wangyufei <wangyufei@...o.com>
Subject: [RFC 0/2] writeback: add support for filesystems to optimize parallel writeback
Based on the parallel writeback testing on XFS [1] and prior discussions,
we believe that a filesystem's features and architecture must be taken
into account to optimize parallel writeback performance.
We introduce a filesystem interface to control the assignment of inodes
to writeback contexts based on the following insights:
- Following Dave's earlier suggestion [2], filesystems should determine
both the number of writeback contexts and how inodes are assigned to them.
Therefore, we provide an interface for filesystems to customize their
inode assignment strategy for writeback.
- Instead of dynamically changing the number of writeback contexts at
filesystem initialization, we let each filesystem decide how many
contexts it requires and push inodes only to those designated contexts.
To implement this, we have made the following changes:
- Introduce get_inode_wb_ctx_idx() in super_operations, called from
fetch_bdi_writeback_ctx(), allowing a filesystem to provide the
writeback context index for an inode. This generic interface can be
extended to all filesystems.
- Implement the XFS adaptation. To reduce contention during delayed
allocation, all inodes belonging to the same allocation group (AG) are
bound to the same writeback context (a rough sketch follows below).
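A rough sketch of how the two pieces fit together (the per-bdi context
array and helper names below follow the parallel writeback series in
[1] and are approximate; see the patches for the exact code):

/* include/linux/fs.h: new optional super_operations callback */
struct super_operations {
	/* ... existing callbacks ... */
	/* Return the writeback context index this inode should use. */
	long (*get_inode_wb_ctx_idx)(struct inode *inode, int nr_wb_ctx);
};

/* include/linux/backing-dev.h: let the filesystem pick the context */
static inline struct bdi_writeback_ctx *
fetch_bdi_writeback_ctx(struct inode *inode)
{
	struct backing_dev_info *bdi = inode_to_bdi(inode);
	long idx;

	if (inode->i_sb->s_op->get_inode_wb_ctx_idx)
		idx = inode->i_sb->s_op->get_inode_wb_ctx_idx(inode,
							bdi->nr_wb_ctx);
	else
		idx = inode->i_ino % bdi->nr_wb_ctx; /* simple spread */
	return bdi->wb_ctx_arr[idx];
}

/* fs/xfs/xfs_super.c: one writeback context per allocation group */
static long
xfs_get_inode_wb_ctx_idx(struct inode *inode, int nr_wb_ctx)
{
	struct xfs_inode	*ip = XFS_I(inode);
	struct xfs_mount	*mp = ip->i_mount;
	xfs_agnumber_t		agno = XFS_INO_TO_AGNO(mp, ip->i_ino);

	/* Wrap if there are more AGs than writeback contexts. */
	return agno % nr_wb_ctx;
}

With 4 AGs and nr_wb_ctx = 8, only contexts 0-3 receive inodes, which
is why the nr_wb_ctx = 8 run below reports 4 active contexts.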
With the test setup from [1], we obtained the results below. Our approach
achieves throughput comparable to nr_wb_ctx=4 but shows no further
improvement, and perf data indicates that lock contention during delayed
allocation remains unresolved.
System config:
Number of CPUs = 8
System RAM = 4G
Number of XFS AGs = 4
Storage = 20GB NVMe SSD (emulated via QEMU)
Result:
Default:
Parallel Writeback (nr_wb_ctx = 1) : 16.4MiB/s
Parallel Writeback (nr_wb_ctx = 2) : 32.3MiB/s
Parallel Writeback (nr_wb_ctx = 3) : 39.0MiB/s
Parallel Writeback (nr_wb_ctx = 4) : 47.3MiB/s
Parallel Writeback (nr_wb_ctx = 5) : 45.7MiB/s
Parallel Writeback (nr_wb_ctx = 6) : 46.0MiB/s
Parallel Writeback (nr_wb_ctx = 7) : 42.7MiB/s
Parallel Writeback (nr_wb_ctx = 8) : 40.8MiB/s
After optimization (4 AGs utilized):
Parallel Writeback (nr_wb_ctx = 8) : 47.1MiB/s (4 active contexts)
These results raise the following questions:
1. How can we design workloads that better expose the lock contention in
delayed allocation?
2. Given the lack of performance improvement, is there an oversight or
misunderstanding in our implementation of the XFS interface, or is there
some other performance bottleneck?
[1]
https://lore.kernel.org/linux-fsdevel/CALYkqXpOBb1Ak2kEKWbO2Kc5NaGwb4XsX1q4eEaNWmO_4SQq9w@mail.gmail.com/
[2]
https://lore.kernel.org/linux-fsdevel/Z5qw_1BOqiFum5Dn@dread.disaster.area/
wangyufei (2):
writeback: add support for filesystems to affine inodes to specific
writeback ctx
xfs: implement get_inode_wb_ctx_idx() for per-AG parallel writeback
fs/xfs/xfs_super.c | 14 ++++++++++++++
include/linux/backing-dev.h | 3 +++
include/linux/fs.h | 1 +
3 files changed, 18 insertions(+)
--
2.34.1