Message-Id: <20250914121109.36403-1-wangyufei@vivo.com>
Date: Sun, 14 Sep 2025 20:11:07 +0800
From: wangyufei <wangyufei@...o.com>
To: viro@...iv.linux.org.uk,
brauner@...nel.org,
jack@...e.cz,
cem@...nel.org
Cc: kundan.kumar@...sung.com,
anuj20.g@...sung.com,
hch@....de,
bernd@...ernd.com,
djwong@...nel.org,
david@...morbit.com,
linux-kernel@...r.kernel.org,
linux-xfs@...r.kernel.org,
linux-fsdevel@...r.kernel.org,
opensource.kernel@...o.com,
wangyufei <wangyufei@...o.com>
Subject: [RFC 0/2] writeback: add support for filesystems to optimize parallel writeback
Based on the parallel writeback testing on XFS [1] and prior discussions,
we believe that a filesystem's features and architecture must be taken
into account to optimize parallel writeback performance.
We introduce a filesystem interface to control the assignment of inodes
to writeback contexts based on the following insights:
- Following Dave's earlier suggestion [2], filesystems should determine
both the number of writeback contexts and how inodes are assigned to them.
Therefore, we provide an interface for filesystems to customize their
inode assignment strategy for writeback.
- Instead of dynamically changing the number of writeback contexts at
filesystem initialization, we let each filesystem decide how many
contexts it requires and push inodes only to those designated contexts.
To implement this, we have made the following changes:
- Introduce get_inode_wb_ctx_idx() in super_operations, called from
fetch_bdi_writeback_ctx(), allowing a filesystem to provide the
writeback context index for an inode. This generic interface can be
extended to all filesystems.
- Implement the XFS adaptation. To reduce contention during delayed
allocation, all inodes belonging to the same allocation group (AG) are
bound to the same writeback context (a rough sketch follows below).
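A rough sketch of how the two pieces fit together (the per-bdi context
array and helper names below follow the parallel writeback series in
[1] and are approximate; see the patches for the exact code):

/* include/linux/fs.h: new optional super_operations callback */
struct super_operations {
	/* ... existing callbacks ... */
	/* Return the writeback context index this inode should use. */
	long (*get_inode_wb_ctx_idx)(struct inode *inode, int nr_wb_ctx);
};

/* include/linux/backing-dev.h: let the filesystem pick the context */
static inline struct bdi_writeback_ctx *
fetch_bdi_writeback_ctx(struct inode *inode)
{
	struct backing_dev_info *bdi = inode_to_bdi(inode);
	long idx;

	if (inode->i_sb->s_op->get_inode_wb_ctx_idx)
		idx = inode->i_sb->s_op->get_inode_wb_ctx_idx(inode,
							bdi->nr_wb_ctx);
	else
		idx = inode->i_ino % bdi->nr_wb_ctx; /* simple spread */
	return bdi->wb_ctx_arr[idx];
}

/* fs/xfs/xfs_super.c: one writeback context per allocation group */
static long
xfs_get_inode_wb_ctx_idx(struct inode *inode, int nr_wb_ctx)
{
	struct xfs_inode	*ip = XFS_I(inode);
	struct xfs_mount	*mp = ip->i_mount;
	xfs_agnumber_t		agno = XFS_INO_TO_AGNO(mp, ip->i_ino);

	/* Wrap if there are more AGs than writeback contexts. */
	return agno % nr_wb_ctx;
}

With 4 AGs and nr_wb_ctx = 8, only contexts 0-3 receive inodes, which
is why the nr_wb_ctx = 8 run below reports 4 active contexts.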
With the test setup from [1], we obtained the results below. Our approach
achieves throughput comparable to nr_wb_ctx=4 but shows no further
improvement, and perf data indicates that lock contention during delayed
allocation remains unresolved.
System config:
Number of CPUs = 8
System RAM = 4G
Number of XFS AGs = 4
Storage = 20GB NVMe SSD (emulated via QEMU)
Result:
Default:
Parallel Writeback (nr_wb_ctx = 1) : 16.4MiB/s
Parallel Writeback (nr_wb_ctx = 2) : 32.3MiB/s
Parallel Writeback (nr_wb_ctx = 3) : 39.0MiB/s
Parallel Writeback (nr_wb_ctx = 4) : 47.3MiB/s
Parallel Writeback (nr_wb_ctx = 5) : 45.7MiB/s
Parallel Writeback (nr_wb_ctx = 6) : 46.0MiB/s
Parallel Writeback (nr_wb_ctx = 7) : 42.7MiB/s
Parallel Writeback (nr_wb_ctx = 8) : 40.8MiB/s
After optimization (4 AGs utilized):
Parallel Writeback (nr_wb_ctx = 8) : 47.1MiB/s (4 active contexts)
These results raise the following questions:
1. How can we design workloads that better expose the lock contention in
delayed allocation?
2. Given the lack of performance improvement, is there an oversight or
misunderstanding in our implementation of the XFS interface, or is there
some other performance bottleneck?
[1]
https://lore.kernel.org/linux-fsdevel/CALYkqXpOBb1Ak2kEKWbO2Kc5NaGwb4XsX1q4eEaNWmO_4SQq9w@mail.gmail.com/
[2]
https://lore.kernel.org/linux-fsdevel/Z5qw_1BOqiFum5Dn@dread.disaster.area/
wangyufei (2):
writeback: add support for filesystems to affine inodes to specific
writeback ctx
xfs: implement get_inode_wb_ctx_idx() for per-AG parallel writeback
fs/xfs/xfs_super.c | 14 ++++++++++++++
include/linux/backing-dev.h | 3 +++
include/linux/fs.h | 1 +
3 files changed, 18 insertions(+)
--
2.34.1