lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20241024060954.443574-1-jimzhao.ai@gmail.com>
Date: Thu, 24 Oct 2024 14:09:54 +0800
From: Jim Zhao <jimzhao.ai@...il.com>
To: akpm@...ux-foundation.org
Cc: jimzhao.ai@...il.com,
	linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	linux-mm@...ck.org,
	willy@...radead.org
Subject: Re: [PATCH] mm/page-writeback: Raise wb_thresh to prevent write blocking with strictlimit

> On Wed, 23 Oct 2024 18:00:32 +0800 Jim Zhao <jimzhao.ai@...il.com> wrote:

> > With the strictlimit flag, wb_thresh acts as a hard limit in
> > balance_dirty_pages() and wb_position_ratio(). When device write
> > operations are inactive, wb_thresh can drop to 0, causing writes to
> > be blocked. The issue occasionally occurs in fuse fs, particularly
> > with network backends, the write thread is blocked frequently during
> > a period. To address it, this patch raises the minimum wb_thresh to a
> > controllable level, similar to the non-strictlimit case.

> Please tell us more about the userspace-visible effects of this.  It
> *sounds* like a serious (but occasional) problem, but that is unclear.

> And, very much relatedly, do you feel this fix is needed in earlier
> (-stable) kernels?

The problem exists in two scenarios:
1. FUSE Write Transition from Inactive to Active

sometimes, active writes require several pauses to ramp up to the appropriate wb_thresh.
As shown in the trace below, both bdi_setpoint and task_ratelimit are 0, means wb_thresh is 0. 
The dd process pauses multiple times before reaching a normal state.

dd-1206590 [003] .... 62988.324049: balance_dirty_pages: bdi 0:51: limit=295073 setpoint=259360 dirty=454 bdi_setpoint=0 bdi_dirty=32 dirty_ratelimit=18716 task_ratelimit=0 dirtied=32 dirtied_pause=32 paused=0 pause=4 period=4 think=0 cgroup_ino=1
dd-1206590 [003] .... 62988.332063: balance_dirty_pages: bdi 0:51: limit=295073 setpoint=259453 dirty=454 bdi_setpoint=0 bdi_dirty=33 dirty_ratelimit=18716 task_ratelimit=0 dirtied=1 dirtied_pause=0 paused=0 pause=4 period=4 think=4 cgroup_ino=1
dd-1206590 [003] .... 62988.340064: balance_dirty_pages: bdi 0:51: limit=295073 setpoint=259526 dirty=454 bdi_setpoint=0 bdi_dirty=34 dirty_ratelimit=18716 task_ratelimit=0 dirtied=1 dirtied_pause=0 paused=0 pause=4 period=4 think=4 cgroup_ino=1
dd-1206590 [003] .... 62988.348061: balance_dirty_pages: bdi 0:51: limit=295073 setpoint=259531 dirty=489 bdi_setpoint=0 bdi_dirty=35 dirty_ratelimit=18716 task_ratelimit=0 dirtied=1 dirtied_pause=0 paused=0 pause=4 period=4 think=4 cgroup_ino=1
dd-1206590 [003] .... 62988.356063: balance_dirty_pages: bdi 0:51: limit=295073 setpoint=259531 dirty=490 bdi_setpoint=0 bdi_dirty=36 dirty_ratelimit=18716 task_ratelimit=0 dirtied=1 dirtied_pause=0 paused=0 pause=4 period=4 think=4 cgroup_ino=1
...

2. FUSE with Unstable Network Backends and Occasional Writes
Not easy to reproduce, but when it occurs in this scenario, 
it causes the write thread to experience more pauses and longer durations.


Currently, some code is in place to improve this situation, but seems insufficient:
if (dtc->wb_dirty < 8)
{
	// ...
}

So the patch raise min wb_thresh to keep the occasional writes won't be blocked and
active writes can rampup the threshold quickly.

--

Thanks,
Jim Zhao


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ