lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <683632b425dc2_3e701009c@dwillia2-xfh.jf.intel.com.notmuch>
Date: Tue, 27 May 2025 14:46:28 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: Alistair Popple <apopple@...dia.com>, <akpm@...ux-foundation.org>,
	<linux-fsdevel@...r.kernel.org>, <nvdimm@...ts.linux.dev>,
	<dan.j.williams@...el.com>, <willy@...radead.org>
CC: <linux-kernel@...r.kernel.org>, Alistair Popple <apopple@...dia.com>,
	Alison Schofield <alison.schofield@...el.com>, Balbir Singh
	<balbirs@...dia.com>, "Darrick J. Wong" <djwong@...nel.org>, Dave Chinner
	<david@...morbit.com>, David Hildenbrand <david@...hat.com>, Jan Kara
	<jack@...e.cz>, John Hubbard <jhubbard@...dia.com>, Ted Ts'o <tytso@....edu>,
	Alexander Viro <viro@...iv.linux.org.uk>, Christian Brauner
	<brauner@...nel.org>
Subject: Re: [PATCH] fs/dax: Fix "don't skip locked entries when scanning
 entries"

Alistair Popple wrote:
> Commit 6be3e21d25ca ("fs/dax: don't skip locked entries when scanning
> entries") introduced a new function, wait_entry_unlocked_exclusive(),
> which waits for the current entry to become unlocked without advancing
> the XArray iterator state.
> 
> Waiting for the entry to become unlocked requires dropping the XArray
> lock. This requires calling xas_pause() prior to dropping the lock
> which leaves the xas in a suitable state for the next iteration. However
> this has the side-effect of advancing the xas state to the next index.
> Normally this isn't an issue because xas_for_each() contains code to
> detect this state and thus avoid advancing the index a second time on
> the next loop iteration.
> 
> However both callers of and wait_entry_unlocked_exclusive() itself
> subsequently use the xas state to reload the entry. As xas_pause()
> updated the state to the next index this will cause the current entry
> which is being waited on to be skipped. This caused the following
> warning to fire intermittently when running xftest generic/068 on an XFS
> filesystem with FS DAX enabled:
> 
> [   35.067397] ------------[ cut here ]------------
> [   35.068229] WARNING: CPU: 21 PID: 1640 at mm/truncate.c:89 truncate_folio_batch_exceptionals+0xd8/0x1e0
[..]
> 
> Fix this by using xas_reset() instead, which is equivalent in
> implementation to xas_pause() but does not advance the XArray state.
> 
> Fixes: 6be3e21d25ca ("fs/dax: don't skip locked entries when scanning entries")
> Signed-off-by: Alistair Popple <apopple@...dia.com>
[..]
> 
> Hi Andrew,
> 
> Apologies for finding this so late in the cycle. This is a very
> intermittent issue for me, and it seems it was only exposed by a recent
> upgrade to my test machine/setup. The user visible impact is the same
> as for the original commit this fixes. That is possible file data
> corruption if a device has a FS DAX page pinned for DMA.
> 
> So in other words it means my original fix was not 100% effective.
> The issue that commit fixed has existed for a long time without being
> reported, so not sure if this is worth trying to get into v6.15 or not.
> 
> Either way I figured it would be best to send this ASAP, which means I
> am still waiting for a complete xfstest run to complete (although the
> failing test does now pass cleanly).
> ---
>  fs/dax.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/dax.c b/fs/dax.c
> index 676303419e9e..f8d8b1afd232 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -257,7 +257,7 @@ static void *wait_entry_unlocked_exclusive(struct xa_state *xas, void *entry)
>  		wq = dax_entry_waitqueue(xas, entry, &ewait.key);
>  		prepare_to_wait_exclusive(wq, &ewait.wait,
>  					TASK_UNINTERRUPTIBLE);
> -		xas_pause(xas);
> +		xas_reset(xas);
>  		xas_unlock_irq(xas);
>  		schedule();
>  		finish_wait(wq, &ewait.wait);

This looks super-subtle, but so did the original fix commit 6be3e21d25ca
("fs/dax: don't skip locked entries when scanning entries"). The
resolution is the same to make sure the xarray state does not mistakenly
advance when the lock is dropped.

You can add:

Reviewed-by: Dan Williams <dan.j.williams@...el.com>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ