Message-ID: <Zqa8NTqKuXkTxzBw@casper.infradead.org>
Date: Sun, 28 Jul 2024 22:46:29 +0100
From: Matthew Wilcox <willy@...radead.org>
To: Gao Xiang <hsiangkao@...ux.alibaba.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
	Huang Ying <ying.huang@...el.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/migrate: fix deadlock in migrate_pages_batch() on
 large folios

On Sun, Jul 28, 2024 at 11:49:13PM +0800, Gao Xiang wrote:
> It was found by compaction stress test when I explicitly enable EROFS
> compressed files to use large folios, which case I cannot reproduce with
> the same workload if large folio support is off (current mainline).
> Typically, filesystem reads (with locked file-backed folios) could use
> another bdev/meta inode to load some other I/Os (e.g. inode extent
> metadata or caching compressed data), so the locking order will be:

Umm.  That is a new constraint to me.  We have two other places which
take the folio lock in a particular order.  Writeback takes locks on
folios belonging to the same inode in ascending ->index order.  It
submits all the folios for write before moving on to lock other inodes,
so it does not conflict with this new constraint you're proposing.
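For anyone who wants the writeback rule spelled out, here is a minimal userspace sketch of "lock in ascending ->index order".  Everything in it is an assumption made for illustration: struct toy_wb_folio, toy_lock_ascending() and toy_unlock_all() are hypothetical names, the lock is a C11 atomic_flag rather than the real folio lock, and it spins where folio_lock() would sleep.  The only point is that two lockers which both sort by index before locking cannot end up holding each other's next folio.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a folio: an index plus a lock bit (not struct folio). */
struct toy_wb_folio {
	unsigned long index;
	atomic_flag locked;
};

/* qsort() comparator over an array of pointers, ordering by ->index. */
static int toy_cmp_index(const void *a, const void *b)
{
	const struct toy_wb_folio *fa = *(struct toy_wb_folio *const *)a;
	const struct toy_wb_folio *fb = *(struct toy_wb_folio *const *)b;

	return (fa->index > fb->index) - (fa->index < fb->index);
}

/*
 * Sort the batch by index, then take each lock in that order.
 * The array is left sorted so the caller can see the locking order.
 */
static void toy_lock_ascending(struct toy_wb_folio **folios, size_t n)
{
	qsort(folios, n, sizeof(*folios), toy_cmp_index);
	for (size_t i = 0; i < n; i++) {
		/* Spin until acquired; a real folio_lock() would sleep. */
		while (atomic_flag_test_and_set_explicit(&folios[i]->locked,
							 memory_order_acquire))
			;
	}
}

/* Release every lock in the batch. */
static void toy_unlock_all(struct toy_wb_folio **folios, size_t n)
{
	for (size_t i = 0; i < n; i++)
		atomic_flag_clear_explicit(&folios[i]->locked,
					   memory_order_release);
}
```

A walker that visits the same folios in some other order (as migration does) gets no such guarantee, which is why it has to be prepared to back off instead of blocking.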

The other place is remap_file_range().  Both inodes in that case must be
regular files,
        if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode))
                return -EINVAL;
so this new rule is fine.

Does anybody know of any _other_ ordering constraints on folio locks?  I'm
willing to write them down ...

> diff --git a/mm/migrate.c b/mm/migrate.c
> index 20cb9f5f7446..a912e4b83228 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1483,7 +1483,8 @@ static inline int try_split_folio(struct folio *folio, struct list_head *split_f
>  {
>  	int rc;
>  
> -	folio_lock(folio);
> +	if (!folio_trylock(folio))
> +		return -EAGAIN;
>  	rc = split_folio_to_list(folio, split_folios);
>  	folio_unlock(folio);
>  	if (!rc)

This feels like the best quick fix to me since migration is going to
walk the folios in a different order from writeback.  I'm surprised
this hasn't already bitten us, to be honest.

(i.e. I don't think this is even necessarily connected to the new
ordering constraint; I think migration and writeback can already
deadlock)
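To make the shape of the quick fix concrete, here is a rough userspace sketch of the trylock-and-bail pattern from the hunk above.  It is not the kernel code: struct toy_folio, toy_folio_trylock(), toy_try_split() and toy_split_batch() are hypothetical stand-ins, the lock is a C11 atomic_flag, and the "split" is just a counter.  It only shows that a batch walker which gets -EAGAIN for a contended folio moves on and can come back later, instead of sleeping on a lock while holding others.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a folio: a lock bit and a split counter. */
struct toy_folio {
	atomic_flag locked;
	int splits;
};

static void toy_folio_init(struct toy_folio *f)
{
	atomic_flag_clear(&f->locked);
	f->splits = 0;
}

/* Returns true if the lock was taken, false if somebody else holds it. */
static bool toy_folio_trylock(struct toy_folio *f)
{
	return !atomic_flag_test_and_set_explicit(&f->locked,
						  memory_order_acquire);
}

static void toy_folio_unlock(struct toy_folio *f)
{
	atomic_flag_clear_explicit(&f->locked, memory_order_release);
}

/* Mirrors the patched control flow: never block, report -EAGAIN instead. */
static int toy_try_split(struct toy_folio *f)
{
	if (!toy_folio_trylock(f))
		return -EAGAIN;
	f->splits++;		/* placeholder for the real split work */
	toy_folio_unlock(f);
	return 0;
}

/* Walk a batch in arbitrary order; return how many folios were deferred. */
static size_t toy_split_batch(struct toy_folio *batch, size_t n)
{
	size_t deferred = 0;

	for (size_t i = 0; i < n; i++)
		if (toy_try_split(&batch[i]) == -EAGAIN)
			deferred++;
	return deferred;
}
```

If one folio in a batch of three is held elsewhere, toy_split_batch() returns 1 and the other two are still processed; once the holder unlocks, a second pass picks up the deferred one.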
