Message-ID: <Zqa1ZZrrlp5jHElW@casper.infradead.org>
Date: Sun, 28 Jul 2024 22:17:25 +0100
From: Matthew Wilcox <willy@...radead.org>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Gao Xiang <hsiangkao@...ux.alibaba.com>,
Huang Ying <ying.huang@...el.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/migrate: fix deadlock in migrate_pages_batch() on
large folios
On Sun, Jul 28, 2024 at 12:50:05PM -0700, Andrew Morton wrote:
> On Sun, 28 Jul 2024 23:49:13 +0800 Gao Xiang <hsiangkao@...ux.alibaba.com> wrote:
> > Currently, migrate_pages_batch() can lock multiple locked folios
> > with an arbitrary order. Although folio_trylock() is used to avoid
> > deadlock as commit 2ef7dbb26990 ("migrate_pages: try migrate in batch
> > asynchronously firstly") mentioned, it seems try_split_folio() is
> > still missing.
>
> Am I correct in believing that folio_lock() doesn't have lockdep coverage?
Yes. It can't; it is taken in process context and released by whatever
context the read completion happens in (could be hard/soft irq, could be
a workqueue, could be J. Random kthread, depending on the device driver).
So it doesn't match the lockdep model at all.
> > It was found by compaction stress test when I explicitly enable EROFS
> > compressed files to use large folios, which case I cannot reproduce with
> > the same workload if large folio support is off (current mainline).
> > Typically, filesystem reads (with locked file-backed folios) could use
> > another bdev/meta inode to load some other I/Os (e.g. inode extent
> > metadata or caching compressed data), so the locking order will be:
>
> Which kernels need fixing? Do we expect that any code paths in 6.10 or
> earlier are vulnerable to this?
I would suggest it goes back to the introduction of large folios, but
that's just a gut feeling based on absolutely no reading of code or
inspection of git history.