Message-ID: <dglxbwe2i5ubofefdxwo5jvyhdfjov37z5jzc5guedhe4dl6ia@pmkjkec3isb4>
Date: Fri, 3 Oct 2025 15:04:50 +0100
From: Pedro Falcato <pfalcato@...e.de>
To: David Hildenbrand <david@...hat.com>
Cc: Byungchul Park <byungchul@...com>, akpm@...ux-foundation.org, 
	ziy@...dia.com, matthew.brost@...el.com, joshua.hahnjy@...il.com, 
	rakie.kim@...com, gourry@...rry.net, ying.huang@...ux.alibaba.com, 
	apopple@...dia.com, clameter@....com, kravetz@...ibm.com, linux-mm@...ck.org, 
	linux-kernel@...r.kernel.org, max.byungchul.park@...il.com, kernel_team@...ynix.com, 
	harry.yoo@...cle.com, gwan-gyeong.mun@...el.com, yeoreum.yun@....com, 
	syzkaller@...glegroups.com, ysk@...lloc.com, Matthew Wilcox <willy@...radead.org>, 
	linux-ext4@...r.kernel.org
Subject: Re: [RFC] mm/migrate: make sure folio_unlock() before
 folio_wait_writeback()

(Adding ext4 list to CC)

On Thu, Oct 02, 2025 at 01:38:59PM +0200, David Hildenbrand wrote:
> > To simplify the scenario:
> > 
> 
> Just curious, where is the __folio_start_writeback() to complete the
> picture?
> 
> >     context X (wq worker)	context Y (process context)
> > 
> > 				migrate_pages_batch()
> >     ext4_end_io_end()		  ...
> >       ...			  migrate_folio_unmap()
> >       ext4_get_inode_loc()	    ...
> >         ...			    folio_lock() // hold the folio lock
> >         bdev_getblk()		    ...
> >           ...			    folio_wait_writeback() // wait forever
> >           __find_get_block_slow()
> >             ...			    ...
> >             folio_lock() // wait forever
> >             folio_unlock()	  migrate_folio_undo_src()
> > 				    ...
> >       ...			    folio_unlock() // never reachable
> >       ext4_finish_bio()
> > 	...
> > 	folio_end_writeback() // never reachable
> > 
> 
> But aren't you implying that it should from this point on be disallowed to
> call folio_wait_writeback() with the folio lock held? That sounds ... a bit
> wrong.
> 
> Note that it is currently explicitly allowed: folio_wait_writeback()
> documents "If the folio is not locked, writeback may start again after
> writeback has finished.". So there is no way to prevent writeback from
> immediately starting again.
> 
> In particular, wouldn't we have to fixup other callsites to make this
> consistent and then VM_WARN_ON_ONCE() assert that in folio_wait_writeback()?
> 
> Of course, as we've never seen this deadlock before in practice, I do wonder
> if something else prevents it?

As far as I can tell, the folio under writeback and the folio that
__find_get_block() finds will _never_ be the same. ext4_end_io_end() is
called for pages in an inode's address_space, while bdev_getblk() is called
for metadata blocks in the block cache. An actual deadlock here would mean
that the folio is somehow in both an inode's address_space and the block
cache, I think? Also, AFAIK there is no way a folio can be removed from the
page cache while under writeback.
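
To make the argument above concrete, here is a minimal toy model in C. It is
only a sketch, not kernel code: struct toy_folio, enum toy_owner and
toy_would_deadlock() are hypothetical names made up for illustration, and the
model assumes (as in the quoted scenario) that context Y holds the lock on the
folio it is migrating while context X must lock the folio returned by the
block lookup before it can end writeback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these are NOT the kernel's structures or APIs. */
enum toy_owner {
	TOY_INODE_MAPPING,	/* file data in an inode's address_space */
	TOY_BDEV_MAPPING,	/* metadata blocks in the block cache */
};

struct toy_folio {
	enum toy_owner owner;
	bool locked;		/* lock held by the migration path (context Y) */
	bool writeback;		/* cleared only by the end-io worker (context X) */
};

/*
 * Context Y holds the lock on `migrating` and waits for its writeback to end.
 * Context X has to lock `lookup` before it can end writeback on `migrating`.
 * The wait cycle closes only if X blocks on a lock that Y holds, which in
 * this model means `lookup` and `migrating` are the same folio.
 */
static bool toy_would_deadlock(const struct toy_folio *migrating,
			       const struct toy_folio *lookup)
{
	bool y_waits_on_x = migrating->locked && migrating->writeback;
	bool x_waits_on_y = (lookup == migrating) && lookup->locked;

	return y_waits_on_x && x_waits_on_y;
}
```

With a data folio (TOY_INODE_MAPPING, locked, under writeback) and a separate
metadata folio (TOY_BDEV_MAPPING), toy_would_deadlock() returns false; it only
returns true when both arguments point at the same folio, which is the case
argued above to be impossible.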

In any case, I added linux-ext4 so they can tell me how right/wrong I am.

-- 
Pedro