Message-ID: <9a586b5b-c47f-45eb-83c8-1e86431fc83d@redhat.com>
Date: Thu, 2 Oct 2025 13:38:59 +0200
From: David Hildenbrand <david@...hat.com>
To: Byungchul Park <byungchul@...com>, akpm@...ux-foundation.org
Cc: ziy@...dia.com, matthew.brost@...el.com, joshua.hahnjy@...il.com,
 rakie.kim@...com, gourry@...rry.net, ying.huang@...ux.alibaba.com,
 apopple@...dia.com, clameter@....com, kravetz@...ibm.com,
 linux-mm@...ck.org, linux-kernel@...r.kernel.org,
 max.byungchul.park@...il.com, kernel_team@...ynix.com, harry.yoo@...cle.com,
 gwan-gyeong.mun@...el.com, yeoreum.yun@....com, syzkaller@...glegroups.com,
 ysk@...lloc.com, Matthew Wilcox <willy@...radead.org>
Subject: Re: [RFC] mm/migrate: make sure folio_unlock() before
 folio_wait_writeback()

> To simplify the scenario:
> 

Just curious, where is the __folio_start_writeback() to complete the 
picture?

>     context X (wq worker)	context Y (process context)
> 
> 				migrate_pages_batch()
>     ext4_end_io_end()		  ...
>       ...			  migrate_folio_unmap()
>       ext4_get_inode_loc()	    ...
>         ...			    folio_lock() // hold the folio lock
>         bdev_getblk()		    ...
>           ...			    folio_wait_writeback() // wait forever
>           __find_get_block_slow()
>             ...			    ...
>             folio_lock() // wait forever
>             folio_unlock()	  migrate_folio_undo_src()
> 				    ...
>       ...			    folio_unlock() // never reachable
>       ext4_finish_bio()
> 	...
> 	folio_end_writeback() // never reachable
> 

But aren't you implying that, from now on, calling 
folio_wait_writeback() with the folio lock held should be disallowed? 
That sounds ... a bit wrong.

Note that it is currently explicitly allowed: folio_wait_writeback() 
documents "If the folio is not locked, writeback may start again after 
writeback has finished." So without the folio lock, there is no way to 
prevent writeback from immediately starting again.

In particular, wouldn't we have to fix up the other callsites to make 
this consistent and then assert it with a VM_WARN_ON_ONCE() in 
folio_wait_writeback()?
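
To make that concrete, here is a rough user-space sketch of the kind of 
entry check I have in mind. Everything named model_* is a hypothetical 
stand-in, not kernel API; in the kernel this would presumably be 
something along the lines of VM_WARN_ON_ONCE(folio_test_locked(folio)) 
at the top of folio_wait_writeback(), and only once all callers have 
been converted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for a folio; not the kernel structure. */
struct model_folio {
	bool locked;     /* models the folio lock being held */
	bool writeback;  /* models the writeback flag being set */
};

static int model_warn_count; /* how many times the warning was printed */

/* Warn-once helper, loosely modeled on the semantics of VM_WARN_ON_ONCE(). */
static bool model_warn_on_once(bool cond, const char *msg)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		model_warn_count++;
		fprintf(stderr, "WARNING: %s\n", msg);
	}
	return cond;
}

/*
 * Models only the entry check: returns true if the caller would be
 * allowed to wait, false if it called in with the folio lock held.
 */
static bool model_wait_writeback_allowed(const struct model_folio *folio)
{
	return !model_warn_on_once(folio->locked,
				   "waiting for writeback with folio lock held");
}
```

The check keeps reporting the violation through its return value, but 
only prints once, which is what you would want while hunting down the 
remaining callsites.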

Of course, as we've never seen this deadlock in practice before, I do 
wonder whether something else prevents it.

If it's a real issue, I wonder if a trylock on the writeback path could 
be an option.
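
Very roughly, and only as a user-space model: the wb_* names below are 
made up for illustration, and a C11 atomic_flag stands in for the folio 
lock bit. The point is just that the completion side never sleeps on the 
lock; what it should do when the trylock fails (retry later, defer, or 
use some path that does not need the lock) is exactly the open question, 
so the sketch simply reports failure.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a folio; not the kernel structure. */
struct wb_folio {
	atomic_flag locked;  /* models the folio lock bit */
	bool writeback;      /* models the writeback flag */
};

/* Non-blocking lock attempt: true if we took the lock, false if it is held. */
static bool wb_folio_trylock(struct wb_folio *folio)
{
	return !atomic_flag_test_and_set_explicit(&folio->locked,
						  memory_order_acquire);
}

static void wb_folio_unlock(struct wb_folio *folio)
{
	atomic_flag_clear_explicit(&folio->locked, memory_order_release);
}

/*
 * Models a writeback-completion step that wants the folio lock.
 * Returns true if writeback was ended, false if the lock was busy and
 * the caller has to do something else (deliberately left open).
 */
static bool wb_end_io_trylock(struct wb_folio *folio)
{
	if (!wb_folio_trylock(folio))
		return false;           /* lock held elsewhere: do not sleep */

	/* ... work that needs the lock would go here ... */
	wb_folio_unlock(folio);
	folio->writeback = false;       /* models folio_end_writeback() */
	return true;
}
```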

-- 
Cheers

David / dhildenb

