lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <qte6322kbhn3xydiukyitgn73lbepaqlhqq43mdwhyycgdeuho@5b6wty5mcclt>
Date: Mon, 27 Oct 2025 10:40:04 +0000
From: Kiryl Shutsemau <kirill@...temov.name>
To: Hugh Dickins <hughd@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, 
	David Hildenbrand <david@...hat.com>, Matthew Wilcox <willy@...radead.org>, 
	Alexander Viro <viro@...iv.linux.org.uk>, Christian Brauner <brauner@...nel.org>, 
	Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, "Liam R. Howlett" <Liam.Howlett@...cle.com>, 
	Vlastimil Babka <vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>, 
	Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>, Rik van Riel <riel@...riel.com>, 
	Harry Yoo <harry.yoo@...cle.com>, Johannes Weiner <hannes@...xchg.org>, 
	Shakeel Butt <shakeel.butt@...ux.dev>, Baolin Wang <baolin.wang@...ux.alibaba.com>, 
	"Darrick J. Wong" <djwong@...nel.org>, Dave Chinner <david@...morbit.com>, linux-mm@...ck.org, 
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCHv2 2/2] mm/truncate: Unmap large folio on split failure

On Mon, Oct 27, 2025 at 03:10:29AM -0700, Hugh Dickins wrote:
> On Thu, 23 Oct 2025, Kiryl Shutsemau wrote:
> 
> > From: Kiryl Shutsemau <kas@...nel.org>
> > 
> > Accesses within VMA, but beyond i_size rounded up to PAGE_SIZE are
> > supposed to generate SIGBUS.
> > 
> > This behavior might not be respected on truncation.
> > 
> > During truncation, the kernel splits a large folio in order to reclaim
> > memory. As a side effect, it unmaps the folio and destroys PMD mappings
> > of the folio. The folio will be refaulted as PTEs and SIGBUS semantics
> > are preserved.
> > 
> > However, if the split fails, PMD mappings are preserved and the user
> > will not receive SIGBUS on any accesses within the PMD.
> > 
> > Unmap the folio on split failure. It will lead to refault as PTEs and
> > preserve SIGBUS semantics.
> > 
> > Signed-off-by: Kiryl Shutsemau <kas@...nel.org>
> 
> It's taking me too long to understand what truncate_inode_partial_folio()
> had become before your changes, to be very sure of your changes to it.
> 
> But if your commit does indeed achieve what's intended, then I have
> no objection to it applying to shmem/tmpfs as well as other filesystems:
> we always hope that a split will succeed, so I don't mind you tightening
> up what is done when it fails.
> 
> However, a few points that have occurred to me...
> 
> If 1/2's exception for shmem/tmpfs huge=always does the simple thing,
> of just judging by whether a huge page is already there in the file
> (without reference to mount option), which I think is okay: then
> this 2/2 will not be doing anything useful on shmem/tmpfs, because
> when the split fails, the huge page will remain, and after 2/2's
> unmap it will just get remapped by PMD again afterwards, so why
> bother to unmap at all in the shmem/tmpfs case?.
> 
> But it's arguable whether it would then be worth making an
> exception for shmem/tmpfs here in 2/2 - how much do we care about
> optimizing failed splits?  At least a comment I guess, but you
> might prefer to do it quite differently.

It is easy enough to skip unmap for shmem.

> Aside from shmem/tmpfs, it does seem to me that this patch is
> doing more work than it needs to (but how many lines of source
> do we want to add to avoid doing work in the failed split case?):
> 
> The intent is to enable SIGBUS beyond EOF: but the changes are
> being applied unnecessarily to hole-punch in addition to truncation.

I am not sure much it should apply to hole-punch. Filesystem folks talk
about writing to a folio beyond round_up(i_size, PAGE_SIZE) being
problematic for correctness. I have no clue if the same applies to
writing to hole-punched parts of the folio.

Dave, any comments?

Hm. But if it is problematic it has be caught on fault. We don't do
this. It will be silently mapped.

> Does the folio2 part actually need to unmap again?  And if it does,
> then what about when its trylock failed?  But it's hole-punch anyway.

I don't think we can do much for !trylock case, unless we a willing to
upgrade it to folio_lock(). try_to_unmap() requires the folio to be
locked or we will race with fault.

Maybe folio_lock() is not too bad here for freshly split folio.

> And a final nit: I'd delete that WARN_ON(folio_mapped(folio)) myself,
> all it could ever achieve is perhaps a very rare syzbot report which
> nobody would want to spend time on.

David asked for it. I am fine either way.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ