[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <qte6322kbhn3xydiukyitgn73lbepaqlhqq43mdwhyycgdeuho@5b6wty5mcclt>
Date: Mon, 27 Oct 2025 10:40:04 +0000
From: Kiryl Shutsemau <kirill@...temov.name>
To: Hugh Dickins <hughd@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
David Hildenbrand <david@...hat.com>, Matthew Wilcox <willy@...radead.org>,
Alexander Viro <viro@...iv.linux.org.uk>, Christian Brauner <brauner@...nel.org>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, "Liam R. Howlett" <Liam.Howlett@...cle.com>,
Vlastimil Babka <vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>, Rik van Riel <riel@...riel.com>,
Harry Yoo <harry.yoo@...cle.com>, Johannes Weiner <hannes@...xchg.org>,
Shakeel Butt <shakeel.butt@...ux.dev>, Baolin Wang <baolin.wang@...ux.alibaba.com>,
"Darrick J. Wong" <djwong@...nel.org>, Dave Chinner <david@...morbit.com>, linux-mm@...ck.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCHv2 2/2] mm/truncate: Unmap large folio on split failure
On Mon, Oct 27, 2025 at 03:10:29AM -0700, Hugh Dickins wrote:
> On Thu, 23 Oct 2025, Kiryl Shutsemau wrote:
>
> > From: Kiryl Shutsemau <kas@...nel.org>
> >
> > Accesses within VMA, but beyond i_size rounded up to PAGE_SIZE are
> > supposed to generate SIGBUS.
> >
> > This behavior might not be respected on truncation.
> >
> > During truncation, the kernel splits a large folio in order to reclaim
> > memory. As a side effect, it unmaps the folio and destroys PMD mappings
> > of the folio. The folio will be refaulted as PTEs and SIGBUS semantics
> > are preserved.
> >
> > However, if the split fails, PMD mappings are preserved and the user
> > will not receive SIGBUS on any accesses within the PMD.
> >
> > Unmap the folio on split failure. It will lead to refault as PTEs and
> > preserve SIGBUS semantics.
> >
> > Signed-off-by: Kiryl Shutsemau <kas@...nel.org>
>
> It's taking me too long to understand what truncate_inode_partial_folio()
> had become before your changes, to be very sure of your changes to it.
>
> But if your commit does indeed achieve what's intended, then I have
> no objection to it applying to shmem/tmpfs as well as other filesystems:
> we always hope that a split will succeed, so I don't mind you tightening
> up what is done when it fails.
>
> However, a few points that have occurred to me...
>
> If 1/2's exception for shmem/tmpfs huge=always does the simple thing,
> of just judging by whether a huge page is already there in the file
> (without reference to mount option), which I think is okay: then
> this 2/2 will not be doing anything useful on shmem/tmpfs, because
> when the split fails, the huge page will remain, and after 2/2's
> unmap it will just get remapped by PMD again afterwards, so why
> bother to unmap at all in the shmem/tmpfs case?.
>
> But it's arguable whether it would then be worth making an
> exception for shmem/tmpfs here in 2/2 - how much do we care about
> optimizing failed splits? At least a comment I guess, but you
> might prefer to do it quite differently.
It is easy enough to skip unmap for shmem.
> Aside from shmem/tmpfs, it does seem to me that this patch is
> doing more work than it needs to (but how many lines of source
> do we want to add to avoid doing work in the failed split case?):
>
> The intent is to enable SIGBUS beyond EOF: but the changes are
> being applied unnecessarily to hole-punch in addition to truncation.
I am not sure much it should apply to hole-punch. Filesystem folks talk
about writing to a folio beyond round_up(i_size, PAGE_SIZE) being
problematic for correctness. I have no clue if the same applies to
writing to hole-punched parts of the folio.
Dave, any comments?
Hm. But if it is problematic it has be caught on fault. We don't do
this. It will be silently mapped.
> Does the folio2 part actually need to unmap again? And if it does,
> then what about when its trylock failed? But it's hole-punch anyway.
I don't think we can do much for !trylock case, unless we a willing to
upgrade it to folio_lock(). try_to_unmap() requires the folio to be
locked or we will race with fault.
Maybe folio_lock() is not too bad here for freshly split folio.
> And a final nit: I'd delete that WARN_ON(folio_mapped(folio)) myself,
> all it could ever achieve is perhaps a very rare syzbot report which
> nobody would want to spend time on.
David asked for it. I am fine either way.
--
Kiryl Shutsemau / Kirill A. Shutemov
Powered by blists - more mailing lists