Message-ID: <aPoTw1qaEhU5CYmI@dread.disaster.area>
Date: Thu, 23 Oct 2025 22:38:43 +1100
From: Dave Chinner <david@...morbit.com>
To: Kiryl Shutsemau <kirill@...temov.name>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
David Hildenbrand <david@...hat.com>,
Hugh Dickins <hughd@...gle.com>,
Matthew Wilcox <willy@...radead.org>,
Alexander Viro <viro@...iv.linux.org.uk>,
Christian Brauner <brauner@...nel.org>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>,
Vlastimil Babka <vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>,
Michal Hocko <mhocko@...e.com>, Rik van Riel <riel@...riel.com>,
Harry Yoo <harry.yoo@...cle.com>,
Johannes Weiner <hannes@...xchg.org>,
Shakeel Butt <shakeel.butt@...ux.dev>,
Baolin Wang <baolin.wang@...ux.alibaba.com>,
"Darrick J. Wong" <djwong@...nel.org>, linux-mm@...ck.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC, PATCH 0/2] Large folios vs. SIGBUS semantics
On Tue, Oct 21, 2025 at 07:16:26AM +0100, Kiryl Shutsemau wrote:
> On Tue, Oct 21, 2025 at 10:28:02AM +1100, Dave Chinner wrote:
> > In critical paths like truncate, correctness and safety come first.
> > Performance is only a secondary consideration. The overlap of
> > mmap() and truncate() is an area where we have had many, many bugs
> > and, at minimum, the current POSIX behaviour largely shields us from
> > serious stale data exposure events when those bugs (inevitably)
> > occur.
>
> How do you prevent writes via GUP racing with truncate()?
>
> Something like this:
>
> CPU0                              CPU1
> fd = open("file")
> p = mmap(fd)
> whatever_syscall(p)
>   get_user_pages(p, &page)
>                                   truncate("file");
>   <write to page>
>   put_page(page);

Forget about truncate, go look at the comment above
writable_file_mapping_allowed() about using GUP this way.

i.e. file-backed mmap/GUP is a known broken anti-pattern. We've
spent the past 15+ years telling people that it is unfixably broken
and they will crash their kernel or corrupt their data if they do
this.

This is not supported functionality because real world production
use ends up exposing problems with sync and background writeback
races, truncate races, fallocate() races, writes into holes, writes
into preallocated regions, writes over shared extents that require
copy-on-write, etc, etc, ad nauseam.

If anyone is using file-backed mappings like this, then when it
breaks they get to keep all the broken pieces to themselves.

> The GUP can pin a page in the middle of a large folio well beyond the
> truncation point. The folio will not be split on truncation due to the
> elevated pin.
>
> I don't think this issue can be fundamentally fixed as long as we allow
> GUP for file-backed memory.
Yup, but that's the least of the problems with GUP on file-backed
pages...
> If the filesystem side cannot handle a non-zeroed tail of a large folio,
> this SIGBUS semantics only hides the issue instead of addressing it.
The objections raised have not related to whether a filesystem
"cannot handle" this case or not. The concerns are about a change of
behaviour in a well known, widely documented API, as well as the
significant increase in surface area of potential data exposure it
would enable should there be Yet Another Truncate Bug Again Once
More.
-Dave.
--
Dave Chinner
david@...morbit.com