Message-ID: <5f4c0a45-f219-4d95-b5d7-b4ca1bc9540b@redhat.com>
Date: Wed, 25 Jun 2025 10:47:49 +0200
From: David Hildenbrand <david@...hat.com>
To: linux-kernel@...r.kernel.org
Cc: linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
nvdimm@...ts.linux.dev, Andrew Morton <akpm@...ux-foundation.org>,
Juergen Gross <jgross@...e.com>, Stefano Stabellini
<sstabellini@...nel.org>,
Oleksandr Tyshchenko <oleksandr_tyshchenko@...m.com>,
Dan Williams <dan.j.williams@...el.com>, Alistair Popple
<apopple@...dia.com>, Matthew Wilcox <willy@...radead.org>,
Jan Kara <jack@...e.cz>, Alexander Viro <viro@...iv.linux.org.uk>,
Christian Brauner <brauner@...nel.org>, Zi Yan <ziy@...dia.com>,
Baolin Wang <baolin.wang@...ux.alibaba.com>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>, Nico Pache <npache@...hat.com>,
Ryan Roberts <ryan.roberts@....com>, Dev Jain <dev.jain@....com>,
Barry Song <baohua@...nel.org>, Vlastimil Babka <vbabka@...e.cz>,
Mike Rapoport <rppt@...nel.org>, Suren Baghdasaryan <surenb@...gle.com>,
Michal Hocko <mhocko@...e.com>, Jann Horn <jannh@...gle.com>,
Pedro Falcato <pfalcato@...e.de>
Subject: Re: [PATCH RFC 11/14] mm: remove "horrible special case to handle
copy-on-write behaviour"
On 17.06.25 17:43, David Hildenbrand wrote:
> Let's make the kernel a bit less horrible, by removing the
> linearity requirement in CoW PFNMAP mappings with
> !CONFIG_ARCH_HAS_PTE_SPECIAL. In particular, stop messing with
> vma->vm_pgoff in weird ways.
>
> Simply lookup in applicable (i.e., CoW PFNMAP) mappings whether we
> have an anon folio.
>
> Nobody should ever try mapping anon folios using PFNs, that just screams
> for other possible issues. To be sure, let's sanity-check when inserting
> PFNs. Are they really required? Probably not, but it's a good safety net
> at least for now.
>
> The runtime overhead should be limited: there is nothing to do for !CoW
> mappings (common case), and archs that care about performance
> (i.e., GUP-fast) should be supporting CONFIG_ARCH_HAS_PTE_SPECIAL
> either way.
>
> Likely the sanity checks added in mm/huge_memory.c are not required for
> now, because that code is probably only wired up with
> CONFIG_ARCH_HAS_PTE_SPECIAL, but this way is certainly cleaner and
> more consistent -- and doesn't really cost us anything in the cases we
> really care about.
>
> Signed-off-by: David Hildenbrand <david@...hat.com>
> ---
I'm still thinking about this patch, and will likely send out the
other patches first as a v1 and come back to this one later.

Really, the (nasty) corner case I ignored is someone mapping random
memory using /dev/mem and then getting anonymous memory in there.

There are rather nasty ways of trying to detect whether an anon folio
really fits into a VMA, but I'd like to avoid that.

What I am thinking about right now is that we could, for these special
architectures (i.e., !CONFIG_ARCH_HAS_PTE_SPECIAL), simply disallow CoW
faults on /dev/mem.

So we would still allow MAP_PRIVATE mappings (e.g., random app opening
/dev/mem using MAP_PRIVATE but never actually writing to that memory),
but the actual CoW faults would fail without pte_special().

Some more thinking to do ...
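
To make the idea above a bit more concrete, here is a tiny userspace
model of the policy in plain C. It is only a sketch and not kernel code:
struct mapping_model, mmap_allowed() and write_fault() are hypothetical
names made up for illustration, and the CoW check merely mirrors the
usual "private and may become writable" notion.

```c
#include <stdbool.h>

/* Hypothetical model of a mapping; this is NOT a kernel structure. */
struct mapping_model {
	bool is_pfnmap;		/* raw-PFN mapping, e.g., of /dev/mem */
	bool is_private;	/* created with MAP_PRIVATE */
	bool may_write;		/* mapping may be (or become) writable */
};

enum fault_result { FAULT_OK, FAULT_REFUSED };

/* A mapping is CoW if it is private and may be written to. */
static bool is_cow_mapping_model(const struct mapping_model *m)
{
	return m->is_private && m->may_write;
}

/* Creating the mapping stays allowed in every case, CoW or not. */
static bool mmap_allowed(const struct mapping_model *m)
{
	(void)m;
	return true;
}

/*
 * Only the write (CoW) fault is refused, and only for CoW PFNMAP
 * mappings on architectures without pte_special() support.
 */
static enum fault_result write_fault(const struct mapping_model *m,
				     bool arch_has_pte_special)
{
	if (m->is_pfnmap && is_cow_mapping_model(m) && !arch_has_pte_special)
		return FAULT_REFUSED;
	return FAULT_OK;
}
```

In this model, an app that maps /dev/mem with MAP_PRIVATE and only ever
reads never reaches write_fault(), so it keeps working. Only the first
write on an architecture without pte_special() would be refused.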
--
Cheers,
David / dhildenb