Message-ID: <49afa956-21e1-4b3d-9dde-82a6891f2902@redhat.com>
Date: Mon, 21 Oct 2024 18:44:04 +0200
From: David Hildenbrand <david@...hat.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: Vlastimil Babka <vbabka@...e.cz>,
Andrew Morton <akpm@...ux-foundation.org>,
Suren Baghdasaryan <surenb@...gle.com>,
"Liam R . Howlett" <Liam.Howlett@...cle.com>,
Matthew Wilcox <willy@...radead.org>, "Paul E . McKenney"
<paulmck@...nel.org>, Jann Horn <jannh@...gle.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Muchun Song <muchun.song@...ux.dev>,
Richard Henderson <richard.henderson@...aro.org>,
Ivan Kokshaysky <ink@...assic.park.msu.ru>, Matt Turner
<mattst88@...il.com>, Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
"James E . J . Bottomley" <James.Bottomley@...senpartnership.com>,
Helge Deller <deller@....de>, Chris Zankel <chris@...kel.net>,
Max Filippov <jcmvbkbc@...il.com>, Arnd Bergmann <arnd@...db.de>,
linux-alpha@...r.kernel.org, linux-mips@...r.kernel.org,
linux-parisc@...r.kernel.org, linux-arch@...r.kernel.org,
Shuah Khan <shuah@...nel.org>, Christian Brauner <brauner@...nel.org>,
linux-kselftest@...r.kernel.org, Sidhartha Kumar
<sidhartha.kumar@...cle.com>, Jeff Xu <jeffxu@...omium.org>,
Christoph Hellwig <hch@...radead.org>, linux-api@...r.kernel.org,
John Hubbard <jhubbard@...dia.com>
Subject: Re: [PATCH v2 2/5] mm: add PTE_MARKER_GUARD PTE marker
On 21.10.24 18:23, Lorenzo Stoakes wrote:
> On Mon, Oct 21, 2024 at 06:00:20PM +0200, David Hildenbrand wrote:
> [snip]
>>>
>>> To summarise for on-list:
>>>
>>> * MADV_FREE, while ostensibly being a 'lazy free' mechanism, has the
>>> ability to be 'cancelled' if you write to the memory. Also, after the
>>> freeing is complete, you can write to the memory to reuse it, as the mapping
>>> is still there.
>>>
>>> * For hardware poison markers it makes sense to drop them as you're
>>> effectively saying 'I am done with this range that is now unbacked and
>>> expect to get an empty page should I use it now'. UFFD WP I am not sure
>>> about but presumably also fine.
>>>
>>> * However, guard pages are different - if you 'cancel' and you are left
>>> with a block of memory allocated to you by a pthread or userland
>>> allocator implementation, you don't want to then no longer be protected
>>> from overrunning into other thread memory.
>>
>> Agreed. What happens on MADV_DONTNEED/MADV_FREE on guard pages? Ignored or
>> error? It sounds like a usage "error" to me (in contrast to munmap()).
>
> It's ignored, no error. On MADV_DONTNEED we already left the guard pages in
> place, from v3 we will do the same for MADV_FREE.
>
> I'm not sure I'd say it's an error per se, as somebody might have a use case
> where they want to zap over a range but keep guard pages, perhaps an allocator
> or something?
Hm, not sure I see use for that.
Staring at madvise_walk_vmas(), we return ENOMEM on VMA holes, but would
process PROT_NONE. So current behavior is at least consistent with
PROT_NONE handling (where something could be mapped, though).
No strong opinion.
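
For anyone who wants to check the madvise_walk_vmas() behaviour described above from userspace, here is a minimal sketch. It assumes a Linux system and private anonymous mappings; the helper names (map3, dontneed_errno, probe_vma_hole, probe_prot_none) are made up for illustration and are not part of the series. It only exercises VMA holes and PROT_NONE, not guard pages themselves.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map three anonymous pages, NULL on failure. */
static char *map3(size_t pg)
{
	void *p = mmap(NULL, 3 * pg, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/* Returns 0 on success, otherwise the errno set by madvise(). */
static int dontneed_errno(void *addr, size_t len)
{
	return madvise(addr, len, MADV_DONTNEED) == 0 ? 0 : errno;
}

/* Unmap the middle page, then MADV_DONTNEED across the whole range. */
static int probe_vma_hole(void)
{
	long pgl = sysconf(_SC_PAGESIZE);
	size_t pg;
	char *p;
	int ret;

	assert(pgl > 0);
	pg = (size_t)pgl;
	p = map3(pg);
	if (!p)
		return -1;
	if (munmap(p + pg, pg) != 0) {
		munmap(p, 3 * pg);
		return -1;
	}
	ret = dontneed_errno(p, 3 * pg);	/* range now spans a hole */
	munmap(p, pg);
	munmap(p + 2 * pg, pg);
	return ret;
}

/* Make the middle page PROT_NONE, then MADV_DONTNEED across the range. */
static int probe_prot_none(void)
{
	long pgl = sysconf(_SC_PAGESIZE);
	size_t pg;
	char *p;
	int ret;

	assert(pgl > 0);
	pg = (size_t)pgl;
	p = map3(pg);
	if (!p)
		return -1;
	if (mprotect(p + pg, pg, PROT_NONE) != 0) {
		munmap(p, 3 * pg);
		return -1;
	}
	ret = dontneed_errno(p, 3 * pg);	/* still mapped, just inaccessible */
	munmap(p, 3 * pg);
	return ret;
}
```

On a kernel behaving as described, probe_vma_hole() should return ENOMEM and probe_prot_none() should return 0; a return of -1 means the test setup itself failed.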
>
> Also the existing logic is that existing markers (HW poison, uffd-simulated HW
> poison, uffd wp marker) are retained and no error raised on MADV_DONTNEED, and
> no error on MADV_FREE either, so it'd be consistent with existing behaviour.
HW poison / uffd-simulated HW poison are expected to be zapped: it's
just like a mapped page with HWPOISON. So that is correct.
UFFD-WP behavior is ... weird. Would not expect MADV_DONTNEED to zap
uffd-wp entries.
>
> Also semantically you are achieving what the calls expect: you are freeing the
> ranges, and since the guard page regions are unbacked they are already freed...
> so yeah I don't think an error really makes sense here.
If you compare it to a VMA hole, it makes sense to fail. If we treat it
like PROT_NONE, it makes sense to skip them.
>
> We might also be limiting use cases by assuming they might _only_ be used for
> allocators and such.
I don't buy that as an argument, sorry :)
"Let's map the kernel writable into all user space because otherwise we
might be limiting use cases"
:P
--
Cheers,
David / dhildenb