Message-ID: <9b400450-46bc-41c7-9e89-825993851101@redhat.com>
Date: Wed, 10 Jul 2024 06:05:34 +0200
From: David Hildenbrand <david@...hat.com>
To: "Jason A. Donenfeld" <Jason@...c4.com>, linux-kernel@...r.kernel.org,
patches@...ts.linux.dev, tglx@...utronix.de
Cc: linux-crypto@...r.kernel.org, linux-api@...r.kernel.org, x86@...nel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Adhemerval Zanella Netto <adhemerval.zanella@...aro.org>,
Carlos O'Donell <carlos@...hat.com>, Florian Weimer <fweimer@...hat.com>,
Arnd Bergmann <arnd@...db.de>, Jann Horn <jannh@...gle.com>,
Christian Brauner <brauner@...nel.org>,
David Hildenbrand <dhildenb@...hat.com>, linux-mm@...ck.org
Subject: Re: [PATCH v22 1/4] mm: add MAP_DROPPABLE for designating always
lazily freeable mappings

On 10.07.24 05:27, David Hildenbrand wrote:
> On 09.07.24 15:05, Jason A. Donenfeld wrote:
>> The vDSO getrandom() implementation works with a buffer allocated with a
>> new system call that has certain requirements:
>>
>>   - It shouldn't be written to core dumps.
>>     * Easy: VM_DONTDUMP.
>>
>>   - It should be zeroed on fork.
>>     * Easy: VM_WIPEONFORK.
>>
>>   - It shouldn't be written to swap.
>>     * Uh-oh: mlock is rlimited.
>>     * Uh-oh: mlock isn't inherited by forks.
>>
>> It turns out that the vDSO getrandom() function has three really nice
>> characteristics that we can exploit to solve this problem:
>>
>> 1) Due to being wiped during fork(), the vDSO code is already robust to
>> having the contents of the pages it reads zeroed out midway through
>> the function's execution.
>>
>> 2) In the absolute worst case of whatever contingency we're coding for,
>> we have the option to fall back to the getrandom() syscall, and
>> everything is fine.
>>
>> 3) The buffers the function uses are only ever useful for a maximum of
>> 60 seconds -- a sort of cache, rather than a long term allocation.
>>
>> These characteristics mean that we can introduce VM_DROPPABLE, which
>> has the following semantics:
>>
>> a) It never is written out to swap.
>> b) Under memory pressure, mm can just drop the pages (so that they're
>> zero when read back again).
>> c) It is inherited by fork.
>> d) It doesn't count against the mlock budget, since nothing is locked.
>>
>> This is fairly simple to implement, with the one snag that we have to
>> use 64-bit VM_* flags, but this shouldn't be a problem, since the only
>> consumers will probably be 64-bit anyway.
>>
>> This way, allocations used by vDSO getrandom() can use:
>>
>> VM_DROPPABLE | VM_DONTDUMP | VM_WIPEONFORK | VM_NORESERVE
>>
>> And there will be no problem with using memory when not in use, not
>> wiping on fork(), coredumps, or writing out to swap.
>>
>> In order to let vDSO getrandom() use this, expose these via mmap(2) as
>> MAP_DROPPABLE.
>>
>> Finally, the provided self test ensures that this is working as desired.
>
> Acked-by: David Hildenbrand <david@...hat.com>
>
>
> I'll try to think of some corner cases we might be missing.

BTW, do we have to handle the folio_set_swapbacked() in sort_folio() as well?

	/* dirty lazyfree */
	if (type == LRU_GEN_FILE && folio_test_anon(folio) && folio_test_dirty(folio)) {
		success = lru_gen_del_folio(lruvec, folio, true);
		VM_WARN_ON_ONCE_FOLIO(!success, folio);
		folio_set_swapbacked(folio);
		lruvec_add_folio_tail(lruvec, folio);
		return true;
	}

Maybe more difficult because we don't have a VMA here ... hmm

IIUC, we have to make sure that no folio_set_swapbacked() would ever get
performed on these folios, correct?
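
For context, the userspace side of the semantics quoted above (pages may be
dropped under memory pressure and then read back as zero) might look roughly
like the sketch below. It is not taken from the patch or its self test:
alloc_cache() is a hypothetical helper name, and the fallback value 0x08 for
MAP_DROPPABLE is an assumption based on the uapi header of kernels that
carry the flag. On kernels without support the first mmap() is expected to
fail, so the helper falls back to an ordinary private anonymous mapping.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Assumption: older libc headers do not define the flag; 0x08 is the
 * value assumed here from the kernel uapi header. */
#ifndef MAP_DROPPABLE
#define MAP_DROPPABLE 0x08
#endif

/*
 * Hypothetical helper: try to obtain a droppable anonymous mapping and
 * fall back to a normal private anonymous mapping if the kernel rejects
 * the flag. *droppable is set to 1 or 0 to report which one was used.
 * Returns NULL if both attempts fail.
 */
static void *alloc_cache(size_t len, int *droppable)
{
	void *p;

	/* MAP_DROPPABLE is used in place of MAP_PRIVATE / MAP_SHARED. */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_ANONYMOUS | MAP_DROPPABLE, -1, 0);
	if (p != MAP_FAILED) {
		*droppable = 1;
		return p;
	}

	/* Flag not supported here: use an ordinary mapping instead. */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	*droppable = 0;
	return p == MAP_FAILED ? NULL : p;
}
```

A caller of such a mapping has to treat its contents purely as a cache:
any page can turn into zeroes between two reads, so the data needs a way to
be detected as missing and regenerated.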
--
Cheers,
David / dhildenb