lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c8272b9d-5c33-4b44-9d6d-1d25c7ac23dd@redhat.com>
Date: Mon, 21 Oct 2024 19:23:32 +0200
From: David Hildenbrand <david@...hat.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
 Suren Baghdasaryan <surenb@...gle.com>,
 "Liam R . Howlett" <Liam.Howlett@...cle.com>,
 Matthew Wilcox <willy@...radead.org>, Vlastimil Babka <vbabka@...e.cz>,
 "Paul E . McKenney" <paulmck@...nel.org>, Jann Horn <jannh@...gle.com>,
 linux-mm@...ck.org, linux-kernel@...r.kernel.org,
 Muchun Song <muchun.song@...ux.dev>,
 Richard Henderson <richard.henderson@...aro.org>,
 Ivan Kokshaysky <ink@...assic.park.msu.ru>, Matt Turner
 <mattst88@...il.com>, Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
 "James E . J . Bottomley" <James.Bottomley@...senpartnership.com>,
 Helge Deller <deller@....de>, Chris Zankel <chris@...kel.net>,
 Max Filippov <jcmvbkbc@...il.com>, Arnd Bergmann <arnd@...db.de>,
 linux-alpha@...r.kernel.org, linux-mips@...r.kernel.org,
 linux-parisc@...r.kernel.org, linux-arch@...r.kernel.org,
 Shuah Khan <shuah@...nel.org>, Christian Brauner <brauner@...nel.org>,
 linux-kselftest@...r.kernel.org, Sidhartha Kumar
 <sidhartha.kumar@...cle.com>, Jeff Xu <jeffxu@...omium.org>,
 Christoph Hellwig <hch@...radead.org>, linux-api@...r.kernel.org,
 John Hubbard <jhubbard@...dia.com>
Subject: Re: [PATCH v2 3/5] mm: madvise: implement lightweight guard page
 mechanism

On 21.10.24 19:15, Lorenzo Stoakes wrote:
> On Mon, Oct 21, 2024 at 07:05:27PM +0200, David Hildenbrand wrote:
>> On 20.10.24 18:20, Lorenzo Stoakes wrote:
>>> Implement a new lightweight guard page feature, that is regions of userland
>>> virtual memory that, when accessed, cause a fatal signal to arise.
>>>
>>> Currently users must establish PROT_NONE ranges to achieve this.
>>>
>>> However this is very costly memory-wise - we need a VMA for each and every
>>> one of these regions AND they become unmergeable with surrounding VMAs.
>>>
>>> In addition repeated mmap() calls require repeated kernel context switches
>>> and contention of the mmap lock to install these ranges, potentially also
>>> having to unmap memory if installed over existing ranges.
>>>
>>> The lightweight guard approach eliminates the VMA cost altogether - rather
>>> than establishing a PROT_NONE VMA, it operates at the level of page table
>>> entries - poisoning PTEs such that accesses to them cause a fault followed
>>> by a SIGSGEV signal being raised.
>>>
>>> This is achieved through the PTE marker mechanism, which a previous commit
>>> in this series extended to permit this to be done, installed via the
>>> generic page walking logic, also extended by a prior commit for this
>>> purpose.
>>>
>>> These poison ranges are established with MADV_GUARD_POISON, and if the
>>> range in which they are installed contain any existing mappings, they will
>>> be zapped, i.e. free the range and unmap memory (thus mimicking the
>>> behaviour of MADV_DONTNEED in this respect).
>>>
>>> Any existing poison entries will be left untouched. There is no nesting of
>>> poisoned pages.
>>>
>>> Poisoned ranges are NOT cleared by MADV_DONTNEED, as this would be rather
>>> unexpected behaviour, but are cleared on process teardown or unmapping of
>>> memory ranges.
>>>
>>> Ranges can have the poison property removed by MADV_GUARD_UNPOISON -
>>> 'remedying' the poisoning. The ranges over which this is applied, should
>>> they contain non-poison entries, will be untouched, only poison entries
>>> will be cleared.
>>>
>>> We permit this operation on anonymous memory only, and only VMAs which are
>>> non-special, non-huge and not mlock()'d (if we permitted this we'd have to
>>> drop locked pages which would be rather counterintuitive).
>>>
>>> Suggested-by: Vlastimil Babka <vbabka@...e.cz>
>>> Suggested-by: Jann Horn <jannh@...gle.com>
>>> Suggested-by: David Hildenbrand <david@...hat.com>
>>> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
>>> ---
>>>    arch/alpha/include/uapi/asm/mman.h     |   3 +
>>>    arch/mips/include/uapi/asm/mman.h      |   3 +
>>>    arch/parisc/include/uapi/asm/mman.h    |   3 +
>>>    arch/xtensa/include/uapi/asm/mman.h    |   3 +
>>>    include/uapi/asm-generic/mman-common.h |   3 +
>>>    mm/madvise.c                           | 168 +++++++++++++++++++++++++
>>>    mm/mprotect.c                          |   3 +-
>>>    mm/mseal.c                             |   1 +
>>>    8 files changed, 186 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/alpha/include/uapi/asm/mman.h b/arch/alpha/include/uapi/asm/mman.h
>>> index 763929e814e9..71e13f27742d 100644
>>> --- a/arch/alpha/include/uapi/asm/mman.h
>>> +++ b/arch/alpha/include/uapi/asm/mman.h
>>> @@ -78,6 +78,9 @@
>>>    #define MADV_COLLAPSE	25		/* Synchronous hugepage collapse */
>>> +#define MADV_GUARD_POISON 102		/* fatal signal on access to range */
>>> +#define MADV_GUARD_UNPOISON 103		/* revoke guard poisoning */
>>
>> Just to raise it here: MADV_GUARD_INSTALL / MADV_GUARD_REMOVE or sth. like
>> that would have been even clearer, at least to me.
> 
> :)
> 
> It still feels like poisoning to me because we're explicitly putting
> something in the page tables to make a range have different fault behaviour
> like a HW poisoning, and 'installing' suggests backing or something like
> this, I think that's more confusing.

I connect "poison" to "SIGBUS" and "corrupt memory state", not to "there 
is nothing and there must not be anything". Thus my thinking. But again, 
not the end of the world, just wanted to raise it ...

> 
>>
>> But no strong opinion, just if somebody else reading along was wondering
>> about the same.
>>
>>
>> I'm hoping to find more time to have a closer look at this this week, but in
>> general, the concept sounds reasonable to me.
> 
> Thanks!

Thank you for implementing this and up-streaming it!

-- 
Cheers,

David / dhildenb


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ