lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-ID: <CAKbZUD2A+=bp_sd+Q0Yif7NJqMu8p__eb4yguq0agEcmLH8SDQ@mail.gmail.com> Date: Tue, 17 Oct 2023 16:29:58 +0100 From: Pedro Falcato <pedro.falcato@...il.com> To: Matthew Wilcox <willy@...radead.org> Cc: jeffxu@...omium.org, akpm@...ux-foundation.org, keescook@...omium.org, sroettger@...gle.com, jeffxu@...gle.com, jorgelo@...omium.org, groeck@...omium.org, linux-kernel@...r.kernel.org, linux-kselftest@...r.kernel.org, linux-mm@...ck.org, jannh@...gle.com, surenb@...gle.com, alex.sierra@....com, apopple@...dia.com, aneesh.kumar@...ux.ibm.com, axelrasmussen@...gle.com, ben@...adent.org.uk, catalin.marinas@....com, david@...hat.com, dwmw@...zon.co.uk, ying.huang@...el.com, hughd@...gle.com, joey.gouly@....com, corbet@....net, wangkefeng.wang@...wei.com, Liam.Howlett@...cle.com, torvalds@...ux-foundation.org, lstoakes@...il.com, mawupeng1@...wei.com, linmiaohe@...wei.com, namit@...are.com, peterx@...hat.com, peterz@...radead.org, ryan.roberts@....com, shr@...kernel.io, vbabka@...e.cz, xiujianfeng@...wei.com, yu.ma@...el.com, zhangpeng362@...wei.com, dave.hansen@...el.com, luto@...nel.org, linux-hardening@...r.kernel.org Subject: Re: [RFC PATCH v1 0/8] Introduce mseal() syscall On Mon, Oct 16, 2023 at 4:18 PM Matthew Wilcox <willy@...radead.org> wrote: > > On Mon, Oct 16, 2023 at 02:38:19PM +0000, jeffxu@...omium.org wrote: > > Modern CPUs support memory permissions such as RW and NX bits. Linux has > > supported NX since the release of kernel version 2.6.8 in August 2004 [1]. > > This seems like a confusing way to introduce the subject. Here, you're > talking about page permissions, whereas (as far as I can tell), mseal() is > about making _virtual_ addresses immutable, for some value of immutable. > > > Memory sealing additionally protects the mapping itself against > > modifications. This is useful to mitigate memory corruption issues where > > a corrupted pointer is passed to a memory management syscall. For example, > > such an attacker primitive can break control-flow integrity guarantees > > since read-only memory that is supposed to be trusted can become writable > > or .text pages can get remapped. Memory sealing can automatically be > > applied by the runtime loader to seal .text and .rodata pages and > > applications can additionally seal security critical data at runtime. > > A similar feature already exists in the XNU kernel with the > > VM_FLAGS_PERMANENT [3] flag and on OpenBSD with the mimmutable syscall [4]. > > Also, Chrome wants to adopt this feature for their CFI work [2] and this > > patchset has been designed to be compatible with the Chrome use case. > > This [2] seems very generic and wide-ranging, not helpful. [5] was more > useful to understand what you're trying to do. > > > The new mseal() is an architecture independent syscall, and with > > following signature: > > > > mseal(void addr, size_t len, unsigned int types, unsigned int flags) > > > > addr/len: memory range. Must be continuous/allocated memory, or else > > mseal() will fail and no VMA is updated. For details on acceptable > > arguments, please refer to comments in mseal.c. Those are also fully > > covered by the selftest. > > Mmm. So when you say "continuous/allocated" what you really mean is > "Must have contiguous VMAs" rather than "All pages in this range must > be populated", yes? > > > types: bit mask to specify which syscall to seal, currently they are: > > MM_SEAL_MSEAL 0x1 > > MM_SEAL_MPROTECT 0x2 > > MM_SEAL_MUNMAP 0x4 > > MM_SEAL_MMAP 0x8 > > MM_SEAL_MREMAP 0x10 > > I don't understand why we want this level of granularity. The OpenBSD > and XNU examples just say "This must be immutable*". For values of > immutable that allow downgrading access (eg RW to RO or RX to RO), > but not upgrading access (RW->RX, RO->*, RX->RW). > > > Each bit represents sealing for one specific syscall type, e.g. > > MM_SEAL_MPROTECT will deny mprotect syscall. The consideration of bitmask > > is that the API is extendable, i.e. when needed, the sealing can be > > extended to madvise, mlock, etc. Backward compatibility is also easy. > > Honestly, it feels too flexible. Why not just two flags to mprotect() > -- PROT_IMMUTABLE and PROT_DOWNGRADABLE. I can see a use for that -- > maybe for some things we want to be able to downgrade and for other > things, we don't. I think it's worth pointing out that this suggestion (with PROT_*) could easily integrate with mmap() and as such allow for one-shot mmap() + mseal(). If we consider the common case as 'addr = mmap(...); mseal(addr);', it definitely sounds like a performance win as we halve the number of syscalls for a sealed mapping. And if we trivially look at e.g OpenBSD ld.so code, mmap() + mimmutable() and mprotect() + mimmutable() seem like common patterns. -- Pedro
Powered by blists - more mailing lists