Message-ID: <2a2becf1-fc19-a7da-deb7-1c12781d503d@gmail.com>
Date: Wed, 13 Apr 2022 21:39:37 +0300
From: Topi Miettinen <toiwoton@...il.com>
To: Catalin Marinas <catalin.marinas@....com>,
Andrew Morton <akpm@...ux-foundation.org>,
Christoph Hellwig <hch@...radead.org>,
Lennart Poettering <lennart@...ttering.net>,
Zbigniew Jędrzejewski-Szmek <zbyszek@...waw.pl>
Cc: Will Deacon <will@...nel.org>,
Alexander Viro <viro@...iv.linux.org.uk>,
Eric Biederman <ebiederm@...ssion.com>,
Kees Cook <keescook@...omium.org>,
Szabolcs Nagy <szabolcs.nagy@....com>,
Mark Brown <broonie@...nel.org>,
Jeremy Linton <jeremy.linton@....com>, linux-mm@...ck.org,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
linux-abi-devel@...ts.sourceforge.net
Subject: Re: [PATCH RFC 0/4] mm, arm64: In-kernel support for
memory-deny-write-execute (MDWE)
On 13.4.2022 16.49, Catalin Marinas wrote:
> Hi,
>
> The background to this is that systemd has a configuration option called
> MemoryDenyWriteExecute [1], implemented as a SECCOMP BPF filter. Its aim
> is to prevent a user task from inadvertently creating an executable
> mapping that is (or was) writeable. Since such BPF filter is stateless,
> it cannot detect mappings that were previously writeable but
> subsequently changed to read-only. Therefore the filter simply rejects
> any mprotect(PROT_EXEC). The side-effect is that on arm64 with BTI
> support (Branch Target Identification), the dynamic loader cannot change
> an ELF section from PROT_EXEC to PROT_EXEC|PROT_BTI using mprotect().
> For libraries, it can resort to unmapping and re-mapping but for the
> main executable it does not have a file descriptor. The original bug
> report in the Red Hat bugzilla - [2] - and subsequent glibc workaround
> for libraries - [3].
>
> Add in-kernel support for such feature as a DENY_WRITE_EXEC personality
> flag, inherited on fork() and execve(). The kernel tracks a previously
> writeable mapping via a new VM_WAS_WRITE flag (64-bit only
> architectures). I went for a personality flag by analogy with the
> READ_IMPLIES_EXEC one. However, I'm happy to change it to a prctl() if
> we don't want more personality flags. A minor downside with the
> personality flag is that there is no way for the user to query which
> flags are supported, so in patch 3 I added an AT_FLAGS bit to advertise
> this.
With systemd there's a BPF construct to block personality changes
(LockPersonality=yes), but I think a prctl() would be easier to lock
down irrevocably.
Requiring or implying NoNewPrivileges could prevent nasty surprises from
set-uid Python programs that happen to use FFI, since the flag would be
inherited across execve().
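To illustrate what I mean by "lock down irrevocably", here is a minimal sketch using the existing PR_SET_NO_NEW_PRIVS prctl, which is already one-way. This is not the MDWE interface from the RFC; the helper names lock_no_new_privs() and try_clear_no_new_privs() are made up for the example.

```c
#include <assert.h>
#include <sys/prctl.h>

/*
 * Set the no_new_privs bit for the calling thread.
 * Returns 0 on success, -1 if the kernel refused the request.
 */
int lock_no_new_privs(void)
{
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    /* Once set, the bit reads back as 1. */
    assert(prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0) == 1);
    return 0;
}

/*
 * Try to clear the bit again. The kernel only accepts the value 1 for
 * PR_SET_NO_NEW_PRIVS, so this is expected to fail with EINVAL.
 */
int try_clear_no_new_privs(void)
{
    return prctl(PR_SET_NO_NEW_PRIVS, 0, 0, 0, 0);
}
```

A prctl()-based MDWE switch could follow the same pattern: one call to enable, no call to disable, and no separate filter needed to stop the task from undoing it.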
> Posting this as an RFC to start a discussion and cc'ing some of the
> systemd guys and those involved in the earlier thread around the glibc
> workaround for dynamic libraries [4]. Before thinking of upstreaming
> this we'd need the systemd folk to buy into replacing the MDWE SECCOMP
> BPF filter with the in-kernel one.
As the author of this feature in systemd (and of a similar feature in
Firejail), I'd strongly prefer an in-kernel version to the BPF
protection. I'd definitely also want to use it in place of BPF on x86_64
and other arches.
An in-kernel version would probably make it fairly easy to cover this
case (maybe it already does):
    fd = memfd_create(...);
    write(fd, malicious_code, sizeof(malicious_code));
    mmap(..., PROT_EXEC, ..., fd);
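For anyone who wants to probe this, a fuller sketch of the sequence might look like the following. The function name try_exec_map_memfd(), the memfd name "mdwe-probe" and the all-zero placeholder payload are my own choices for illustration; the payload is only mapped, never executed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write bytes into a memfd, then ask for an executable mapping of them.
 * Returns 1 if the executable mapping was created (the policy was
 * bypassed), 0 if mmap() refused it, -1 if the setup itself failed.
 */
int try_exec_map_memfd(void)
{
    /* Placeholder standing in for malicious_code; it is never run. */
    static const unsigned char payload[4] = { 0, 0, 0, 0 };

    /* The sketch assumes the payload fits in a single page. */
    assert(sizeof(payload) <= (size_t)sysconf(_SC_PAGESIZE));

    int fd = memfd_create("mdwe-probe", MFD_CLOEXEC);
    if (fd < 0)
        return -1;

    if (write(fd, payload, sizeof(payload)) != (ssize_t)sizeof(payload)) {
        close(fd);
        return -1;
    }

    /* The mapping itself is never writable; the data got in via write(). */
    void *p = mmap(NULL, sizeof(payload), PROT_READ | PROT_EXEC,
                   MAP_PRIVATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return 0;

    munmap(p, sizeof(payload));
    return 1;
}
```

Because the memory is filled with write() rather than through a writable mapping, a check that only looks at the mapping's own protection history would presumably not catch this, which is why I bring it up.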
Other memory W^X implementations include S.A.R.A [1] and SELinux
EXECMEM/EXECSTACK/EXECHEAP protections [2], [3]. SELinux checks
IS_PRIVATE(file_inode(file)) and vma->anon_vma != NULL, which might be
useful additions here too (or future extensions if you prefer).
-Topi
[1] https://smeso.it/sara/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/security/selinux/hooks.c#n3708
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/security/selinux/hooks.c#n3787