Message-ID: <45c64b49-a38b-4b0c-d9cf-6c586dacbcc9@arm.com>
Date: Mon, 26 Oct 2020 17:39:42 -0500
From: Jeremy Linton <jeremy.linton@....com>
To: Dave Martin <Dave.Martin@....com>,
Szabolcs Nagy <szabolcs.nagy@....com>
Cc: Mark Rutland <mark.rutland@....com>,
systemd-devel@...ts.freedesktop.org,
Kees Cook <keescook@...omium.org>,
Catalin Marinas <Catalin.Marinas@....com>,
Will Deacon <will.deacon@....com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Mark Brown <broonie@...nel.org>, toiwoton@...il.com,
libc-alpha@...rceware.org,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>
Subject: Re: BTI interaction between seccomp filters in systemd and glibc
mprotect calls, causing service failures
Hi,
On 10/26/20 12:52 PM, Dave Martin wrote:
> On Mon, Oct 26, 2020 at 04:57:55PM +0000, Szabolcs Nagy via Libc-alpha wrote:
>> The 10/26/2020 16:24, Dave Martin via Libc-alpha wrote:
>>> Unrolling this discussion a bit, this problem comes from a few sources:
>>>
>>> 1) systemd is trying to implement a policy that doesn't fit SECCOMP
>>> syscall filtering very well.
>>>
>>> 2) The program is trying to do something not expressible through the
>>> syscall interface: really the intent is to set PROT_BTI on the page,
>>> with no intent to set PROT_EXEC on any page that didn't already have it
>>> set.
>>>
>>>
>>> This limitation of mprotect() was known when I originally added PROT_BTI,
>>> but at that time we weren't aware of a clear use case that would fail.
>>>
>>>
>>> Would it now help to add something like:
>>>
>>> int mchangeprot(void *addr, size_t len, int old_flags, int new_flags)
>>> {
>>> int ret = -EINVAL;
>>> mmap_write_lock(current->mm);
>>> if (all vmas in [addr .. addr + len) have
>>> their mprotect flags set to old_flags) {
>>>
>>> ret = mprotect(addr, len, new_flags);
>>> }
>>>
>>> mmap_write_unlock(current->mm);
>>> return ret;
>>> }
>>
>> if more prot flags are introduced then the exact
>> match for old_flags may be restrictive and currently
>> there is no way to query these flags to figure out
>> how to toggle one prot flag in a future proof way,
>> so i don't think this solves the issue completely.
>
> Ack -- I illustrated this model because it makes the seccomp filter's
> job easy, but it does have limitations.
>
>> i think we might need a new api, given that aarch64
>> now has PROT_BTI and PROT_MTE while existing code
>> expects RWX only, but i don't know what api is best.
>
> An alternative option would be a call that sets / clears chosen
> flags and leaves others unchanged.
I tend to favor a set/clear API, but the same effect could be had by
adding a new PROT_BTI_IF_X flag that enables BTI only on areas that are
already PROT_EXEC. An mprotect() call using it never names PROT_EXEC, so
it passes the seccomp filters too, and it is actually closer to what
glibc wants to do anyway.
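
To make the difference between the two models concrete, here is a rough
userspace sketch of the flag arithmetic only. It is not kernel code and
not a proposed interface: PROT_BTI_IF_X does not exist, its value below
is made up, and the helper names are hypothetical.

```c
#include <assert.h>
#include <sys/mman.h>

#ifndef PROT_BTI
#define PROT_BTI 0x10        /* arm64 uapi value, for non-arm64 builds */
#endif

/* Hypothetical flag from this thread; the value is invented here. */
#define PROT_BTI_IF_X 0x40

/* set/clear model: touch only the named bits, keep everything else. */
int prot_set_clear(int old_prot, int set, int clear)
{
	assert((set & clear) == 0);  /* a bit cannot be both set and cleared */
	return (old_prot | set) & ~clear;
}

/* PROT_BTI_IF_X model: add BTI only if the area is already executable. */
int prot_bti_if_exec(int old_prot)
{
	if (old_prot & PROT_EXEC)
		return old_prot | PROT_BTI;
	return old_prot;             /* non-exec areas are left untouched */
}

/* What an argument-only filter can see: does the request name PROT_EXEC? */
int request_names_exec(int requested_prot)
{
	return (requested_prot & PROT_EXEC) != 0;
}
```

With today's interface glibc has to pass PROT_READ | PROT_EXEC | PROT_BTI,
so request_names_exec() is true and a filter that rejects PROT_EXEC
rejects the call. In either model above the request would carry only
PROT_BTI (or PROT_BTI_IF_X), and the result on an r-x area is the same.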
>
> The trouble with that is that the MDWX policy then becomes hard to
> implement again.
>
>
> But policies might be best set via another route, such as a prctl,
> rather than being implemented completely in a seccomp filter.
>
> Cheers
> ---Dave
>