lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 23 May 2023 13:53:09 +0300
From:   Alexey Izbyshev <izbyshev@...ras.ru>
To:     David Hildenbrand <david@...hat.com>
Cc:     Florent Revest <revest@...omium.org>, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, akpm@...ux-foundation.org,
        catalin.marinas@....com, anshuman.khandual@....com,
        joey.gouly@....com, mhocko@...e.com, keescook@...omium.org,
        peterx@...hat.com, broonie@...nel.org, szabolcs.nagy@....com,
        kpsingh@...nel.org, gthelen@...gle.com, toiwoton@...il.com
Subject: Re: [PATCH v2 3/5] mm: Make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long

On 2023-05-23 12:12, David Hildenbrand wrote:
> On 22.05.23 20:58, Alexey Izbyshev wrote:
>> On 2023-05-22 19:22, David Hildenbrand wrote:
>>> On 22.05.23 12:35, Alexey Izbyshev wrote:
>>>> On 2023-05-22 11:55, David Hildenbrand wrote:
>>>>> On 17.05.23 17:03, Florent Revest wrote:
>>>>>> Alexey pointed out that defining a prctl flag as an int is a 
>>>>>> footgun
>>>>>> because, under some circumstances, when used as a flag to prctl, 
>>>>>> it
>>>>>> can
>>>>>> be casted to long with garbage upper bits which would result in
>>>>>> unexpected behaviors.
>>>>>> 
>>>>>> This patch changes the constant to a UL to eliminate these
>>>>>> possibilities.
>>>>>> 
>>>>>> Signed-off-by: Florent Revest <revest@...omium.org>
>>>>>> Suggested-by: Alexey Izbyshev <izbyshev@...ras.ru>
>>>>>> ---
>>>>>>     include/uapi/linux/prctl.h       | 2 +-
>>>>>>     tools/include/uapi/linux/prctl.h | 2 +-
>>>>>>     2 files changed, 2 insertions(+), 2 deletions(-)
>>>>>> 
>>>>>> diff --git a/include/uapi/linux/prctl.h 
>>>>>> b/include/uapi/linux/prctl.h
>>>>>> index f23d9a16507f..6e9af6cbc950 100644
>>>>>> --- a/include/uapi/linux/prctl.h
>>>>>> +++ b/include/uapi/linux/prctl.h
>>>>>> @@ -283,7 +283,7 @@ struct prctl_mm_map {
>>>>>>       /* Memory deny write / execute */
>>>>>>     #define PR_SET_MDWE			65
>>>>>> -# define PR_MDWE_REFUSE_EXEC_GAIN	1
>>>>>> +# define PR_MDWE_REFUSE_EXEC_GAIN	(1UL << 0)
>>>>>>       #define PR_GET_MDWE			66
>>>>>>     diff --git a/tools/include/uapi/linux/prctl.h
>>>>>> b/tools/include/uapi/linux/prctl.h
>>>>>> index 759b3f53e53f..6e6563e97fef 100644
>>>>>> --- a/tools/include/uapi/linux/prctl.h
>>>>>> +++ b/tools/include/uapi/linux/prctl.h
>>>>>> @@ -283,7 +283,7 @@ struct prctl_mm_map {
>>>>>>       /* Memory deny write / execute */
>>>>>>     #define PR_SET_MDWE			65
>>>>>> -# define PR_MDWE_REFUSE_EXEC_GAIN	1
>>>>>> +# define PR_MDWE_REFUSE_EXEC_GAIN	(1UL << 0)
>>>>>>       #define PR_GET_MDWE			66
>>>>>> 
>>>>> 
>>>>> Both are changing existing uapi, so you'll already have existing 
>>>>> user
>>>>> space using the old values, that your kernel code has to deal with
>>>>> no?
>>>> 
>>>> I'm the one who suggested this change, so I feel the need to 
>>>> clarify.
>>>> 
>>>> For any existing 64-bit user space code using the kernel and the 
>>>> uapi
>>>> headers before this patch and doing the wrong prctl(PR_SET_MDWE,
>>>> PR_MDWE_REFUSE_EXEC_GAIN) call instead of the correct
>>>> prctl(PR_SET_MDWE,
>>>> (unsigned long)PR_MDWE_REFUSE_EXEC_GAIN), there are two 
>>>> possibilities
>>>> when prctl() implementation extracts the second argument via
>>>> va_arg(op,
>>>> unsigned long):
>>>> 
>>>> * It gets lucky, and the upper 32 bits of the argument are zero. The
>>>> call does what is expected by the user.
>>>> 
>>>> * The upper 32 bits are non-zero junk. The flags argument is 
>>>> rejected
>>>> by
>>>> the kernel, and the call fails with EINVAL (unexpectedly for the
>>>> user).
>>>> 
>>>> This change is intended to affect only the second case, and only 
>>>> after
>>>> the program is recompiled with the new uapi headers. The currently
>>>> wrong, but naturally-looking prctl(PR_SET_MDWE,
>>>> PR_MDWE_REFUSE_EXEC_GAIN) call becomes correct.
>>>> 
>>>> The kernel ABI is unaffected by this change, since it has been 
>>>> defined
>>>> in terms of unsigned long from the start.
>>> 
>>> The thing I'm concerned about is the following: old user space (that
>>> would fail) on new kernel where we define some upper 32bit to 
>>> actually
>>> have a meaning (where it would succeed with wrong semantics).
>>> 
>>> IOW, can we ever really "use" these upper 32bit, or should we instead
>>> only consume the lower 32bit in the kernel and effectively ignore the
>>> upper 32bit?
>>> 
>> I see, thanks. But I think this question is mostly independent from 
>> this
>> patch. The patch removes a footgun, but it doesn't change the flags
>> check in the kernel, and nothing stops the user from doing
>> 
>> int flags = PR_MDWE_REFUSE_EXEC_GAIN;
>> prctl(PR_SET_MDWE, flags);
>> 
>> So we have to decide whether to ignore the upper 32 bits or not even 
>> if
>> this patch is not applied (actually *had to* when PR_SET_MDWE op was
>> being added).
> 
> Well, an alternative to this patch would be to say "well, for this
> prctl we ignore any upper 32bit. Then, this change would not be
> needed. Yes, I also don't like that :)
> 
> Bu unrelated, I looked at some other random prctl.
> 
> #define SUID_DUMP_USER           1
> 
> And in kernel/sys.c:
> 
> 	case PR_SET_DUMPABLE:
> 		if (arg2 != SUID_DUMP_DISABLE && arg2 != SUID_DUMP_USER)
> 	...
> 
> Wouldn't that also suffer from the same issue, or how is this 
> different?
> 
Yes, it is the same issue, so e.g. prctl(PR_SET_DUMPABLE, 
SUID_DUMP_DISABLE ) may wrongly fail with EINVAL on 64-bit targets.

> Also, how is passing "0"s to e.g., PR_GET_THP_DISABLE reliable? We
> need arg2 -> arg5 to be 0. But wouldn't the following also just pass a
> 0 "int" ?
> 
> prctl(PR_GET_THP_DISABLE, 0, 0, 0, 0)
> 
Yes, this is not reliable on 64-bit targets too. The simplest fix is to 
use "0L", as done in MDWE self-tests (but many other tests get this 
wrong).

Florent also expressed surprise[1] that we don't see a lot of failures 
due to such issues, and I tried to provide some reasons. To elaborate on 
the x86-64 thing, for prctl(PR_SET_DUMPABLE, 0) the compiler will likely 
generate "xorl %esi, %esi" to pass zero, but this instruction will also 
clear the upper 32 bits of %rsi, so the problem is masked (and I believe 
CPU vendors are motivated to do such zeroing to reduce false 
dependencies). But this zeroing is not required by the ABI, so in a more 
complex situation junk might get through.

Real-world examples of very similar breakage in variadic functions 
involving NULL sentinels are mentioned in [2] (the musl bug report is 
[3]). In short, musl defined NULL as plain 0 for C++, so when people do 
e.g. execl("/bin/true", "true", NULL), junk might prevent detection of 
the sentinel in execl() impl. (Though if the sentinel is passed via 
stack because there are a lot of preceding arguments, the breakage 
becomes more apparent because auto-zeroing of registers doesn't come 
into play anymore.)

> 
> I'm easily confused by such (va_args) things, so sorry for the dummy 
> questions.

This stuff *is* confusing, and note that Linux man pages don't even tell 
that prctl() is actually declared as a variadic function (and for 
ptrace() this is mentioned only in the notes, but not in its signature).

Thanks,
Alexey

[1] 
https://lore.kernel.org/lkml/3a38319a3b241e578729ffa5484ad24b@ispras.ru
[2] https://ewontfix.com/11
[3] https://www.openwall.com/lists/musl/2013/01/09/1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ