Message-ID: <a91012e9cde9046d98713835476cab4b@ispras.ru>
Date: Tue, 23 May 2023 17:46:52 +0300
From: Alexey Izbyshev <izbyshev@...ras.ru>
To: Catalin Marinas <catalin.marinas@....com>
Cc: David Hildenbrand <david@...hat.com>,
Florent Revest <revest@...omium.org>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
akpm@...ux-foundation.org, anshuman.khandual@....com,
joey.gouly@....com, mhocko@...e.com, keescook@...omium.org,
peterx@...hat.com, broonie@...nel.org, szabolcs.nagy@....com,
kpsingh@...nel.org, gthelen@...gle.com, toiwoton@...il.com
Subject: Re: [PATCH v2 3/5] mm: Make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long
On 2023-05-23 17:09, Catalin Marinas wrote:
> On Tue, May 23, 2023 at 04:25:45PM +0300, Alexey Izbyshev wrote:
>> On 2023-05-23 16:07, Catalin Marinas wrote:
>> > On Tue, May 23, 2023 at 11:12:37AM +0200, David Hildenbrand wrote:
>> > > Also, how is passing "0"s to e.g., PR_GET_THP_DISABLE reliable? We
>> > > need arg2 -> arg5 to be 0. But wouldn't the following also just pass
>> > > a 0 "int" ?
>> > >
>> > > prctl(PR_GET_THP_DISABLE, 0, 0, 0, 0)
>> > >
>> > > I'm easily confused by such (va_args) things, so sorry for the dummy
>> > > questions.
>> >
>> > Isn't the prctl() prototype in the user headers defined with the first
>> > argument as int while the rest as unsigned long? At least from the man
>> > page:
>> >
>> > int prctl(int option, unsigned long arg2, unsigned long arg3,
>> > unsigned long arg4, unsigned long arg5);
>> >
>> > So there are no va_args tricks (which confuse me as well).
>> >
>> I have explicitly mentioned the problem with man pages in my response
>> to David[1]. Quoting myself:
>>
>> > This stuff *is* confusing, and note that Linux man pages don't even
>> > tell that prctl() is actually declared as a variadic function (and
>> > for ptrace() this is mentioned only in the notes, but not in its
>> > signature).
>
> Ah, thanks for the clarification (I somehow missed your reply).
>
>> The reality:
>>
>> * glibc:
>> https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/sys/prctl.h;h=821aeefc1339b35210e8918ecfe9833ed2792626;hb=glibc-2.37#l42
>>
>> * musl:
>> https://git.musl-libc.org/cgit/musl/tree/include/sys/prctl.h?h=v1.2.4#n180
>>
>> Though there is a test in the kernel that does define its own
>> prototype, avoiding the issue:
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/sched/cs_prctl_test.c?h=v6.3#n77
>
> At least for glibc, it seems that there is a conversion to unsigned
> long:
>
> https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/prctl.c#l28
>
> unsigned long int arg2 = va_arg (arg, unsigned long int);
>
> (does va_arg expand to an actual cast?)
>
No, this is not a conversion or a cast in the sense that I think you mean
it. What happens in the situation discussed in this thread is the
following (assuming the argument is passed via a register, which is
typical for initial variadic arguments on 64-bit targets):
* User calls prctl(op, 0) on a 64-bit target.
* The second argument is an int.
* The compiler generates code to pass an int (32 bits) via a 64-bit
register. The compiler is NOT required to clear the upper 32 bits of the
register, so they might contain arbitrary junk in the general case.
* The prctl() implementation calls va_arg(arg, unsigned long) (as in
your quote).
* The compiler extracts the full 64-bit value of the same register
(which in our case might contain junk in the upper 32 bits).
* This extracted 64-bit value is then passed to the system call.
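The steps above can be sketched in C as follows. This is only an illustration: fake_prctl() is a hypothetical stand-in for a libc-style variadic wrapper (it is not the glibc or musl code), and it reads its second argument the same way as the va_arg() line quoted above. The buggy call is kept as a comment, because passing an int and reading an unsigned long is undefined behavior and its result cannot be shown reliably.

```c
#include <stdarg.h>

/*
 * Hypothetical stand-in for a variadic prctl() wrapper. It only returns
 * what it read, so the effect of the argument type is easy to observe.
 */
static unsigned long fake_prctl(int option, ...)
{
	va_list ap;
	unsigned long arg2;

	(void)option;
	va_start(ap, option);
	/* Reads a full unsigned long, whatever the caller actually passed. */
	arg2 = va_arg(ap, unsigned long);
	va_end(ap);
	return arg2;
}

/* OK: the argument has type unsigned long, so every bit of it is defined. */
static unsigned long call_with_ul(void)
{
	return fake_prctl(1, 0UL);
}

/*
 * Buggy pattern (left as a comment on purpose, it is undefined behavior):
 *
 *	fake_prctl(1, 0);	// int passed, unsigned long read
 */
```

Writing the constant as 0UL (or casting to unsigned long) at the call site is what makes the value read by va_arg() well-defined; a wrapper with a fixed, non-variadic prototype would achieve the same thing by converting the arguments implicitly.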
So...
> If the libc passes a 32-bit to a kernel ABI that expects 64-bit, I
> think it's a user-space bug and not a kernel ABI issue.
... the problem happens not at the user/kernel boundary, but in the prctl()
call/implementation in user space. But yes, it's still a user-space bug
and not a kernel ABI issue. David's question, as I understand it, was
whether we want such buggy code, which happens to pass junk, to keep
failing with EINVAL in future kernels. If we do want it to keep failing,
we can never assign any meaning to the upper 32 bits of the second
prctl() argument for the PR_SET_MDWE op.
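For illustration, the kind of check that makes such calls fail can be modeled in user space roughly like this. It is a simplified sketch and not the kernel code: the names are hypothetical, and the value 1UL used for the flag is an assumption standing in for PR_MDWE_REFUSE_EXEC_GAIN.

```c
#include <errno.h>

/* Assumed value, standing in for PR_MDWE_REFUSE_EXEC_GAIN. */
#define MODEL_MDWE_REFUSE_EXEC_GAIN 1UL

/* Hypothetical model: reject any bit that has no assigned meaning. */
static int model_mdwe_check(unsigned long bits)
{
	if (bits & ~MODEL_MDWE_REFUSE_EXEC_GAIN)
		return -EINVAL;
	return 0;
}
```

In this model, a value whose low bits hold a valid flag but whose upper bits hold junk is rejected. Giving those upper bits a meaning later would change the outcome for exactly the callers that pass junk today.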
Thanks,
Alexey