Message-Id: <091174F9-F6E4-468E-83F5-93706D83F9D2@amacapital.net>
Date: Mon, 4 Jan 2021 15:04:01 -0800
From: Andy Lutomirski <luto@...capital.net>
To: David Laight <David.Laight@...lab.com>
Cc: "Eric W. Biederman" <ebiederm@...ssion.com>,
Al Viro <viro@...iv.linux.org.uk>,
Christoph Hellwig <hch@....de>, linux-kernel@...r.kernel.org,
X86 ML <x86@...nel.org>
Subject: Re: in_compat_syscall() on x86

> On Jan 4, 2021, at 2:36 PM, David Laight <David.Laight@...lab.com> wrote:
>
> From: Eric W. Biederman
>> Sent: 04 January 2021 20:41
>>
>> Al Viro <viro@...iv.linux.org.uk> writes:
>>
>>> On Mon, Jan 04, 2021 at 12:16:56PM +0000, David Laight wrote:
>>>> On x86 in_compat_syscall() is defined as:
>>>> in_ia32_syscall() || in_x32_syscall()
>>>>
>>>> Now in_ia32_syscall() is a simple check of the TS_COMPAT flag.
>>>> However in_x32_syscall() is a horrid beast that has to indirect
>>>> through to the original %eax value (i.e. the syscall number) and
>>>> check for a bit there.
>>>>
>>>> So on a kernel with x32 support (probably most distro kernels)
>>>> the in_compat_syscall() check is rather more expensive than
>>>> one might expect.
>>
>> I suggest you check the distro kernels. I suspect they don't compile in
>> support for x32. As far as I can tell x32 is an undead beast of a
>> subarchitecture that just enough people use that it can't be removed,
>> but few enough people use that it likely has a few scary bugs lurking.
>
> It is defined in the Ubuntu kernel configs I've got lurking:
> Both 3.8.0-19_generic (Ubuntu 13.04) and 5.4.0-56_generic (probably 20.04).
> Which is probably why it is in my test builds (I've just cut out
> a lot of modules).
>
>>>> It would be much better if both checks could be done together.
>>>> I think this would require the syscall entry code to set a
>>>> value in both the 64bit and x32 entry paths.
>>>> (Can a process make both 64bit and x32 system calls?)
>>>
>>> Yes, it bloody well can.
>>>
>>> And I see no benefit in pushing that logic into syscall entry,
>>> since anything that calls in_compat_syscall() more than once
>>> per syscall execution is doing the wrong thing. Moreover,
>>> in quite a few cases we don't call the sucker at all, and for
>>> all of those pushing that crap into syscall entry logic is
>>> pure loss.
>>
>> The x32 system calls have their own system call table and it would be
>> trivial to set a flag like TS_COMPAT when looking up a system call from
>> that table. I expect such a change would be purely in the noise.
>
> Certainly a write of 0/1/2 into a dirtied cache line of 'current'
> could easily cost absolutely nothing.
> Especially if current has already been read.
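For readers following the proposal, here is a minimal user-space sketch of the "write 0/1/2 at syscall entry" idea. It is not kernel code and not anyone's actual patch. The `tagged_*` names, the `syscall_abi` field, and the mapping of 0/1/2 to 64-bit/ia32/x32 are all assumptions made for illustration, since the thread does not say which value means what.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical encoding of the proposed 0/1/2 value (assumed mapping). */
enum tagged_abi {
    TAGGED_ABI_64   = 0,  /* native 64-bit syscall */
    TAGGED_ABI_IA32 = 1,  /* 32-bit (ia32) syscall */
    TAGGED_ABI_X32  = 2   /* x32 syscall */
};

/* Hypothetical stand-in for a field in the per-task 'current' state. */
struct tagged_task {
    enum tagged_abi syscall_abi;
};

/* Models each entry path storing its own constant on every syscall. */
static void tagged_syscall_enter(struct tagged_task *t, enum tagged_abi abi)
{
    t->syscall_abi = abi;
}

/* Both compat checks collapse into one load and one compare. */
static bool tagged_in_compat_syscall(const struct tagged_task *t)
{
    return t->syscall_abi != TAGGED_ABI_64;
}

static bool tagged_in_x32_syscall(const struct tagged_task *t)
{
    return t->syscall_abi == TAGGED_ABI_X32;
}

/* Walks one task through all three entry paths in turn. */
void tagged_demo(void)
{
    struct tagged_task t = { TAGGED_ABI_64 };

    tagged_syscall_enter(&t, TAGGED_ABI_64);
    assert(!tagged_in_compat_syscall(&t));

    tagged_syscall_enter(&t, TAGGED_ABI_X32);
    assert(tagged_in_compat_syscall(&t) && tagged_in_x32_syscall(&t));

    tagged_syscall_enter(&t, TAGGED_ABI_IA32);
    assert(tagged_in_compat_syscall(&t) && !tagged_in_x32_syscall(&t));
}
```

The trade-off raised earlier in the thread still applies: the store happens on every syscall entry, including the ones that never ask the question.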
>
> I also wondered about resetting it to zero when an x32 system call
> exits (rather than entry to a 64bit one).
>
> For ia32 the flag is set (with |=) on every syscall entry.
> Even though I'm pretty sure it can only change during exec.
It can change for every syscall. I have tests that do this.
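A rough user-space model of the two checks discussed above may make the per-syscall behaviour concrete. This is a sketch, not the kernel source. The `model_*` names and the struct are hypothetical. The two constants match the values of `TS_COMPAT` and `__X32_SYSCALL_BIT` in the x86 headers, and the entry/exit functions only approximate what the real entry code does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TS_COMPAT        0x0002u      /* thread status flag: 32-bit syscall active */
#define X32_SYSCALL_BIT  0x40000000u  /* spelled __X32_SYSCALL_BIT in the kernel */

/* Hypothetical stand-in for the per-task state the checks read. */
struct model_task {
    uint32_t status;   /* models the thread status flags */
    uint64_t orig_ax;  /* models the saved syscall number (original %eax) */
};

enum model_entry { ENTRY_64, ENTRY_IA32, ENTRY_X32 };

/* Approximates syscall entry: state is rewritten on every entry. */
static void model_syscall_enter(struct model_task *t, enum model_entry e,
                                uint32_t nr)
{
    t->orig_ax = nr;
    if (e == ENTRY_IA32)
        t->status |= TS_COMPAT;         /* set with |= on each ia32 entry */
    else if (e == ENTRY_X32)
        t->orig_ax |= X32_SYSCALL_BIT;  /* x32 numbers carry the marker bit */
}

/* Approximates syscall exit: the compat flag does not outlive the call. */
static void model_syscall_exit(struct model_task *t)
{
    t->status &= ~TS_COMPAT;
}

static bool model_in_ia32_syscall(const struct model_task *t)
{
    return (t->status & TS_COMPAT) != 0;
}

static bool model_in_x32_syscall(const struct model_task *t)
{
    return (t->orig_ax & X32_SYSCALL_BIT) != 0;
}

static bool model_in_compat_syscall(const struct model_task *t)
{
    return model_in_ia32_syscall(t) || model_in_x32_syscall(t);
}

/* One task, three consecutive syscalls, three different answers. */
void model_demo(void)
{
    struct model_task t = { 0, 0 };

    model_syscall_enter(&t, ENTRY_64, 39);    /* getpid, x86-64 number */
    assert(!model_in_compat_syscall(&t));
    model_syscall_exit(&t);

    model_syscall_enter(&t, ENTRY_IA32, 20);  /* getpid, i386 number */
    assert(model_in_ia32_syscall(&t) && !model_in_x32_syscall(&t));
    model_syscall_exit(&t);

    model_syscall_enter(&t, ENTRY_X32, 39);   /* getpid via the x32 table */
    assert(model_in_x32_syscall(&t) && !model_in_ia32_syscall(&t));
    model_syscall_exit(&t);
}
```

In the model, nothing about the task itself fixes the answer; only the entry path taken by the current syscall does.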
>
>>> What's the point, really?
>>
>> Before we came up with the current games with __copy_siginfo_to_user
>> and x32_copy_siginfo_to_user I was wondering if we should make such
>> a change. The delivery of compat signal frames and core dumps which
>> do not go through the system call entry path could almost benefit from
>> a flag that could be set/tested when on those paths.
>
> For signal delivery it should (probably) depend on the system call
> that set up the signal handler.
I think it has worked this way for some time now.
> Although I'm sure I remember one kernel where some of it was done
> in libc (with a single entrypoint for all handlers).
>
>> The fact that only SIGCHLD (which can not trigger a coredump) is
>> different saves the coredump code from needing such a test.
>>
>> The fact that the signal frame code is simple enough it can directly
>> call x32_copy_siginfo_to_user or __copy_siginfo_to_user saves us there.
>>
>> So I don't think we have any cases where we actually need a flag that
>> is independent of the system call but we have come very close.
>
> If a program can do both 64bit and x32 system calls you probably
> need to generate a 64bit core dump if it has ever made a 64bit
> system call??
I think core dump should (and does) depend on the execution mode at the time of the crash.
It’s worth noting that GCC’s understanding of mixed bitness is horrible.