Message-ID: <63abd72b-11fd-0249-0a6f-83d4d2aa8bb3@virtuozzo.com>
Date: Tue, 21 Mar 2017 23:04:00 +0300
From: Dmitry Safonov <dsafonov@...tuozzo.com>
To: Andy Lutomirski <luto@...capital.net>
CC: Cyrill Gorcunov <gorcunov@...il.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Dmitry Safonov <0x7f454c46@...il.com>,
Adam Borowski <kilobyte@...band.pl>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
Andrei Vagin <avagin@...il.com>, Borislav Petkov <bp@...e.de>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
X86 ML <x86@...nel.org>, "H. Peter Anvin" <hpa@...or.com>,
Andy Lutomirski <luto@...nel.org>,
Ingo Molnar <mingo@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCHv2] x86/mm: set x32 syscall bit in SET_PERSONALITY()
On 03/21/2017 10:42 PM, Dmitry Safonov wrote:
> On 03/21/2017 10:31 PM, Andy Lutomirski wrote:
>> On Tue, Mar 21, 2017 at 11:09 AM, Dmitry Safonov
>> <dsafonov@...tuozzo.com> wrote:
>>> On 03/21/2017 08:45 PM, Andy Lutomirski wrote:
>>>>
>>>> On Tue, Mar 21, 2017 at 10:17 AM, Cyrill Gorcunov <gorcunov@...il.com>
>>>> wrote:
>>>>>
>>>>> On Tue, Mar 21, 2017 at 07:37:12PM +0300, Dmitry Safonov wrote:
>>>>> ...
>>>>>>
>>>>>> diff --git a/arch/x86/kernel/process_64.c
>>>>>> b/arch/x86/kernel/process_64.c
>>>>>> index d6b784a5520d..d3d4d9abcaf8 100644
>>>>>> --- a/arch/x86/kernel/process_64.c
>>>>>> +++ b/arch/x86/kernel/process_64.c
>>>>>> @@ -519,8 +519,14 @@ void set_personality_ia32(bool x32)
>>>>>> if (current->mm)
>>>>>> current->mm->context.ia32_compat = TIF_X32;
>>>>>> current->personality &= ~READ_IMPLIES_EXEC;
>>>>>> - /* in_compat_syscall() uses the presence of the x32
>>>>>> - syscall bit flag to determine compat status */
>>>>>> + /*
>>>>>> + * in_compat_syscall() uses the presence of the x32
>>>>>> + * syscall bit flag to determine compat status.
>>>>>> + * On the bitness of syscall relies x86 mmap() code,
>>>>>> + * so set x32 syscall bit right here to make
>>>>>> + * in_compat_syscall() work during exec().
>>>>>> + */
>>>>>> + task_pt_regs(current)->orig_ax |= __X32_SYSCALL_BIT;
>>>>>> current->thread.status &= ~TS_COMPAT;
>>>>>
>>>>>
>>>>> Hi! I must admit I didn't follow close the overall series (so can't
>>>>> comment much here :) but I have a slightly unrelated question -- is
>>>>> there a way to figure out if task is running in x32 mode say with
>>>>> some ptrace or procfs sign?
>>>>
>>>>
>>>> You should be able to figure out of a *syscall* is x32 by simply
>>>> looking at bit 30 in the syscall number. (This is unlike i386, which
>>>> is currently not reflected in ptrace.)
>>>
>>>
>>> The process could be stopped with PTRACE_SEIZE and I think, it'll not
>>> have x32 syscall bit at that moment.
>>>
>>> I guess the question comes from that we're releasing CRIU 3.0 with
>>> 32-bit C/R and some other cool stuff, but we don't support x32 yet.
>>> As we don't want release a thing that we aren't properly testing.
>>> So for a while we should error on dumping x32 applications.
>>
>> I'm curious: shouldn't x32 CRIU just work? What goes wrong?
>
> I also think, it should be quite easy to add, as we have arch_prctl()
> for vdso and etc.
> But there are things, which will not work if we just dump application
> as 64-bit.
>
> For example, what comes to mind: sys_get_robust_list(), it has different
> pointers for 64-bit or for x32/ia32 applications: robust_list
> and compat_robust_list. So during C/R we should sometimes call
> compatible syscalls for x32 applications to dump/restore, as for futex
> list e.g., native will return NULL or empty list.

Maybe CRIU should just save both pointers for simplicity, at the cost of
one additional syscall for most applications, which define only one of
the compat/native lists.
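
To make the "save both pointers" idea a bit more concrete, here is a rough
sketch of the native half only: asking the kernel for a thread's native
robust list head via get_robust_list(2). The helper name
fetch_native_robust_list() is made up for illustration and is not CRIU
code. The compat half is not shown, because it has to go through the
compat syscall entry, which is exactly the part that needs extra care.

```c
#define _GNU_SOURCE
#include <linux/futex.h>   /* struct robust_list_head */
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from CRIU): query the *native* robust list
 * head of thread `tid` (0 means the calling thread).
 * Returns 0 on success, -1 with errno set on failure.
 * The compat list would need the compat flavour of the same syscall.
 */
static int fetch_native_robust_list(pid_t tid,
				    struct robust_list_head **head,
				    size_t *len)
{
	return (int)syscall(SYS_get_robust_list, tid, head, len);
}
```

A dumper would store the returned head pointer and length, then repeat the
query through the compat entry and store that pair as well.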

I think there are other things like that, but it's the end of the day
and nothing else comes to mind.
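
For the bit-30 check discussed earlier in the thread, a minimal sketch of
how a tool could test a saved syscall number follows. The names are
hypothetical; the constant mirrors the kernel's __X32_SYSCALL_BIT but is
redefined locally so the snippet builds without kernel headers. It assumes
that a negative orig_ax means "not inside a syscall", in which case the
bit says nothing about the task, which matches the caveat above about a
task stopped outside a syscall.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copy of the x32 marker (bit 30), same value as __X32_SYSCALL_BIT. */
#define SKETCH_X32_SYSCALL_BIT INT64_C(0x40000000)

/*
 * Hypothetical helper: does the saved syscall number carry the x32 bit?
 * Assumption: a negative orig_ax means the task is not in a syscall,
 * so no conclusion about x32 can be drawn and false is returned.
 */
static bool syscall_nr_is_x32(int64_t orig_ax)
{
	if (orig_ax < 0)
		return false;
	return (orig_ax & SKETCH_X32_SYSCALL_BIT) != 0;
}

/* Hypothetical helper: syscall number with the x32 marker removed. */
static int64_t syscall_nr_strip_x32(int64_t orig_ax)
{
	return orig_ax & ~SKETCH_X32_SYSCALL_BIT;
}
```

This only classifies a syscall number that was already read out (for
example from saved registers); it does not tell whether a task that is
currently outside a syscall is an x32 task.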

Anyway, I wouldn't want to release anything without adding it to the
regular tests, so that would also need some time. And a funny thing:
there are many folks who run 32-bit containers on x86_64 to save memory,
but they use ia32, not x32. Maybe because the environment is easier to
get (for x32 there are no templates, for example). Maybe just because.

So far I haven't seen any requests for x32 C/R, while for ia32 the room
is crowded. And quite a few people ask for arm/arm64, where the kernel
doesn't yet support vdso mremap() and CRIU doesn't yet run tests
regularly.

But well, x32 should still be quite easy to add, say, for the next
release after ia32 C/R. I'm not sure whether it'll be planned, though.

--
Dmitry