Message-ID: <CAMzpN2g4B4X8GkgZMWv5=tz94Hsf3F+Mc133gnAf7j7qvD0WAQ@mail.gmail.com>
Date: Mon, 23 Mar 2015 11:24:01 -0400
From: Brian Gerst <brgerst@...il.com>
To: Ingo Molnar <mingo@...nel.org>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
"the arch/x86 maintainers" <x86@...nel.org>,
Denys Vlasenko <dvlasenk@...hat.com>,
Andy Lutomirski <luto@...capital.net>,
Borislav Petkov <bp@...en8.de>,
"H. Peter Anvin" <hpa@...or.com>,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH] x86: execve and sigreturn syscalls must return via iret

On Mon, Mar 23, 2015 at 3:56 AM, Ingo Molnar <mingo@...nel.org> wrote:
>
> * Brian Gerst <brgerst@...il.com> wrote:
>
>> Both the execve and sigreturn family of syscalls have the ability to change
>> registers in ways that may not be compatible with the syscall path they
>> were called from. In particular, sysret and sysexit can't handle non-default
>> %cs and %ss, and some bits in eflags. These syscalls have stubs that are
>> hardcoded to jump to the iret path, and not return to the original syscall
>> path. Commit 76f5df43cab5e765c0bd42289103e8f625813ae1 (Always allocate a
>> complete "struct pt_regs" on the kernel stack) recently changed this for
>> some 32-bit compat syscalls, but introduced a bug where execve from a 32-bit
>> program to a 64-bit program would fail because it still returned via sysretl.
>> This caused Wine to fail when built for both 32-bit and 64-bit.
>>
>> This patch sets TIF_NOTIFY_RESUME for execve and sigreturn so that the iret
>> path is always taken on exit to userspace.
>>
>> Signed-off-by: Brian Gerst <brgerst@...il.com>
>> Cc: Ingo Molnar <mingo@...nel.org>
>> Cc: Denys Vlasenko <dvlasenk@...hat.com>
>> Cc: Andy Lutomirski <luto@...capital.net>
>> Cc: Borislav Petkov <bp@...en8.de>
>> Cc: H. Peter Anvin <hpa@...or.com>
>> Cc: Linus Torvalds <torvalds@...ux-foundation.org>
>> ---
>> arch/x86/ia32/ia32_signal.c | 2 ++
>> arch/x86/include/asm/ptrace.h | 2 +-
>> arch/x86/include/asm/thread_info.h | 7 +++++++
>> arch/x86/kernel/process_32.c | 6 +-----
>> arch/x86/kernel/process_64.c | 1 +
>> arch/x86/kernel/signal.c | 2 ++
>> 6 files changed, 14 insertions(+), 6 deletions(-)
>
> Applied the fix to tip:x86/asm, thanks Brian!
>
>> +
>> +/*
>> + * force syscall return via iret by making it look as if there was
>> + * some work pending.
>> +*/
>> +#define force_iret() set_thread_flag(TIF_NOTIFY_RESUME)
>
> I extended this comment to:
>
> /*
> * Force syscall return via IRET by making it look as if there was
> * some work pending. IRET is our most capable (but slowest) syscall
> * return path, which is able to restore modified SS, CS and certain
> * EFLAGS values that other (fast) syscall return instructions
> * are not able to restore properly.
> */
> #define force_iret() set_thread_flag(TIF_NOTIFY_RESUME)
>
> Just to preserve the underlying reason for force_iret() for the future
> and such.
>
> Btw., it might be a worthwhile optimization to detect non-standard SS,
> CS and EFLAGS values and only force_iret() in that case, that will
> speed up 99.9999% of execve() and sigreturn() syscalls and only force
> the 'weird' process startup modes into the slow return path.

sysret/sysexit also can't restore rcx and r11/rdx. This would not
work for execve, since it sets those registers to zero. It could
possibly work for sigreturn if the signal interrupted a syscall. We
already have the opportunistic sysret code for 64-bit returns.
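
To illustrate the constraint, here is a rough userspace model (not
kernel code) of the kind of check a fast return would need. The
struct, function, and constant names are hypothetical and made up for
this sketch. It relies only on the fact that sysret loads the user
instruction pointer from rcx and the flags from r11, so those
registers cannot carry independent values across the return.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants (assumed values for the default 64-bit user
 * selectors and two EFLAGS bits); not taken from the patch. */
#define MODEL_USER_CS   0x33u
#define MODEL_USER_SS   0x2bu
#define MODEL_EFLAGS_TF 0x00000100u  /* trap flag */
#define MODEL_EFLAGS_RF 0x00010000u  /* resume flag */

/* Hypothetical, simplified stand-in for the saved user register frame. */
struct regs_model {
	uint64_t ip;     /* user instruction pointer to return to */
	uint64_t cx;     /* saved rcx */
	uint64_t r11;    /* saved r11 */
	uint64_t flags;  /* saved eflags/rflags */
	uint16_t cs;     /* saved code segment selector */
	uint16_t ss;     /* saved stack segment selector */
};

/* Returns true only if a sysret-style return could reproduce this frame. */
static bool sysret_compatible(const struct regs_model *regs)
{
	/* sysret takes the return address from rcx ... */
	if (regs->cx != regs->ip)
		return false;
	/* ... and the flags from r11. */
	if (regs->r11 != regs->flags)
		return false;
	/* Non-default segments need the iret path. */
	if (regs->cs != MODEL_USER_CS || regs->ss != MODEL_USER_SS)
		return false;
	/* Some flag bits cannot be restored by the fast path. */
	if (regs->flags & (MODEL_EFLAGS_TF | MODEL_EFLAGS_RF))
		return false;
	return true;
}
```

In this model a frame left by execve, with cx and r11 zeroed, fails
the first two checks, while a frame saved at syscall entry (where cx
and r11 already hold the return address and flags) can pass.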
--
Brian Gerst