Message-ID: <55081C6D.6090609@redhat.com>
Date: Tue, 17 Mar 2015 13:22:05 +0100
From: Denys Vlasenko <dvlasenk@...hat.com>
To: Borislav Petkov <bp@...en8.de>, Ingo Molnar <mingo@...nel.org>
CC: linux-tip-commits@...r.kernel.org, linux-kernel@...r.kernel.org,
keescook@...omium.org, ast@...mgrid.com, fweisbec@...il.com,
oleg@...hat.com, tglx@...utronix.de, torvalds@...ux-foundation.org,
hpa@...or.com, wad@...omium.org, rostedt@...dmis.org
Subject: Re: [tip:x86/asm] x86/asm/entry/64: Remove unused thread_struct::usersp
On 03/17/2015 08:39 AM, Borislav Petkov wrote:
> On Tue, Mar 17, 2015 at 08:21:18AM +0100, Ingo Molnar wrote:
>> Assuming this does not fix the regression, could you apply the minimal
>> patch below - which reverts the old_rsp handling change.
>>
>> (The rest of the commit are in a third patch, but those are only
>> comment changes.)
>>
>> So my theory is that this change is what will revert the regression.
>
> Yep, it does. Below is the diff that works (it is the rough revert
> without the comments :-)):
>
...
> @@ -395,6 +398,8 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
> /*
> * Switch the PDA and FPU contexts.
> */
> + prev->usersp = this_cpu_read(old_rsp);
> + this_cpu_write(old_rsp, next->usersp);
I have a theory. There is a time window when user's sp
is in PER_CPU_VAR(old_rsp) but not yet in pt_regs->sp,
and *interrupts are enabled*:
ENTRY(system_call)
	SWAPGS_UNSAFE_STACK
	movq	%rsp,PER_CPU_VAR(old_rsp)
	movq	PER_CPU_VAR(kernel_stack),%rsp
	ENABLE_INTERRUPTS(CLBR_NONE)
	ALLOC_PT_GPREGS_ON_STACK 8	/* +8: space for orig_ax */
	movq	%rcx,RIP(%rsp)
	movq	PER_CPU_VAR(old_rsp),%rcx
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	movq	%r11,EFLAGS(%rsp)
	movq	%rcx,RSP(%rsp)
Before the indicated insn, interrupts are already enabled.
If preemption hits now, the next task can clobber PER_CPU_VAR(old_rsp).
Then, when we return to this task, a bogus user sp will be stored
in pt_regs and restored on exit to userspace, and the next attempt
to, say, execute RETQ will try to pop a bogus, likely noncanonical
address into RIP -> #GP -> SEGV!
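
To make the interleaving concrete, here is a rough userspace model in C.
It is only a sketch, not kernel code: the struct, the helper names and
the addresses are all made up for illustration, and a plain global
stands in for PER_CPU_VAR(old_rsp). Task B is assumed to run a complete
syscall entry on the same CPU while task A sits in the window.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for PER_CPU_VAR(old_rsp): one slot, one CPU. */
static uint64_t old_rsp;

struct task {
	uint64_t user_sp;	/* sp the task had in user mode */
	uint64_t saved_sp;	/* models pt_regs->sp */
};

/* First half of syscall entry: stash the user sp in the per-CPU slot. */
static void entry_stash(const struct task *t)
{
	old_rsp = t->user_sp;
}

/* Second half: copy the per-CPU slot into this task's pt_regs. */
static void entry_copy(struct task *t)
{
	t->saved_sp = old_rsp;
}

/*
 * Returns the sp that task A would restore on exit to userspace.
 * If preempt is nonzero, task B runs a full syscall entry between
 * A's stash and A's copy, overwriting the shared slot.
 */
uint64_t sp_restored_for_a(int preempt)
{
	struct task a = { .user_sp = UINT64_C(0x7ffd1000), .saved_sp = 0 };
	struct task b = { .user_sp = UINT64_C(0x7ffe2000), .saved_sp = 0 };

	entry_stash(&a);
	if (preempt) {
		entry_stash(&b);
		entry_copy(&b);
		assert(b.saved_sp == b.user_sp);	/* B itself is fine */
	}
	entry_copy(&a);
	return a.saved_sp;
}
```

In this model sp_restored_for_a(0) gives back A's own sp, while
sp_restored_for_a(1) gives back B's, which is the "bogus user sp" case
described above.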
The theory can be tested by just moving the interrupt enable a bit down:
 ENTRY(system_call)
 	SWAPGS_UNSAFE_STACK
 	movq	%rsp,PER_CPU_VAR(old_rsp)
 	movq	PER_CPU_VAR(kernel_stack),%rsp
-	ENABLE_INTERRUPTS(CLBR_NONE)
 	ALLOC_PT_GPREGS_ON_STACK 8	/* +8: space for orig_ax */
 	movq	%rcx,RIP(%rsp)
 	movq	PER_CPU_VAR(old_rsp),%rcx
 	movq	%r11,EFLAGS(%rsp)
 	movq	%rcx,RSP(%rsp)
+	ENABLE_INTERRUPTS(CLBR_NONE)
If I'm right, segfaults should be gone.
Borislav, can you try this?
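
For what it's worth, the effect of moving the interrupt enable can be
sketched as a tiny userspace model in C. This is not kernel code: the
step names, the constants and both function names are invented for
illustration. It assumes a preemption can only land at a step boundary
where interrupts are already on, that it overwrites the old_rsp slot
with a bogus value, and that %rcx survives the context switch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical step list mirroring the entry sequence. */
enum step {
	STASH_SP,	/* movq %rsp,PER_CPU_VAR(old_rsp) */
	SWITCH_STACK,	/* movq PER_CPU_VAR(kernel_stack),%rsp */
	SAVE_RIP,	/* movq %rcx,RIP(%rsp) */
	LOAD_OLD_RSP,	/* movq PER_CPU_VAR(old_rsp),%rcx */
	SAVE_EFLAGS,	/* movq %r11,EFLAGS(%rsp) */
	SAVE_RSP,	/* movq %rcx,RSP(%rsp) */
	NSTEPS
};

#define USER_SP		UINT64_C(0x7ffd1000)	/* made-up user sp */
#define BOGUS_SP	UINT64_C(0xdeadbeef)	/* left by the other task */

/*
 * Run one entry. irq_on_after is the last step executed with interrupts
 * off; preempt_before is the step boundary where another task tries to
 * run (NSTEPS means "after the last step"). Returns the modelled
 * pt_regs->sp.
 */
static uint64_t run_entry(int irq_on_after, int preempt_before)
{
	uint64_t old_rsp = 0, rcx = 0, regs_sp = 0;

	assert(irq_on_after >= 0 && irq_on_after < NSTEPS);
	for (int s = 0; s <= NSTEPS; s++) {
		/* Preemption only happens once interrupts are enabled. */
		if (s == preempt_before && s > irq_on_after)
			old_rsp = BOGUS_SP;
		if (s == NSTEPS)
			break;
		switch (s) {
		case STASH_SP:		old_rsp = USER_SP;	break;
		case LOAD_OLD_RSP:	rcx = old_rsp;		break;
		case SAVE_RSP:		regs_sp = rcx;		break;
		default:					break;
		}
	}
	return regs_sp;
}

/* Count the preemption points that end with a wrong sp in pt_regs. */
int count_corrupting_points(int irq_on_after)
{
	int bad = 0;

	for (int p = 0; p <= NSTEPS; p++)
		if (run_entry(irq_on_after, p) != USER_SP)
			bad++;
	return bad;
}
```

With the enable right after the stack switch,
count_corrupting_points(SWITCH_STACK) finds two bad points in this model
(just before the RIP store and just before the old_rsp load); with the
enable moved past the RSP store, count_corrupting_points(SAVE_RSP) finds
none.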