Message-ID: <CALCETrWBpmS_W3sN3Pf-qRL8jgxoBoQWnQuk3r_TT_eBm9eNqQ@mail.gmail.com>
Date: Wed, 19 Nov 2014 16:46:29 -0800
From: Andy Lutomirski <luto@...capital.net>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Andi Kleen <andi@...stfloor.org>, Borislav Petkov <bp@...en8.de>,
"the arch/x86 maintainers" <x86@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Oleg Nesterov <oleg@...hat.com>,
Tony Luck <tony.luck@...el.com>
Subject: Re: [PATCH v3 3/3] sched, x86: Check that we're on the right stack in schedule and __might_sleep

On Wed, Nov 19, 2014 at 4:37 PM, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
> On Wed, Nov 19, 2014 at 4:13 PM, Andy Lutomirski <luto@...capital.net> wrote:
>>
>> No drugs, just imprecision. This series doesn't change NMI handling
>> at all. It only changes machine_check, int3, debug, and stack_segment.
>> (Why is #SS using IST stacks anyway?)
>
> .. ok, we were talking about adding an explicit preemption count to
> nmi, and then you wanted to make that conditional, that kind of
> freaked me out.

I guess I jumped around in the conversation a bit...

>
>> So my point stands: if machine_check is going to be conditionally
>> atomic, then that condition needs to be expressed somewhere.
>
> I'd still prefer to keep that knowledge in one place, rather than
> adding *another* completely ad-hoc thing in addition to what we
> already have.
>
> Also, I really don't think it should be about the particular stack
> you're using. Sure, if a debug fault happens in user space, the fault
> handler could sleep if it runs on the regular stack, but our
> "might_sleep()" are about catching things that *could* be problematic,
> even if the sleep never happens. And so, might_sleep() _should_
> actually trigger, even if it's not using the IST stack, because *if*
> the debug exception happened in kernel space, then we should warn.
>
> So I'd actually *prefer* to have special hacks that perhaps then
> "undo" the preemption count if the code expressly tests for "did this
> happen in user space, then I know I'm safe". But then it's an
> *explicit* thing, not something that just magically works because
> nobody even thought about it, and the trap happened in user space.
>
> See the argument? I'd *rather* see code like
>
>     /* Magic */
>     if (user_mode(regs)) {
>         .. verify that we're using the normal kernel stack
>         .. enable interrupts, enable preemption
>         .. this is the explicit special case and it is aware
>         .. of being special
>     }
>
> even if on the face of it it looks hacky. But an *explicit* hack is
> preferable to something that just "happens" to work only for the
> user-mode case.

So we'd do, in do_machine_check:

    irq_enter();
    do atomic stuff;
    ist_stop_being_atomic(regs);
    local_irq_enable();
    ...
    local_irq_disable();
    ist_start_being_atomic_again();
    irq_exit();

and we'd have something like:

    void ist_stop_being_atomic(struct pt_regs *regs)
    {
        BUG_ON(!user_mode_vm(regs));
        --irq_count;
    }

I'm very hesitant to use irq_enter for this, though. I think we want
just the irq_count part. Maybe ist_enter() and ist_exit()? I think
that we really don't want to go anywhere near the accounting stuff in
irq_enter from an IST handler if !user_mode_vm(regs). Doing it from
asm is somewhat less error prone, although I guess we already rely on
the IDT entries themselves being in sync with the paranoid idtentry
setting.
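
To make the counting concrete, here is a rough user-space model of the idea, not kernel code. Everything prefixed with model_ is a hypothetical stand-in: model_irq_count plays the role of the irq_count part of the preempt count, model_regs.from_user stands in for user_mode_vm(regs), and assert() stands in for BUG_ON(). It only shows that the count is balanced on both paths and that the non-atomic window exists only for traps that came from user space.

```c
/* User-space model only; all model_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the irq_count part of the preempt count. */
static int model_irq_count;

/* Stand-in for struct pt_regs: records only where the trap came from. */
struct model_regs {
    bool from_user;
};

static bool model_in_atomic(void)
{
    return model_irq_count != 0;
}

/* Count-only entry/exit, with no accounting as irq_enter() would do. */
static void model_ist_enter(void)
{
    model_irq_count++;
}

static void model_ist_exit(void)
{
    assert(model_irq_count > 0);
    model_irq_count--;
}

static void model_ist_stop_being_atomic(const struct model_regs *regs)
{
    assert(regs->from_user);        /* models BUG_ON(!user_mode_vm(regs)) */
    assert(model_irq_count > 0);
    model_irq_count--;
}

static void model_ist_start_being_atomic_again(void)
{
    model_irq_count++;
}

/* Returns true if the handler had a window in which sleeping was allowed. */
static bool model_do_machine_check(const struct model_regs *regs)
{
    bool could_sleep = false;

    model_ist_enter();
    /* atomic work would go here */
    if (regs->from_user) {
        /* the explicit special case: we know we came from user space */
        model_ist_stop_being_atomic(regs);
        could_sleep = !model_in_atomic();
        model_ist_start_being_atomic_again();
    }
    model_ist_exit();
    return could_sleep;
}
```

With this model, a trap with from_user set gets a sleepable window, a trap from kernel mode never leaves atomic context, and model_irq_count is back to zero after either one.
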
--Andy