Message-ID: <20181129094641.GD2131@hirez.programming.kicks-ass.net>
Date: Thu, 29 Nov 2018 10:46:41 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Andy Lutomirski <luto@...capital.net>
Cc: Nadav Amit <namit@...are.com>, Ingo Molnar <mingo@...hat.com>,
Andrew Lutomirski <luto@...nel.org>,
"H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>,
Borislav Petkov <bp@...en8.de>,
"Woodhouse, David" <dwmw@...zon.co.uk>
Subject: Re: [RFC PATCH 1/5] x86: introduce preemption disable prefix
On Fri, Oct 19, 2018 at 07:29:45AM -0700, Andy Lutomirski wrote:
> > On Oct 19, 2018, at 1:33 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> >
> >> On Fri, Oct 19, 2018 at 01:08:23AM +0000, Nadav Amit wrote:
> >> Consider for example do_int3(), and see my inlined comments:
> >>
> >> dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code)
> >> {
> >> ...
> >> ist_enter(regs); // => preempt_disable()
> >> cond_local_irq_enable(regs); // => assume it enables IRQs
> >>
> >> ...
> >> // resched irq can be delivered here. It will not cause rescheduling
> >> // since preemption is disabled
> >>
> >> cond_local_irq_disable(regs); // => assume it disables IRQs
> >> ist_exit(regs); // => preempt_enable_no_resched()
> >> }
> >>
> >> At this point resched will not happen for unbounded length of time (unless
> >> there is another point when exiting the trap handler that checks if
> >> preemption should take place).
> >>
> >> Another example is __BPF_PROG_RUN_ARRAY(), which also uses
> >> preempt_enable_no_resched().
> >>
> >> Am I missing something?
> >
> > Would not the interrupt return then check for TIF_NEED_RESCHED and call
> > schedule() ?
>
> The paranoid exit path doesn’t check TIF_NEED_RESCHED because it’s
> fundamentally atomic — it’s running on a percpu stack and it can’t
> schedule. In theory we could do some evil stack switching, but we
> don’t.
>
> How does NMI handle this? If an NMI that hit interruptible kernel
> code overflows a perf counter, how does the wake up work?
NMIs should never set NEED_RESCHED. What perf does is self-IPI
(irq_work) and do the wakeup from there.
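
As a rough illustration of that deferral, here is a small user-space model in C. It is not kernel code: the `sim_cpu` structure and every `sim_*` function are hypothetical names invented for this sketch, and the boolean flags only stand in for TIF_NEED_RESCHED and a pending irq_work self-IPI. The point it tries to show is that the NMI path only marks work as pending, while the wakeup (and any setting of the resched flag) happens later from ordinary interrupt context.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space model only: all names below are hypothetical, not kernel API. */
struct sim_cpu {
    bool in_nmi;            /* inside the simulated NMI handler */
    bool need_resched;      /* stand-in for TIF_NEED_RESCHED */
    bool self_ipi_pending;  /* stand-in for a queued irq_work self-IPI */
    int  wakeups_done;      /* number of wakeups performed so far */
};

/* The wakeup may set need_resched, so it must never run in NMI context. */
static void sim_wake_up(struct sim_cpu *cpu)
{
    assert(!cpu->in_nmi);
    cpu->wakeups_done++;
    cpu->need_resched = true;
}

/* Simulated NMI: a counter overflowed. Only mark deferred work pending. */
static void sim_nmi_counter_overflow(struct sim_cpu *cpu)
{
    cpu->in_nmi = true;
    cpu->self_ipi_pending = true;   /* models queueing the irq_work */
    cpu->in_nmi = false;
}

/* Simulated self-IPI handler: runs as a normal interrupt, after the NMI. */
static void sim_irq_work_interrupt(struct sim_cpu *cpu)
{
    if (!cpu->self_ipi_pending)
        return;
    cpu->self_ipi_pending = false;
    sim_wake_up(cpu);
}
```

In this model, calling sim_nmi_counter_overflow() leaves need_resched clear and only sets self_ipi_pending; the flag is set once sim_irq_work_interrupt() runs, which corresponds to the self-IPI being taken as a regular interrupt whose return path can act on the resched request.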