[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALCETrXw0-KCSrjBDeKSp8GZQ8hmJXqH+2=UVPurgL7cAK6iJw@mail.gmail.com>
Date: Mon, 23 May 2016 18:23:31 -0700
From: Andy Lutomirski <luto@...capital.net>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Andi Kleen <andi@...stfloor.org>, Borislav Petkov <bp@...en8.de>,
"the arch/x86 maintainers" <x86@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Oleg Nesterov <oleg@...hat.com>,
Tony Luck <tony.luck@...el.com>
Subject: Re: [PATCH v3 3/3] sched, x86: Check that we're on the right stack in
schedule and __might_sleep
On Sun, Feb 28, 2016 at 9:27 PM, Andy Lutomirski <luto@...capital.net> wrote:
> On Wed, Nov 19, 2014 at 11:44 AM, Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
>> On Wed, Nov 19, 2014 at 11:29 AM, Andi Kleen <andi@...stfloor.org> wrote:
>>>
>>> The exception handlers which use the IST stacks don't necessarily
>>> set irq count. Maybe they should.
>>
>> Hmm. I think they should. Since they clearly must not schedule, as
>> they use a percpu stack.
>>
>> Which exceptions use IST?
>>
>> [ grep grep ]
>>
>> Looks like stack, doublefault, nmi, debug and mce. And yes, I really
>> think they should all raise the irq count if they don't already.
>> Rather than add random arch-specific "let's check that we're on the
>> right stack" code to the might-sleep stuff, just use the one we have.
>>
>
> Resurrecting an old thread:
>
> The outcome of this discussion was that ist_enter now raises
> HARDIRQ_COUNT. I think this is causing a problem. If a user program
> enables TF, it generates a bunch of debug exceptions. The handlers
> raise the IRQ count and do stuff, and apparently some of that stuff
> can raise a softirq. (I have no idea where the softirq is being
> raised.) The softirq code notices that we're in_interrupt and doesn't
> wake ksoftirqd because it thinks we're about to exit the interrupt and
> process the softirq. But we don't, which causes occasional warnings
> and confuses things (and me!).
>
> So how do we fix it? If we stop raising HARDIRQ_COUNT (and apply
> $SUBJECT?), then raise_softirq will wake ksoftirqd and life is good.
> But this seems a bit silly, since, if we entered the ist exception
> handler from a context with irqs on and softirqs enabled, we *could*
> plausibly handle the softirq right away -- we're on an essentially
> empty stack. (Of course, it's a *small* stack, since it could be the
> IST stack.)
>
> Or we could just let ksoftirqd do its thing and stop raising
> HARDIRQ_COUNT. We could add a new preempt count field just for IST
> (yuck). We could try to hijack a different preempt count field
> (NMI?). But I kind of like the idea of just reinstating the original
> patch of explicitly checking that we're on a safe stack in schedule
> and __might_sleep, since that is the actual condition we care about.
Ping? I can still trigger this fairly easily on 4.6.
--Andy
Powered by blists - more mailing lists