Message-ID: <CALCETrXw0-KCSrjBDeKSp8GZQ8hmJXqH+2=UVPurgL7cAK6iJw@mail.gmail.com>
Date: Mon, 23 May 2016 18:23:31 -0700
From: Andy Lutomirski <luto@...capital.net>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Andi Kleen <andi@...stfloor.org>, Borislav Petkov <bp@...en8.de>,
    "the arch/x86 maintainers" <x86@...nel.org>,
    Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
    Peter Zijlstra <peterz@...radead.org>, Oleg Nesterov <oleg@...hat.com>,
    Tony Luck <tony.luck@...el.com>
Subject: Re: [PATCH v3 3/3] sched, x86: Check that we're on the right stack in schedule and __might_sleep

On Sun, Feb 28, 2016 at 9:27 PM, Andy Lutomirski <luto@...capital.net> wrote:
> On Wed, Nov 19, 2014 at 11:44 AM, Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
>> On Wed, Nov 19, 2014 at 11:29 AM, Andi Kleen <andi@...stfloor.org> wrote:
>>>
>>> The exception handlers which use the IST stacks don't necessarily
>>> set irq count. Maybe they should.
>>
>> Hmm. I think they should. Since they clearly must not schedule, as
>> they use a percpu stack.
>>
>> Which exceptions use IST?
>>
>> [ grep grep ]
>>
>> Looks like stack, doublefault, nmi, debug and mce. And yes, I really
>> think they should all raise the irq count if they don't already.
>> Rather than add random arch-specific "let's check that we're on the
>> right stack" code to the might-sleep stuff, just use the one we have.
>>
>
> Resurrecting an old thread:
>
> The outcome of this discussion was that ist_enter now raises
> HARDIRQ_COUNT. I think this is causing a problem. If a user program
> enables TF, it generates a bunch of debug exceptions. The handlers
> raise the IRQ count and do stuff, and apparently some of that stuff
> can raise a softirq. (I have no idea where the softirq is being
> raised.) The softirq code notices that we're in_interrupt and doesn't
> wake ksoftirqd because it thinks we're about to exit the interrupt and
> process the softirq. But we don't, which causes occasional warnings
> and confuses things (and me!).
>
> So how do we fix it? If we stop raising HARDIRQ_COUNT (and apply
> $SUBJECT?), then raise_softirq will wake ksoftirqd and life is good.
> But this seems a bit silly, since, if we entered the ist exception
> handler from a context with irqs on and softirqs enabled, we *could*
> plausibly handle the softirq right away -- we're on an essentially
> empty stack. (Of course, it's a *small* stack, since it could be the
> IST stack.)
>
> Or we could just let ksoftirqd do its thing and stop raising
> HARDIRQ_COUNT. We could add a new preempt count field just for IST
> (yuck). We could try to hijack a different preempt count field
> (NMI?). But I kind of like the idea of just reinstating the original
> patch of explicitly checking that we're on a safe stack in schedule
> and __might_sleep, since that is the actual condition we care about.

Ping? I can still trigger this fairly easily on 4.6.

--Andy
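
The failure mode described in the quoted message can be sketched as a small user-space C model. This is an illustrative simplification, not kernel code: `struct cpu_model` and the `model_*` helpers are hypothetical names, the bit layout is assumed to follow the preempt_count fields of kernels from that era, and the assumption that the IST exit path only drops the count without running pending softirqs is taken from the message itself.

```c
/* User-space model only; nothing here is real kernel API. */
#include <assert.h>
#include <stdbool.h>

/* Assumed preempt_count layout: softirq, hardirq and NMI fields. */
#define SOFTIRQ_SHIFT   8
#define HARDIRQ_SHIFT   16
#define NMI_SHIFT       20
#define SOFTIRQ_MASK    (0xffu << SOFTIRQ_SHIFT)
#define HARDIRQ_MASK    (0xfu << HARDIRQ_SHIFT)
#define NMI_MASK        (0x1u << NMI_SHIFT)
#define HARDIRQ_OFFSET  (1u << HARDIRQ_SHIFT)

/* Hypothetical per-CPU state for the model. */
struct cpu_model {
    unsigned int preempt_count;
    unsigned int softirq_pending;   /* one bit per softirq number */
    bool ksoftirqd_woken;
};

/* Mirrors the idea of in_interrupt(): any irq-type field is nonzero. */
static bool model_in_interrupt(const struct cpu_model *c)
{
    return (c->preempt_count &
            (SOFTIRQ_MASK | HARDIRQ_MASK | NMI_MASK)) != 0;
}

/* Mark a softirq pending; wake ksoftirqd only outside interrupt context,
 * on the assumption that an interrupt exit path will run it otherwise. */
static void model_raise_softirq(struct cpu_model *c, unsigned int nr)
{
    assert(nr < 32);
    c->softirq_pending |= 1u << nr;
    if (!model_in_interrupt(c))
        c->ksoftirqd_woken = true;
}

/* IST entry; raise_hardirq selects whether the hardirq count is raised. */
static void model_ist_enter(struct cpu_model *c, bool raise_hardirq)
{
    if (raise_hardirq)
        c->preempt_count += HARDIRQ_OFFSET;
}

/* IST exit: drops the count but, as assumed above, does not process
 * pending softirqs the way a normal interrupt exit would. */
static void model_ist_exit(struct cpu_model *c, bool raise_hardirq)
{
    if (raise_hardirq)
        c->preempt_count -= HARDIRQ_OFFSET;
}

/* A softirq is stranded if it is pending and nobody was asked to run it. */
static bool model_softirq_stranded(const struct cpu_model *c)
{
    return c->softirq_pending != 0 && !c->ksoftirqd_woken;
}
```

In this model, calling `model_ist_enter`, `model_raise_softirq`, and `model_ist_exit` with `raise_hardirq` set to true leaves the softirq stranded, while the same sequence with it set to false wakes ksoftirqd. That matches the two behaviours the message contrasts, under the stated assumptions.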