linux-kernel - Re: [RFC 0/9] mce recovery for Sandy Bridge server

Message-ID: <1306272274.2497.73.camel@laptop>
Date:	Tue, 24 May 2011 23:24:34 +0200
From:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
To:	Tony Luck <tony.luck@...el.com>
Cc:	Borislav Petkov <bp@...64.org>, Ingo Molnar <mingo@...e.hu>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"Huang, Ying" <ying.huang@...el.com>,
	Andi Kleen <andi@...stfloor.org>,
	Borislav Petkov <bp@...en8.de>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Mauro Carvalho Chehab <mchehab@...hat.com>
Subject: Re: [RFC 0/9] mce recovery for Sandy Bridge server

On Tue, 2011-05-24 at 10:56 -0700, Tony Luck wrote:
> Dragging PeterZ to this thread, since we are now talking about scheduler.
> 
> On Tue, May 24, 2011 at 10:33 AM, Borislav Petkov <bp@...64.org> wrote:
> > On Tue, May 24, 2011 at 09:57:46AM -0700, Luck, Tony wrote:
> >> So can we talk about this part for a while before returning to the
> >> "how to report this" discussion?
> >>
> >> So here's the situation - we are in the NMI handler when we find from
> >> looking at the machine check bank registers that we have a recoverable
> >> error. We know the physical address, and we know the task (which might
> >> have been in user or kernel context). I can package that information
> >> into a perf/event ... but then how can I mark the current task as
> >> not-fit-for-execution?
> >
> > Maybe something like
> >
> > set_current_state(TASK_UNINTERRUPTIBLE);
> >
> > finish work in NMI context
> >
> > do remaining work in process context like sending appropriate signals
> > etc; finally:
> >
> > set_task_state(tsk, TASK_RUNNING)
> 
> That looks pretty easy - are there any weird side effects that I should
> be worried about?  My perf/event can't really include the "task" pointer
> (that sounds way too internal) - but I can provide the process id, so
> the "RAS daemon" that sees this event can look up the task to do that
> final set_task_state(tsk, TASK_RUNNING).
> 
> Does this work in the threaded case? In the case where the task was in
> kernel context (but in a CONFIG_PREEMPT=y kernel at some point
> where preemption is allowed)?


Right, so you can't do things like that from NMI context, but what perf
can do is raise a self-IPI and continue from IRQ context. (Question for
the HW folks: can there be cycles between the NMI iret and IRQ assert
from whatever context was before the NMI hit?)

From IRQ context we can wake threads, set TIF_flags, etc. You can
basically do what SIGSTOP does and put the task in TASK_STOPPED state,
wake your handler thread and set TIF_NEED_RESCHED. Then the handler
thread will be scheduled depending on your handler's sched policy.
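
The hand-off described above can be modelled in plain user-space C.
This is only a sketch of the control flow, not kernel code: every name
in it (model_task, mce_pending, model_nmi_handler, model_irq_handler)
is hypothetical, the "self-IPI" is just a slot state, and the task
state and need_resched fields merely stand in for TASK_STOPPED and
TIF_NEED_RESCHED. In a real kernel the irq_work interface is one
existing way to queue work from NMI context for later IRQ context.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model types; none of these exist in the kernel. */
enum slot_state  { SLOT_FREE, SLOT_FILLING, SLOT_READY };
enum model_state { MODEL_RUNNING, MODEL_STOPPED };

struct model_task {
    int pid;
    enum model_state state;   /* stands in for TASK_RUNNING / TASK_STOPPED */
    bool need_resched;        /* stands in for TIF_NEED_RESCHED */
};

/* Single-entry record passed from "NMI" to "IRQ" context. */
struct mce_pending {
    atomic_int state;         /* SLOT_READY models a raised self-IPI */
    int pid;
    uint64_t phys_addr;
};

static struct mce_pending pending;
static bool handler_thread_woken;   /* models waking the handler thread */

/*
 * "NMI context": takes no locks and touches no scheduler state.
 * It only records what it found and marks the slot ready.
 * Returns false if an earlier event has not been consumed yet.
 */
static bool model_nmi_handler(const struct model_task *cur, uint64_t phys_addr)
{
    int expected = SLOT_FREE;

    if (!atomic_compare_exchange_strong(&pending.state, &expected, SLOT_FILLING))
        return false;

    pending.pid = cur->pid;
    pending.phys_addr = phys_addr;
    /* Publish the record; this is where the self-IPI would be raised. */
    atomic_store_explicit(&pending.state, SLOT_READY, memory_order_release);
    return true;
}

/*
 * "IRQ context": stop the affected task, wake the handler thread and
 * request a reschedule. Returns false if nothing was pending.
 */
static bool model_irq_handler(struct model_task *tasks, int ntasks)
{
    assert(ntasks >= 0);

    if (atomic_load_explicit(&pending.state, memory_order_acquire) != SLOT_READY)
        return false;

    for (int i = 0; i < ntasks; i++) {
        if (tasks[i].pid == pending.pid) {
            tasks[i].state = MODEL_STOPPED;
            tasks[i].need_resched = true;
        }
    }
    handler_thread_woken = true;

    /* Release the slot so a later NMI can queue another event. */
    atomic_store_explicit(&pending.state, SLOT_FREE, memory_order_release);
    return true;
}
```

In this model the task keeps running between the NMI and the IRQ, which
is the window the question to the HW folks is about; a second NMI in
that window is rejected here because the slot holds a single event.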


