Message-ID: <BANLkTinSOFAioAe2v5c6PRB9EKjJJNMg9w@mail.gmail.com>
Date: Tue, 24 May 2011 10:56:26 -0700
From: Tony Luck <tony.luck@...el.com>
To: Borislav Petkov <bp@...64.org>
Cc: Ingo Molnar <mingo@...e.hu>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Huang, Ying" <ying.huang@...el.com>,
Andi Kleen <andi@...stfloor.org>,
Borislav Petkov <bp@...en8.de>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Mauro Carvalho Chehab <mchehab@...hat.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: [RFC 0/9] mce recovery for Sandy Bridge server
Dragging PeterZ into this thread, since we are now talking about the scheduler.
On Tue, May 24, 2011 at 10:33 AM, Borislav Petkov <bp@...64.org> wrote:
> On Tue, May 24, 2011 at 09:57:46AM -0700, Luck, Tony wrote:
>> So can we talk about this part for a while before returning to the
>> "how to report this" discussion?
>>
>> So here's the situation - we are in the NMI handler when we find from
>> looking at the machine check bank registers that we have a recoverable
>> error. We know the physical address, and we know the task (which might
>> have been in user or kernel context). I can package that information
>> into a perf/event ... but then how can I mark the current task as
>> not-fit-for-execution?
>
> Maybe something like
>
> set_current_state(TASK_UNINTERRUPTIBLE);
>
> finish work in NMI context
>
> do remaining work in process context like sending appropriate signals
> etc; finally:
>
> set_task_state(tsk, TASK_RUNNING)
That looks pretty easy - are there any weird side effects that I should
be worried about? My perf/event can't really include the "task" pointer
(that sounds way too internal) - but I can provide the process id, so
the "RAS daemon" that sees this event can look up the task to do that
final set_task_state(tsk, TASK_RUNNING).
Does this work in the threaded case? In the case where the task was in
kernel context (but in a CONFIG_PREEMPT=y kernel at some point
where preemption is allowed)?
-Tony