Message-ID: <20110708201731.GA3025@redhat.com>
Date: Fri, 8 Jul 2011 16:17:31 -0400
From: Don Zickus <dzickus@...hat.com>
To: tony.luck@...el.com
Cc: mjg@...hat.com, linux-kernel@...r.kernel.org
Subject: pstore dump inside an nmi handler
Hi Tony,
I was playing with the APEI EINJ module, injecting errors trying to
capture a GHES record, then panic into a kdump kernel and reboot.
Matthew brought to my attention that pstore should capture an error record
on the panic path using kmsg_dump(). After injecting an error with EINJ,
I went to check to see if there was a pstore entry. There wasn't.
Playing on another box, I noticed the machine double faulted and didn't
even make it into a kdump kernel.
Upon investigation, I noticed that when a fatal error occurs on the
platform, it will generate an NMI that will be handled by the
ghes_nmi_handler. This handler calls panic() which calls kmsg_dump()
which calls pstore_dump().
Inside pstore_dump(), the first thing it does is call mutex_lock()
(inside an NMI handler). A mutex can sleep and NMI context cannot, so
this seems to be the root cause of my problems.
I am not familiar enough with pstore to just modify its locking, so I
wanted to ask you.
My first thought was to wrap the mutex_lock() with an 'if (!in_nmi())', but
that seemed kinda hacky. Then I was wondering if there was a way to do this
locklessly or atomically, because I think you are only dealing with whole
blocks. I don't know.
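To make the first idea concrete, here is a rough userspace sketch of the
"don't block when in NMI" pattern, using pthreads rather than kernel
primitives. Everything in it is a hypothetical stand-in: sketch_lock,
sketch_in_nmi, sketch_records and sketch_dump() are names made up for
illustration and are not the pstore code. It only shows the control flow
(try the lock and skip the record if it is busy, instead of blocking). It
does not answer whether a trylock is acceptable from real NMI context.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins: this is userspace code, not the pstore API. */
static pthread_mutex_t sketch_lock = PTHREAD_MUTEX_INITIALIZER;
static bool sketch_in_nmi;   /* stand-in for the kernel's in_nmi() */
static int sketch_records;   /* counts records "written" so far */

/*
 * Returns 0 if the record was written, -1 if it was skipped because
 * the lock was busy and the caller is in a context that must not block.
 */
static int sketch_dump(const char *buf)
{
	assert(buf != NULL);

	if (sketch_in_nmi) {
		/* Must not block here: take the lock only if it is free. */
		if (pthread_mutex_trylock(&sketch_lock) != 0)
			return -1;
	} else {
		/* Normal context: waiting for the lock is fine. */
		pthread_mutex_lock(&sketch_lock);
	}

	sketch_records++;    /* stand-in for writing the record out */

	pthread_mutex_unlock(&sketch_lock);
	return 0;
}
```

The obvious downside is that a record is silently dropped whenever the lock
happens to be held at the time of the NMI, which is part of why this still
feels hacky to me.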
Wanted to give you a heads up and seek your thoughts. I am willing to
hack up some code and test. :-)
Cheers,
Don