Message-ID: <3908561D78D1C84285E8C5FCA982C28F1936EADE@ORSMSX104.amr.corp.intel.com>
Date: Thu, 19 Jul 2012 23:42:26 +0000
From: "Luck, Tony" <tony.luck@...el.com>
To: Seiji Aguchi <seiji.aguchi@....com>,
"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"mikew@...gle.com" <mikew@...gle.com>,
"dzickus@...hat.com" <dzickus@...hat.com>,
"Matthew Garrett (mjg@...hat.com)" <mjg@...hat.com>
CC: "dle-develop@...ts.sourceforge.net"
<dle-develop@...ts.sourceforge.net>,
Satoru Moriya <satoru.moriya@....com>
Subject: RE: [RFC][PATCH v2 2/3] Hold multiple logs
> If you are concerned about the multiple OOPS case, I think a user app which logs from /dev/pstore to /var/log should be developed.
Agreed - we need an app/daemon to do this.
> Once it is developed, we don't need to care about multiple oops case and the appropriate number is two.
Only if you can guarantee that the app/daemon will run and save the first OOPS before the next
occurs. Even if the system were running normally this might be difficult to achieve ... but in this
case we know the system isn't running normally (it just OOPSed twice!).
However - there is progressively less value in collecting additional consecutive OOPS. Perhaps
one is enough 90% or even 99% of the time. I'm naturally paranoid so having two or three
would make me feel happy that most of the remaining 10% or 1% of the cases were covered.
> - In case where system is workable after oops.
> The user app will erase an entry in NVRAM.
> And we can get the message via /var/log.
Yes - the system can keep running after many types of OOPS - so the OOPS will be logged in
/var/log (directly, or by the app/daemon copying from pstore, or both).
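
To make the copy-then-erase step concrete, here is a minimal sketch of what such an app/daemon
might do on each pass. It is only an illustration, not an existing tool: the function name
archive_pstore and the destination directory are hypothetical, and it assumes pstore is mounted
as a filesystem (commonly /sys/fs/pstore) where removing a file tells the backend to reclaim
that record.

```python
import os
import shutil
from pathlib import Path


def archive_pstore(src_dir, dest_dir):
    """Copy every record file in src_dir to dest_dir, then delete the original.

    Returns the list of file names written to dest_dir.
    """
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for record in sorted(src.iterdir()):
        if not record.is_file():
            continue
        # Record names can repeat across boots, so never overwrite an
        # earlier saved log: pick a fresh name by appending a counter.
        target = dest / record.name
        n = 1
        while target.exists():
            target = dest / f"{record.name}.{n}"
            n += 1
        with open(record, "rb") as fin, open(target, "wb") as fout:
            shutil.copyfileobj(fin, fout)
            fout.flush()
            os.fsync(fout.fileno())  # make sure the copy is on disk first
        # Only now remove the source. On a real pstore mount (assumption)
        # this is what frees the slot in persistent storage.
        record.unlink()
        saved.append(target.name)
    return saved
```

A real daemon would call something like archive_pstore("/sys/fs/pstore", "/var/log/pstore")
early in boot and then periodically; both paths are assumptions. Note that this does nothing to
guarantee the copy finishes before the next OOPS, which is why holding more than one record in
the backend still matters.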
> - In case where system hangs up or panics due to the oops.
> Oops is the critical message and we don't need to care about subsequent events.
Yes - if the OOPS is instrumental in the path leading to the hang/panic - then the OOPS is the
first place to look for the root cause of the problem. But it will be a case by case analysis.
Sometimes the OOPS might be unconnected. If possible we'd like to log more information
to allow detective work to decide whether there is a connection. But as I mentioned above,
the returns from storing more information diminish quickly.
-Tony