Message-ID: <A5ED84D3BB3A384992CBB9C77DEDA4D40FB28206@USINDEM103.corp.hds.com>
Date: Fri, 20 Jul 2012 00:39:24 +0000
From: Seiji Aguchi <seiji.aguchi@....com>
To: "Luck, Tony" <tony.luck@...el.com>,
"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"mikew@...gle.com" <mikew@...gle.com>,
"dzickus@...hat.com" <dzickus@...hat.com>,
"Matthew Garrett (mjg@...hat.com)" <mjg@...hat.com>
CC: "dle-develop@...ts.sourceforge.net"
<dle-develop@...ts.sourceforge.net>,
Satoru Moriya <satoru.moriya@....com>
Subject: RE: [RFC][PATCH v2 2/3] Hold multiple logs
Thank you for describing this in detail.
> Yes - if the OOPS is instrumental in the path leading to the hang/panic - then the OOPS is the first place to look for the root cause
> of the problem. But it will be a case-by-case analysis.
> Sometimes the OOPS might be unconnected. If possible, we'd like to log more information to allow detective work to decide whether
> there is a connection. But as I mentioned above, there are severe limits to how much better things get by storing more information.
I understand why you think 3 or 4 logs are reasonable.
There are some cases where the 2nd or 3rd oops is the critical one.
On the other hand, I have some enterprise customers who are sensitive to software failures and specify panic_on_oops=1.
In that case, they don't need 3 or 4 logs; 2 logs are enough.
So the kernel parameter should behave as follows (a rough sketch of the range check is below the list).
Log_num=1
- For users who want to hold just one log.
Log_num=2
- For users who can handle multiple logs but only care about the 1st oops (because they specify panic_on_oops=1).
Log_num=3 or 4
- For users who also care about the 2nd or 3rd oops.
Log_num=5 or more
- Invalid value.
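
To illustrate the range above, here is a rough sketch of how the check could be enforced if it were done as a module parameter of the pstore backend. The parameter name log_num, the default of 2, and the use of module_param_cb() are only my assumptions for illustration, not the actual patch:

/* Sketch only: a hypothetical "log_num" parameter restricted to 1..4. */
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/moduleparam.h>

static unsigned int log_num = 2;        /* assumed default: hold 2 logs */

static int log_num_set(const char *val, const struct kernel_param *kp)
{
        unsigned int n;
        int ret = kstrtouint(val, 10, &n);

        if (ret)
                return ret;
        if (n < 1 || n > 4)             /* Log_num=5 or more is rejected */
                return -EINVAL;
        return param_set_uint(val, kp);
}

static const struct kernel_param_ops log_num_ops = {
        .set = log_num_set,
        .get = param_get_uint,
};
module_param_cb(log_num, &log_num_ops, &log_num, 0644);
MODULE_PARM_DESC(log_num, "Number of logs to hold in the backing store (1-4)");
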
If I have misunderstood anything, please let me know.
Seiji
> -----Original Message-----
> From: Luck, Tony [mailto:tony.luck@...el.com]
> Sent: Thursday, July 19, 2012 7:42 PM
> To: Seiji Aguchi; linux-doc@...r.kernel.org; linux-kernel@...r.kernel.org; mikew@...gle.com; dzickus@...hat.com; Matthew
> Garrett (mjg@...hat.com)
> Cc: dle-develop@...ts.sourceforge.net; Satoru Moriya
> Subject: RE: [RFC][PATCH v2 2/3] Hold multiple logs
>
> > If you are concerned about the multiple-OOPS case, I think a user app which logs from /dev/pstore to /var/log should be developed.
>
> Agreed - we need an app/daemon to do this.
>
> > Once it is developed, we don't need to care about the multiple-oops case, and the appropriate number is two.
>
> Only if you can guarantee that the app/daemon will run and save the first OOPS before the next occurs. Even if the system were
> running normally, this might be difficult to achieve... but in this case we know the system isn't running normally (it just OOPSed twice!).
>
> However - there is progressively less value in collecting additional consecutive OOPSes. Perhaps one is enough 90% or even 99% of
> the time. I'm naturally paranoid, so having two or three would make me feel happy that most of the remaining 10% or 1% of the cases
> were covered.
>
> > - In the case where the system is workable after the oops:
> > the user app will erase the entry in NVRAM,
> > and we can get the message via /var/log.
>
> Yes - the system can keep running after many types of OOPS - so the OOPS will be logged in /var/log (or by the app/daemon copying
> from pstore, or both).
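
To make this flow concrete, here is a minimal userspace sketch of the kind of copier we are discussing. It assumes pstore is mounted at /dev/pstore, that /var/log/pstore already exists as the destination, and that unlinking a pstore file is what erases the backing NVRAM record; those details (and the lack of a real daemon loop) are simplifications for illustration:

/* Sketch only: copy each pstore entry to /var/log, then erase it. */
#include <dirent.h>
#include <limits.h>
#include <stdio.h>

#define PSTORE_DIR "/dev/pstore"      /* assumed mount point */
#define DEST_DIR   "/var/log/pstore"  /* assumed destination (must exist) */

static int copy_file(const char *src, const char *dst)
{
        FILE *in = fopen(src, "r");
        FILE *out = in ? fopen(dst, "w") : NULL;
        char buf[4096];
        size_t n;

        if (!in || !out)
                goto fail;
        while ((n = fread(buf, 1, sizeof(buf), in)) > 0)
                if (fwrite(buf, 1, n, out) != n)
                        goto fail;
        fclose(in);
        return fclose(out);
fail:
        if (in)
                fclose(in);
        if (out)
                fclose(out);
        return -1;
}

int main(void)
{
        DIR *dir = opendir(PSTORE_DIR);
        struct dirent *de;

        if (!dir) {
                perror(PSTORE_DIR);
                return 1;
        }
        while ((de = readdir(dir)) != NULL) {
                char src[PATH_MAX], dst[PATH_MAX];

                if (de->d_name[0] == '.')
                        continue;       /* skip "." and ".." (and dotfiles) */
                snprintf(src, sizeof(src), "%s/%s", PSTORE_DIR, de->d_name);
                snprintf(dst, sizeof(dst), "%s/%s", DEST_DIR, de->d_name);
                if (copy_file(src, dst) == 0)
                        remove(src);    /* unlink erases the NVRAM record */
        }
        closedir(dir);
        return 0;
}

In practice this would run from a boot script or a daemon watching the pstore mount, but the copy-then-unlink order is the essential part: the NVRAM slot is only freed once the log is safely in /var/log.
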
>
> > - In the case where the system hangs or panics due to the oops:
> > the oops is the critical message, and we don't need to care about subsequent events.
>
> Yes - if the OOPS is instrumental in the path leading to the hang/panic - then the OOPS is the first place to look for the root cause
> of the problem. But it will be a case-by-case analysis.
> Sometimes the OOPS might be unconnected. If possible, we'd like to log more information to allow detective work to decide whether
> there is a connection. But as I mentioned above, there are severe limits to how much better things get by storing more information.
>
> -Tony