Message-ID: <5C4C569E8A4B9B42A84A977CF070A35B2C56C9B8AD@USINDEVS01.corp.hds.com>
Date: Tue, 27 Sep 2011 15:46:08 -0400
From: Seiji Aguchi <seiji.aguchi@....com>
To: "Luck, Tony" <tony.luck@...el.com>, Don Zickus <dzickus@...hat.com>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Vivek Goyal <vgoyal@...hat.com>,
Matthew Garrett <mjg@...hat.com>,
"Chen, Gong" <gong.chen@...el.com>,
Andrew Morton <akpm@...ux-foundation.org>,
"dle-develop@...ts.sourceforge.net"
<dle-develop@...ts.sourceforge.net>,
Satoru Moriya <satoru.moriya@....com>
Subject: RE: [RFC][PATCH -next] pstore: replace spin_lock with
spin_trylock_irqsave in panic path
Hi,
>Yes we care - saving panic data is most likely the single most important
>thing that pstore does. I just have severe doubts that it will actually
>save anything useful if we just blindly continue if we can't get the lock.
I agree with Tony. We may not get useful information if pstore just blindly continues
while other cpus are running.
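
For reference, a rough userspace model of the "try once, then return" behaviour being weighed here. This is only a sketch: every name in it (sketch_store, sketch_dump, the SKETCH_* reasons) is hypothetical, and a C11 atomic_flag stands in for the kernel spinlock. It is not the pstore code and makes no claim about what a real backend does.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical names throughout; this is a userspace model, not pstore code. */
enum sketch_reason { SKETCH_OOPS, SKETCH_PANIC };

struct sketch_store {
	atomic_flag lock;	/* stands in for psinfo->buf_lock */
	char buf[64];		/* stands in for the backend's record buffer */
};

static void sketch_init(struct sketch_store *s)
{
	atomic_flag_clear(&s->lock);
	s->buf[0] = '\0';
}

/* Returns true if the record was written, false if the dump was skipped. */
static bool sketch_dump(struct sketch_store *s, enum sketch_reason why,
			const char *msg)
{
	if (why == SKETCH_PANIC) {
		/* Panic path: try once and return rather than spin forever. */
		if (atomic_flag_test_and_set(&s->lock))
			return false;
	} else {
		/* Normal path: spin until the lock is free. */
		while (atomic_flag_test_and_set(&s->lock))
			;
	}
	snprintf(s->buf, sizeof(s->buf), "%s", msg);
	atomic_flag_clear(&s->lock);
	return true;
}
```

In this model, a panic that arrives while the lock is held loses the record but cannot hang. Continuing without the lock would avoid losing the record only if nothing else is touching the buffer at that moment.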
>Is this patch based on a real-life case of a system deadlocking? I'd
>like to know if we are just talking around the theoretical case that
>the lock may be held at panic time - or something that has actually been
>seen in real life.
This patch is _not_ based on a real-life case. I would like to avoid a potential deadlock.
If Don disagrees with my "return" code, I have another idea: move pstore_dump() after smp_send_stop().
smp_send_stop() stops the other CPUs by sending an IPI,
so pstore can continue reliably and get useful information by just busting the spinlock.
Whether it actually accesses NVRAM/storage depends on each backend driver.
Idea
====
panic()
 |- smp_send_stop() (send IPI to the other CPUs)
 |- bust spin_lock(&psinfo->buf_lock)
 |- call pstore_dump()
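
The ordering above can be sketched in userspace as follows. This is a minimal model under stated assumptions, not kernel code: the model_* functions and struct model_pstore are hypothetical stand-ins, a C11 atomic_flag plays the role of psinfo->buf_lock, and "stopping the other CPUs" is reduced to setting a flag. It only illustrates why busting the lock is defensible once nothing else can be running.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in; only buf_lock is a name from the thread. */
struct model_pstore {
	atomic_flag buf_lock;	/* models psinfo->buf_lock */
	char buf[64];		/* models the record buffer */
	bool others_stopped;	/* set once the other CPUs are halted */
};

static void model_init(struct model_pstore *ps)
{
	atomic_flag_clear(&ps->buf_lock);
	ps->buf[0] = '\0';
	ps->others_stopped = false;
}

/* Models smp_send_stop(): after this, no other CPU can hold or take the lock. */
static void model_smp_send_stop(struct model_pstore *ps)
{
	ps->others_stopped = true;
}

/* Busting the lock is only sensible once the other CPUs are stopped. */
static void model_bust_lock(struct model_pstore *ps)
{
	assert(ps->others_stopped);
	atomic_flag_clear(&ps->buf_lock);
}

/* Models pstore_dump(): takes the lock, writes the record, releases it. */
static bool model_pstore_dump(struct model_pstore *ps, const char *msg)
{
	if (atomic_flag_test_and_set(&ps->buf_lock))
		return false;	/* lock still held by someone else */
	snprintf(ps->buf, sizeof(ps->buf), "%s", msg);
	atomic_flag_clear(&ps->buf_lock);
	return true;
}

/* The proposed order: stop the other CPUs, bust the lock, then dump. */
static bool model_panic(struct model_pstore *ps, const char *msg)
{
	model_smp_send_stop(ps);
	model_bust_lock(ps);
	return model_pstore_dump(ps, msg);
}
```

Note that the model says nothing about the state of the backend itself: if a CPU was stopped halfway through a backend operation, busting the lock does not put the hardware back into a known state.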
Seiji
>-----Original Message-----
>From: Luck, Tony [mailto:tony.luck@...el.com]
>Sent: Tuesday, September 27, 2011 3:03 PM
>To: Don Zickus
>Cc: Seiji Aguchi; linux-kernel@...r.kernel.org; Vivek Goyal; Matthew Garrett; Chen, Gong; Andrew Morton;
>dle-develop@...ts.sourceforge.net; Satoru Moriya
>Subject: RE: [RFC][PATCH -next] pstore: replace spin_lock with spin_trylock_irqsave in panic path
>
>> Ok. Do we care? I assumed the panic data would be more
>> relevant/interesting than whatever pstore was doing before (like loading
>> previous log files).
>
>Yes we care - saving panic data is most likely the single most important
>thing that pstore does. I just have severe doubts that it will actually
>save anything useful if we just blindly continue if we can't get the lock.
>
>What actually happens next will be dependent on the back-end. For
>the state machine in ERST, one possible outcome is a hang. For many
>people a hang is considered worse than a panic.
>
>> I assumed we are just overwriting the buffer with the current data, so
>> unless the other cpu is chugging along while this cpu is in panic, the new
>> data shouldn't get corrupted, no?
>
>I really have no idea what *will* happen. Lots of things are possible, only
>some of them are desirable.
>
>Is this patch based on a real-life case of a system deadlocking? I'd
>like to know if we are just talking around the theoretical case that
>the lock may be held at panic time - or something that has actually been
>seen in real life.
>
>-Tony