[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200522052834.GA11244@42.do-not-panic.com>
Date: Fri, 22 May 2020 05:28:34 +0000
From: Luis Chamberlain <mcgrof@...nel.org>
To: Alex Elder <elder@...e.org>
Cc: jeyu@...nel.org, akpm@...ux-foundation.org, arnd@...db.de,
rostedt@...dmis.org, mingo@...hat.com, aquini@...hat.com,
cai@....pw, dyoung@...hat.com, bhe@...hat.com,
peterz@...radead.org, tglx@...utronix.de, gpiccoli@...onical.com,
pmladek@...e.com, tiwai@...e.de, schlad@...e.de,
andriy.shevchenko@...ux.intel.com, keescook@...omium.org,
daniel.vetter@...ll.ch, will@...nel.org,
mchehab+samsung@...nel.org, kvalo@...eaurora.org,
davem@...emloft.net, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, Alex Elder <elder@...nel.org>
Subject: Re: [PATCH v2 10/15] soc: qcom: ipa: use new
module_firmware_crashed()
On Tue, May 19, 2020 at 05:34:13PM -0500, Alex Elder wrote:
> On 5/15/20 4:28 PM, Luis Chamberlain wrote:
> > This makes use of the new module_firmware_crashed() to help
> > annotate when firmware for device drivers crash. When firmware
> > crashes devices can sometimes become unresponsive, and recovery
> > sometimes requires a driver unload / reload and in the worst cases
> > a reboot.
> >
> > Using a taint flag allows us to annotate when this happens clearly.
>
> I don't fully understand what this is meant to do, so I can't
> fully assess whether it's the right thing to do.
It is meant to taint the kernel to ensure it is clear that something
critically bad has happened with the device firmware, it crashed, and
recovery may or may not happen, we are not 100% certain.
>
> But in this particular place in the IPA code, the *modem* has
> crashed. And the IPA driver is not responsible for modem
> firmware, remoteproc is.
Oi vei. So the device it depends on has crashed.
> The IPA driver *can* be responsible for loading some other
> firmware, but even in that case, it only happens on initial
> boot, and it's basically assumed to never crash.
OK is this an issue which we can recover from? If for the slightest bit
this can affect users it is something we should inform them over.
This patch set is missing uevents for these issues, but I just added
support for this.
> So regardless of whether this module_firmware_crashed() call is
> appropriate in some places, I believe it should not be used here.
OK thanks. Can the user be affected by this crash? If so how? Can
we recover ? Is that always guaranteed?
Luis
Powered by blists - more mailing lists