Message-ID: <CAHp75VfOvABsQyxdy9j-On6pTunM1+uisoWQOmoNa7wLWJ+CSw@mail.gmail.com>
Date: Wed, 20 May 2020 11:32:32 +0300
From: Andy Shevchenko <andy.shevchenko@...il.com>
To: Emmanuel Grumbach <egrumbach@...il.com>
Cc: Brian Norris <briannorris@...omium.org>,
Luis Chamberlain <mcgrof@...nel.org>,
Johannes Berg <johannes@...solutions.net>,
linux-wireless <linux-wireless@...r.kernel.org>,
aquini@...hat.com, "Peter Zijlstra (Intel)" <peterz@...radead.org>,
Daniel Vetter <daniel.vetter@...ll.ch>,
Mauro Carvalho Chehab <mchehab+samsung@...nel.org>,
Will Deacon <will@...nel.org>, Baoquan He <bhe@...hat.com>,
ath10k@...ts.infradead.org, Takashi Iwai <tiwai@...e.de>,
Ingo Molnar <mingo@...hat.com>, Dave Young <dyoung@...hat.com>,
Petr Mladek <pmladek@...e.com>,
Kees Cook <keescook@...omium.org>,
Arnd Bergmann <arnd@...db.de>, gpiccoli@...onical.com,
Steven Rostedt <rostedt@...dmis.org>, cai@....pw,
Thomas Gleixner <tglx@...utronix.de>,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
Kalle Valo <kvalo@...eaurora.org>,
"<netdev@...r.kernel.org>" <netdev@...r.kernel.org>,
schlad@...e.de, Linux Kernel <linux-kernel@...r.kernel.org>,
Jessica Yu <jeyu@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
"David S. Miller" <davem@...emloft.net>
Subject: Re: [PATCH v2 12/15] ath10k: use new module_firmware_crashed()
On Wed, May 20, 2020 at 8:40 AM Emmanuel Grumbach <egrumbach@...il.com> wrote:
> Since I have been involved quite a bit in the firmware debugging
> features in iwlwifi, I think I can give a few insights here.
>
> But before this, we need to understand that there are several sources of issues:
> 1) the firmware may crash but the bus is still alive, you can still
> use the bus to get the crash data
> 2) the bus is dead, when that happens, the firmware might even be in a
> good condition, but since the bus is dead, you stop getting any
> information about the firmware, and then, at some point, you get to
> the conclusion that the firmware is dead. You can't get the crash data
> that resides on the other side of the bus (you may have gathered data
> in the DRAM directly, but that's a different thing), and you don't
> have much recovery to do besides re-starting the PCI enumeration.
>
> At Intel, we have seen both unfortunately. The bus issues are the ones
> that are trickier obviously. Trickier to detect (because you just get
> garbage from any request you issue on the bus), and trickier to
> handle. One can argue that the kernel should *not* handle those and
> should leave them in userspace hands. I guess it all depends on what
> component you ship to your customer and what your customer asks from you :).
Or the two best approaches:
1) get rid of firmware completely;
2) make it OSS (like SOF).
I think either of these is the right thing to do in the long term.
How many firmwares does the average computer have? 50? 100? Every one of
them is a burden and a PITA.
--
With Best Regards,
Andy Shevchenko