Message-ID: <58639bf9-b67c-0cbb-d4c0-69c4e400daff@candelatech.com>
Date: Thu, 28 May 2020 08:04:50 -0700
From: Ben Greear <greearb@...delatech.com>
To: Luis Chamberlain <mcgrof@...nel.org>,
Jakub Kicinski <kuba@...nel.org>
Cc: jeyu@...nel.org, davem@...emloft.net, michael.chan@...adcom.com,
dchickles@...vell.com, sburla@...vell.com, fmanlunas@...vell.com,
aelior@...vell.com, GR-everest-linux-l2@...vell.com,
kvalo@...eaurora.org, johannes@...solutions.net,
akpm@...ux-foundation.org, arnd@...db.de, rostedt@...dmis.org,
mingo@...hat.com, aquini@...hat.com, cai@....pw, dyoung@...hat.com,
bhe@...hat.com, peterz@...radead.org, tglx@...utronix.de,
gpiccoli@...onical.com, pmladek@...e.com, tiwai@...e.de,
schlad@...e.de, andriy.shevchenko@...ux.intel.com,
derosier@...il.com, keescook@...omium.org, daniel.vetter@...ll.ch,
will@...nel.org, mchehab+samsung@...nel.org, vkoul@...nel.org,
mchehab+huawei@...nel.org, robh@...nel.org, mhiramat@...nel.org,
sfr@...b.auug.org.au, linux@...inikbrodowski.net,
glider@...gle.com, paulmck@...nel.org, elver@...gle.com,
bauerman@...ux.ibm.com, yamada.masahiro@...ionext.com,
samitolvanen@...gle.com, yzaikin@...gle.com, dvyukov@...gle.com,
rdunlap@...radead.org, corbet@....net, dianders@...omium.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-doc@...r.kernel.org, linux-wireless@...r.kernel.org
Subject: Re: [PATCH v3 0/8] kernel: taint when the driver firmware crashes
On 05/28/2020 07:27 AM, Luis Chamberlain wrote:
> On Wed, May 27, 2020 at 02:36:42PM -0700, Jakub Kicinski wrote:
>> On Wed, 27 May 2020 03:19:18 +0000 Luis Chamberlain wrote:
>>> I read your patch, and granted, I will accept I was under the incorrect
>>> assumption that this can only be used by networking devices. However,
>>> the devlink approach gives userspace the ability, with the iproute2
>>> devlink util, to query a device's health, onto which we can peg
>>> firmware health. But *this* patch series is not about health status and
>>> letting users query it; it's about a *critical* situation which has come up
>>> with firmware requiring me to reboot my system, and the lack of *any*
>>> infrastructure in the kernel today to inform userspace about it.
>>>
>>> So say we use netlink to report a critical health situation, how are we
>>> informing userspace with your patch series about requiring a reboot?
>>
>> One of the main features of netlink is its pub/sub model of notifications.
>>
>> Whatever you imagine listening to your uevent can listen to
>> devlink-health notifications via devlink.
>>
>> In fact I've shown this off in the RFC patches I sent to you, see
>> the devlink mon health command being used.
>
> Yes, but I looked at iproute2 devlink and it seems I made an incorrect
> assumption that this can only be used for a network device rather than
> a struct device.
>
> I'll take a second look.
Hello Jakub,

I'm thinking about something similar to what Luis is proposing, but in
my case I'd like to report only when the driver knows the hardware is gone
and cannot be recovered, as when this is reported:

[ 2548.851832] WARNING: CPU: 3 PID: 98 at backports-4.19.98-1/net/mac80211/util.c:2040 ieee80211_reconfig+0x98/0xb64 [mac80211]
[ 2548.856020] Hardware became unavailable during restart.

I'd like to be able to tie this into a watchdog program so that the system
can be rebooted automatically soon after this event is seen, for instance.
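
As a rough illustration of the watchdog idea, assuming the only signal
available today is the kernel log line shown above, a user-space sketch might
look like the following. The names HW_GONE_MARKER, parse_hw_gone and watch are
hypothetical, and scraping the log is only a stand-in for the kind of proper
notification being discussed in this thread, not a proposed interface.

```python
import re
from typing import Callable, Iterable, Optional

# Message text copied from the mac80211 warning quoted above.
HW_GONE_MARKER = "Hardware became unavailable during restart."

# Matches a dmesg-style "[ 2548.856020] " timestamp prefix.
_TIMESTAMP_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*")


def parse_hw_gone(line: str) -> Optional[float]:
    """Return the kernel timestamp if the line reports lost hardware.

    Returns None for unrelated lines, and 0.0 when the marker is present
    but the line carries no timestamp prefix.
    """
    if HW_GONE_MARKER not in line:
        return None
    match = _TIMESTAMP_RE.match(line)
    return float(match.group(1)) if match else 0.0


def watch(lines: Iterable[str], on_failure: Callable[[float], None]) -> bool:
    """Scan log lines and call on_failure once, at the first match.

    Returns True if the marker was seen, False if the input ended first.
    What on_failure does (log, notify, request a reboot) is left to the
    caller; this sketch deliberately does not reboot anything itself.
    """
    for line in lines:
        timestamp = parse_hw_gone(line)
        if timestamp is not None:
            on_failure(timestamp)
            return True
    return False
```

A caller could feed watch() the output of something like "dmesg --follow" and
pass a callback that triggers the reboot. Matching on message text is fragile,
which is part of why a real notification from the driver would be preferable.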

Could you post your devlink RFC patches somewhere public?

Thanks,
Ben

--
Ben Greear <greearb@...delatech.com>
Candela Technologies Inc http://www.candelatech.com