Message-ID: <2aaddb69-2292-ff3f-94c7-0ab9dbc8e53c@marvell.com>
Date:   Sat, 9 May 2020 09:32:51 +0300
From:   Igor Russkikh <irusskikh@...vell.com>
To:     Luis Chamberlain <mcgrof@...nel.org>, <jeyu@...nel.org>
CC:     <akpm@...ux-foundation.org>, <arnd@...db.de>,
        <rostedt@...dmis.org>, <mingo@...hat.com>, <aquini@...hat.com>,
        <cai@....pw>, <dyoung@...hat.com>, <bhe@...hat.com>,
        <peterz@...radead.org>, <tglx@...utronix.de>,
        <gpiccoli@...onical.com>, <pmladek@...e.com>, <tiwai@...e.de>,
        <schlad@...e.de>, <andriy.shevchenko@...ux.intel.com>,
        <keescook@...omium.org>, <daniel.vetter@...ll.ch>,
        <will@...nel.org>, <mchehab+samsung@...nel.org>,
        <kvalo@...eaurora.org>, <davem@...emloft.net>,
        <netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        Ariel Elior <aelior@...vell.com>,
        GR-everest-linux-l2 <GR-everest-linux-l2@...vell.com>
Subject: Re: [EXT] [PATCH 09/15] qed: use new module_firmware_crashed()


> This makes use of the new module_firmware_crashed() to help
> annotate when firmware for device drivers crash. When firmware
> crashes devices can sometimes become unresponsive, and recovery
> sometimes requires a driver unload / reload and in the worst cases
> a reboot.
> 
> Using a taint flag allows us to annotate when this happens clearly.
> 
> Cc: Ariel Elior <aelior@...vell.com>
> Cc: GR-everest-linux-l2@...vell.com
> Signed-off-by: Luis Chamberlain <mcgrof@...nel.org>
> ---
>  drivers/net/ethernet/qlogic/qed/qed_debug.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/net/ethernet/qlogic/qed/qed_debug.c b/drivers/net/ethernet/qlogic/qed/qed_debug.c
> index f4eebaabb6d0..9cc6287b889b 100644
> --- a/drivers/net/ethernet/qlogic/qed/qed_debug.c
> +++ b/drivers/net/ethernet/qlogic/qed/qed_debug.c
> @@ -7854,6 +7854,7 @@ int qed_dbg_all_data(struct qed_dev *cdev, void *buffer)
>  						 REGDUMP_HEADER_SIZE,
>  						 &feature_size);
>  		if (!rc) {
> +			module_firmware_crashed();
>  			*(u32 *)((u8 *)buffer + offset) =
>  			    qed_calc_regdump_header(cdev, PROTECTION_OVERRIDE,
>  						    cur_engine,
> @@ -7869,6 +7870,7 @@ int qed_dbg_all_data(struct qed_dev *cdev, void *buffer)
>  		rc = qed_dbg_fw_asserts(cdev, (u8 *)buffer + offset +
>  					REGDUMP_HEADER_SIZE, &feature_size);
>  		if (!rc) {
> +			module_firmware_crashed();
>  			*(u32 *)((u8 *)buffer + offset) =
>  			    qed_calc_regdump_header(cdev, FW_ASSERTS,
>  						    cur_engine, feature_size,
> @@ -7906,6 +7908,7 @@ int qed_dbg_all_data(struct qed_dev *cdev, void *buffer)
>  		rc = qed_dbg_grc(cdev, (u8 *)buffer + offset +
>  				 REGDUMP_HEADER_SIZE, &feature_size);
>  		if (!rc) {
> +			module_firmware_crashed();
>  			*(u32 *)((u8 *)buffer + offset) =
>  			    qed_calc_regdump_header(cdev, GRC_DUMP,
>  						    cur_engine,


Hi Luis,

qed_dbg_all_data() is used to gather a debug dump from the device. Failures
inside it can happen for various reasons, but they normally do not indicate
a FW failure.

So I think it's not a good place to insert this call.

It's hard to find an exact good place to insert it in qed.

One more thing: AFAIU the taint flag is permanent once set on the kernel,
but our device, for example, can recover by itself from some FW crashes, so
such a crash would be transparent to the user.
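
To illustrate the distinction, here is a rough userspace sketch of one
possible policy. Every name in it (fw_event, handle_fw_event,
taint_firmware_crashed, firmware_crash_taint) is hypothetical; none of them
come from qed or from the proposed module_firmware_crashed() API, and the
choice to taint only on a fatal crash is an assumption, not a settled rule.

```c
#include <stdbool.h>

/* Hypothetical event kinds; not taken from the qed driver. */
enum fw_event {
	FW_EVENT_DUMP_FAILED,		/* collecting the debug dump failed */
	FW_EVENT_CRASH_RECOVERED,	/* FW crashed, device recovered by itself */
	FW_EVENT_CRASH_FATAL,		/* FW crashed, needs reload or reboot */
};

/* Stand-in for a sticky taint flag: once set, it is never cleared. */
static bool firmware_crash_taint;

static void taint_firmware_crashed(void)
{
	firmware_crash_taint = true;
}

/* Returns true if this event set the taint flag. */
static bool handle_fw_event(enum fw_event ev)
{
	switch (ev) {
	case FW_EVENT_CRASH_FATAL:
		/* Assumed policy: only an unrecoverable crash taints. */
		taint_firmware_crashed();
		return true;
	case FW_EVENT_DUMP_FAILED:
	case FW_EVENT_CRASH_RECOVERED:
	default:
		/* Not a FW crash, or transparent to the user. */
		return false;
	}
}
```

With such a split, a dump failure or a self-recovered crash leaves the flag
untouched, while a single fatal crash sets it for good.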

What's the logical purpose of module_firmware_crashed()? Does it mean a
fatal, unrecoverable error on the device?

Thanks,
  Igor
