lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 27 May 2020 09:07:09 +0530
From:   Vasundhara Volam <vasundhara-v.volam@...adcom.com>
To:     Jiri Pirko <jiri@...nulli.us>
Cc:     David Miller <davem@...emloft.net>,
        Netdev <netdev@...r.kernel.org>, Jiri Pirko <jiri@...lanox.com>,
        Michael Chan <michael.chan@...adcom.com>
Subject: Re: [PATCH v2 net-next 1/4] devlink: Add new "allow_fw_live_reset"
 generic device parameter.

On Tue, May 26, 2020 at 7:53 PM Vasundhara Volam
<vasundhara-v.volam@...adcom.com> wrote:
>
> On Tue, May 26, 2020 at 7:10 PM Jiri Pirko <jiri@...nulli.us> wrote:
> >
> > Tue, May 26, 2020 at 08:42:28AM CEST, vasundhara-v.volam@...adcom.com wrote:
> > >On Tue, May 26, 2020 at 10:17 AM Jiri Pirko <jiri@...nulli.us> wrote:
> > >>
> > >> Tue, May 26, 2020 at 06:28:59AM CEST, vasundhara-v.volam@...adcom.com wrote:
> > >> >On Mon, May 25, 2020 at 10:56 PM Jiri Pirko <jiri@...nulli.us> wrote:
> > >> >>
> > >> >> Sun, May 24, 2020 at 08:29:56AM CEST, vasundhara-v.volam@...adcom.com wrote:
> > >> >> >On Sun, May 24, 2020 at 10:23 AM Jiri Pirko <jiri@...nulli.us> wrote:
> > >> >> >>
> > >> >> >> Sat, May 23, 2020 at 08:08:22AM CEST, vasundhara-v.volam@...adcom.com wrote:
> > >> >> >> >Add a new "allow_fw_live_reset" generic device bool parameter. When
> > >> >> >> >parameter is set, user is allowed to reset the firmware in real time.
> > >> >> >> >
> > >> >> >> >This parameter is employed to communicate user consent or dissent for
> > >> >> >> >the live reset to happen. A separate command triggers the actual live
> > >> >> >> >reset.
> > >> >> >> >
> > >> >> >> >Cc: Jiri Pirko <jiri@...lanox.com>
> > >> >> >> >Signed-off-by: Vasundhara Volam <vasundhara-v.volam@...adcom.com>
> > >> >> >> >Signed-off-by: Michael Chan <michael.chan@...adcom.com>
> > >> >> >> >---
> > >> >> >> >v2: Rename param name to "allow_fw_live_reset" from
> > >> >> >> >"enable_hot_fw_reset".
> > >> >> >> >Update documentation for the param in devlink-params.rst file.
> > >> >> >> >---
> > >> >> >> > Documentation/networking/devlink/devlink-params.rst | 6 ++++++
> > >> >> >> > include/net/devlink.h                               | 4 ++++
> > >> >> >> > net/core/devlink.c                                  | 5 +++++
> > >> >> >> > 3 files changed, 15 insertions(+)
> > >> >> >> >
> > >> >> >> >diff --git a/Documentation/networking/devlink/devlink-params.rst b/Documentation/networking/devlink/devlink-params.rst
> > >> >> >> >index d075fd0..ad54dfb 100644
> > >> >> >> >--- a/Documentation/networking/devlink/devlink-params.rst
> > >> >> >> >+++ b/Documentation/networking/devlink/devlink-params.rst
> > >> >> >> >@@ -108,3 +108,9 @@ own name.
> > >> >> >> >    * - ``region_snapshot_enable``
> > >> >> >> >      - Boolean
> > >> >> >> >      - Enable capture of ``devlink-region`` snapshots.
> > >> >> >> >+   * - ``allow_fw_live_reset``
> > >> >> >> >+     - Boolean
> > >> >> >> >+     - Firmware live reset allows users to reset the firmware in real time.
> > >> >> >> >+       For example, after firmware upgrade, this feature can immediately reset
> > >> >> >> >+       to run the new firmware without reloading the driver or rebooting the
> > >> >> >>
> > >> >> >> This does not tell me anything about the reset being done on another
> > >> >> >> host. You need to emhasize that, in the name of the param too.
> > >> >> >I am not sure if I completely understand your query.
> > >> >> >
> > >> >> >Reset is actually initiated by one of the PF/host of the device, which
> > >> >> >resets the entire same device.
> > >> >> >
> > >> >> >Reset is not initiated by any other remote device/host.
> > >> >>
> > >> >> Well, in case of multihost system, it might be, right?
> > >> >>
> > >> >In case of multi-host system also, it is one of the host that triggers
> > >> >the reset, which resets the entire same device. I don't think this is
> > >> >remote.
> > >> >
> > >> >As the parameter is a device parameter, it is applicable to the entire
> > >> >device. When a user initiates the reset from any of the host in case
> > >> >of multi-host and any of the PF in case of stand-alone or smartNIC
> > >> >device, the entire device goes for a reset.
> > >> >
> > >> >I will be expanding the description to the following to make it more clear.
> > >> >
> > >> >------------------------
> > >> >- Firmware live reset allows users to reset the firmware in real time.
> > >> >For example, after firmware upgrade, this feature can immediately
> > >> >reset to run the new firmware without reloading the driver or
> > >> >rebooting the system.
> > >> >When a user initiates the reset from any of the host (in case of
> > >> >multi-host system) / PF (in case of stand-alone or smartNIC device),
> > >> >the entire device goes for a reset when the parameter is enabled.
> > >>
> > >> Sorry, this is still not clear. I think that you are mixing up two
> > >> different things:
> > >> 1) option of devlink reload to indicate that user is interested in "live
> > >>    reset" of firmware without reloading driver
> > >
> > >This is the option we are trying to add. If a user is interested in
> > >"live reset", he needs to enable the parameter to enable it in device
> > >capabilities, which is achieved by permanent configuration mode. When
> > >capability is enabled in the device, new firmware which is aware will
> > >allocate the resources and exposes the capability to host drivers.
> > >
> > >But firmware allows the "live reset" only when all the loaded drivers
> > >are aware of/supports the capability. For example, if any of the host
> > >is loaded with an old driver, "live reset" is not allowed until the
> > >driver is upgraded or unloaded. or if the host driver turns it off,
> > >then also "live reset" is not allowed.
> > >
> > >In case of runtime parameter cmode, if any of the host turns off the
> > >capability in the host driver, "live reset" is not allowed until the
> > >driver is unloaded or the user enables it again.
> > >
> > >To make it clear, I can add two parameters.
> > >
> > >1. enable_fw_live_reset - To indicate that the user is interested in
> > >"live reset". This will be a generic param.
> >
> > As I wrote above, I believe this should be an option
> > to "devlink dev reload", not a param.
> I think you are still confused with enabling feature in NVRAM
> configuration of the device and command to trigger reset. This param
> will enable the feature in the device NVRAM configuration and does not
> trigger the actual reset.
>
> Only when the param is set, feature will be enabled in the device and
> firmware supports the "live reset". When the param is disabled,
> firmware cannot support "live reset" and user needs to do PCIe reset
> after flashing the firmware for it to take effect..
>
> Once feature is enabled in NVRAM configuration, it will be persistent
> across reboots.
>
> User still needs to use "devlink dev reload" command to do the "live reset".
Here is a sample sequence of commands to do a "live reset" to get some
clear idea.
Note that I am providing the examples based on the current patchset.

1. FW live reset is disabled in the device/adapter. Here adapter has 2
physical ports.

$ devlink dev
pci/0000:3b:00.0
pci/0000:3b:00.1
pci/0000:af:00.0
$ devlink dev param show pci/0000:3b:00.0 name allow_fw_live_reset
pci/0000:3b:00.0:
  name allow_fw_live_reset type generic
    values:
      cmode runtime value false
      cmode permanent value false
$ devlink dev param show pci/0000:3b:00.1 name allow_fw_live_reset
pci/0000:3b:00.1:
  name allow_fw_live_reset type generic
    values:
      cmode runtime value false
      cmode permanent value false

2. If a user issues "ethtool --reset p1p1 all", the device cannot
perform "live reset" as capability is not enabled. User needs to do a
driver reload, for firmware to undergo reset.

$ ethtool --reset p1p1 all
ETHTOOL_RESET 0xffffffff
Components reset:     0xff0000
Components not reset: 0xff00ffff
$ dmesg
[  198.745822] bnxt_en 0000:3b:00.0 p1p1: Firmware reset request successful.
[  198.745836] bnxt_en 0000:3b:00.0 p1p1: Reload driver to complete reset

3. Now enable the capability in the device and reboot for device to
enable the capability. Firmware does not get reset just by setting the
param to true.

$ devlink dev param set pci/0000:3b:00.1 name allow_fw_live_reset
value true cmode permanent

4. After reboot, values of param.

$ devlink dev param show pci/0000:3b:00.1 name allow_fw_live_reset
pci/0000:3b:00.1:
  name allow_fw_live_reset type generic
    values:
      cmode runtime value true
      cmode permanent value true
$ devlink dev param show pci/0000:3b:00.0 name allow_fw_live_reset
pci/0000:3b:00.0:
  name allow_fw_live_reset type generic
    values:
      cmode runtime value true
      cmode permanent value true

5. Now issue the "ethtool --reset p1p1 all" and device will undergo
the "live reset". Reloading the driver is not required.

$ ethtool --reset p1p1 all
ETHTOOL_RESET 0xffffffff
Components reset:     0xff0000
Components not reset: 0xff00ffff
$ dmesg
[  117.432013] bnxt_en 0000:3b:00.0 p1p1: Firmware non-fatal reset
event received, max wait time 4200 msec
[  117.432015] bnxt_en 0000:3b:00.0 p1p1: Firmware reset request successful.
[  117.432032] bnxt_en 0000:3b:00.1 p1p2: Firmware non-fatal reset
event received, max wait time 4200 msec
$ devlink health show pci/0000:3b:00.0 reporter fw_reset
pci/0000:3b:00.0:
  reporter fw_reset
    state healthy error 1 recover 1 grace_period 0 auto_recover true

6. If one of the host/PF turns off runtime param to false, "ethtool
--reset p1p1 all" behaves similar to step 2, until it turns it back
on.

$ devlink dev param set pci/0000:3b:00.1 name allow_fw_live_reset
value false cmode runtime
$ ethtool --reset p1p1 all
ETHTOOL_RESET 0xffffffff
Components reset:     0xff0000
Components not reset: 0xff00ffff
$ dmesg
[  327.610814] bnxt_en 0000:3b:00.0 p1p1: Firmware reset request successful.
[  327.610828] bnxt_en 0000:3b:00.0 p1p1: Reload driver to complete reset

Thanks.
> >
> >
> > >
> > >2. allow_fw_live_reset - To indicate, if any of the host/PF turns it
> > >off, "live reset" is not allowed. This serves the purpose of what we
> > >are trying to add in runtime cmode.
> >
> > Yeah.
> And this param will enable the feature in the driver for driver to
> allow the firmware to go for "live reset", where as above param will
> enable the feature in NVRAM configuration of the device.
> >
> > >Do you want me to keep it as a driver-specific param?
> >
> > There is nothing driver-specific about this.
> okay.
> >
> >
> > >
> > >Please let me know if this is clear and makes less confusion.
> > >
> > >Thanks,
> > >Vasundhara
> > >
> > >> 2) devlink param that would indicate "I am okay if someone else (not by
> > >>    my devlink instance) resets my firmware".
> > >>
> > >> Could you please split?
> > >>
> > >>
> > >> >------------------------
> > >> >
> > >> >Thanks,
> > >> >Vasundhara
> > >> >>
> > >> >> >
> > >> >> >Thanks,
> > >> >> >Vasundhara
> > >> >> >>
> > >> >> >>
> > >> >> >>
> > >> >> >> >+       system.
> > >> >> >> >diff --git a/include/net/devlink.h b/include/net/devlink.h
> > >> >> >> >index 8ffc1b5c..488b61c 100644
> > >> >> >> >--- a/include/net/devlink.h
> > >> >> >> >+++ b/include/net/devlink.h
> > >> >> >> >@@ -406,6 +406,7 @@ enum devlink_param_generic_id {
> > >> >> >> >       DEVLINK_PARAM_GENERIC_ID_FW_LOAD_POLICY,
> > >> >> >> >       DEVLINK_PARAM_GENERIC_ID_RESET_DEV_ON_DRV_PROBE,
> > >> >> >> >       DEVLINK_PARAM_GENERIC_ID_ENABLE_ROCE,
> > >> >> >> >+      DEVLINK_PARAM_GENERIC_ID_ALLOW_FW_LIVE_RESET,
> > >> >> >> >
> > >> >> >> >       /* add new param generic ids above here*/
> > >> >> >> >       __DEVLINK_PARAM_GENERIC_ID_MAX,
> > >> >> >> >@@ -443,6 +444,9 @@ enum devlink_param_generic_id {
> > >> >> >> > #define DEVLINK_PARAM_GENERIC_ENABLE_ROCE_NAME "enable_roce"
> > >> >> >> > #define DEVLINK_PARAM_GENERIC_ENABLE_ROCE_TYPE DEVLINK_PARAM_TYPE_BOOL
> > >> >> >> >
> > >> >> >> >+#define DEVLINK_PARAM_GENERIC_ALLOW_FW_LIVE_RESET_NAME "allow_fw_live_reset"
> > >> >> >> >+#define DEVLINK_PARAM_GENERIC_ALLOW_FW_LIVE_RESET_TYPE DEVLINK_PARAM_TYPE_BOOL
> > >> >> >> >+
> > >> >> >> > #define DEVLINK_PARAM_GENERIC(_id, _cmodes, _get, _set, _validate)    \
> > >> >> >> > {                                                                     \
> > >> >> >> >       .id = DEVLINK_PARAM_GENERIC_ID_##_id,                           \
> > >> >> >> >diff --git a/net/core/devlink.c b/net/core/devlink.c
> > >> >> >> >index 7b76e5f..8544f23 100644
> > >> >> >> >--- a/net/core/devlink.c
> > >> >> >> >+++ b/net/core/devlink.c
> > >> >> >> >@@ -3011,6 +3011,11 @@ static int devlink_nl_cmd_flash_update(struct sk_buff *skb,
> > >> >> >> >               .name = DEVLINK_PARAM_GENERIC_ENABLE_ROCE_NAME,
> > >> >> >> >               .type = DEVLINK_PARAM_GENERIC_ENABLE_ROCE_TYPE,
> > >> >> >> >       },
> > >> >> >> >+      {
> > >> >> >> >+              .id = DEVLINK_PARAM_GENERIC_ID_ALLOW_FW_LIVE_RESET,
> > >> >> >> >+              .name = DEVLINK_PARAM_GENERIC_ALLOW_FW_LIVE_RESET_NAME,
> > >> >> >> >+              .type = DEVLINK_PARAM_GENERIC_ALLOW_FW_LIVE_RESET_TYPE,
> > >> >> >> >+      },
> > >> >> >> > };
> > >> >> >> >
> > >> >> >> > static int devlink_param_generic_verify(const struct devlink_param *param)
> > >> >> >> >--
> > >> >> >> >1.8.3.1
> > >> >> >> >

Powered by blists - more mailing lists