Message-ID: <20200527131401.2e269ab8@kicinski-fedora-PC1C0HJN.hsd1.ca.comcast.net>
Date: Wed, 27 May 2020 13:14:01 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Vasundhara Volam <vasundhara-v.volam@...adcom.com>
Cc: Jiri Pirko <jiri@...nulli.us>, David Miller <davem@...emloft.net>,
Netdev <netdev@...r.kernel.org>, Jiri Pirko <jiri@...lanox.com>,
Michael Chan <michael.chan@...adcom.com>
Subject: Re: [PATCH v2 net-next 1/4] devlink: Add new "allow_fw_live_reset"
generic device parameter.
On Wed, 27 May 2020 09:07:09 +0530 Vasundhara Volam wrote:
> Here is a sample sequence of commands to do a "live reset" to get some
> clear idea.
> Note that I am providing the examples based on the current patchset.
>
> 1. FW live reset is disabled in the device/adapter. Here adapter has 2
> physical ports.
>
> $ devlink dev
> pci/0000:3b:00.0
> pci/0000:3b:00.1
> pci/0000:af:00.0
> $ devlink dev param show pci/0000:3b:00.0 name allow_fw_live_reset
> pci/0000:3b:00.0:
> name allow_fw_live_reset type generic
> values:
> cmode runtime value false
> cmode permanent value false
> $ devlink dev param show pci/0000:3b:00.1 name allow_fw_live_reset
> pci/0000:3b:00.1:
> name allow_fw_live_reset type generic
> values:
> cmode runtime value false
> cmode permanent value false
What does the permanent value mean here? If, after a reboot, the driver
is too old to change this, is the reset still allowed?
> 2. If a user issues "ethtool --reset p1p1 all", the device cannot
> perform "live reset" as capability is not enabled.
>
> User needs to do a driver reload, for firmware to undergo reset.
Why does driver reload have anything to do with resetting a potentially
MH device?
> $ ethtool --reset p1p1 all
Reset probably needs to be done via devlink. In any case you need a new
reset level for resetting MH devices and smartNICs, because the current
reset mask covers the port-local and host-local cases, not any form of MH.
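To make the point about the mask concrete, here is a rough sketch of how
the 32-bit ethtool reset word splits into a "dedicated" low half and a
"shared" high half. The bit values follow the ETH_RESET_* definitions in
include/uapi/linux/ethtool.h as I remember them, so check them against
your headers; decode_reset_mask() is a made-up helper for illustration,
not something ethtool or the kernel provides.

```python
# Assumed bit layout, mirroring ETH_RESET_* in include/uapi/linux/ethtool.h.
# Verify against your kernel headers before relying on it.
ETH_RESET_FLAGS = {
    "mgmt": 1 << 0,
    "irq": 1 << 1,
    "dma": 1 << 2,
    "filter": 1 << 3,
    "offload": 1 << 4,
    "mac": 1 << 5,
    "phy": 1 << 6,
    "ram": 1 << 7,
    "ap": 1 << 8,
}
ETH_RESET_SHARED_SHIFT = 16  # shared variants live in the upper 16 bits


def decode_reset_mask(mask):
    """Hypothetical helper: split a reset mask into component names.

    Returns (dedicated, shared), where "dedicated" components belong to
    this interface only and "shared" ones are shared with other
    interfaces of the same device.
    """
    dedicated = [name for name, bit in ETH_RESET_FLAGS.items() if mask & bit]
    shared = [
        name
        for name, bit in ETH_RESET_FLAGS.items()
        if mask & (bit << ETH_RESET_SHARED_SHIFT)
    ]
    return dedicated, shared


# Values taken from the quoted "ethtool --reset p1p1 all" output.
requested = 0xFFFFFFFF
components_reset = 0x00FF0000
components_not_reset = requested & ~components_reset

dedicated, shared = decode_reset_mask(components_reset)
print("dedicated:", dedicated)
print("shared:", shared)
print("not reset: %#x" % components_not_reset)
```

Both halves describe one interface or one device as seen from one host;
nothing in this layout says "reset the whole multi-host adapter".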
> ETHTOOL_RESET 0xffffffff
> Components reset: 0xff0000
> Components not reset: 0xff00ffff
> $ dmesg
> [ 198.745822] bnxt_en 0000:3b:00.0 p1p1: Firmware reset request successful.
> [ 198.745836] bnxt_en 0000:3b:00.0 p1p1: Reload driver to complete reset
You said the reset was not performed, yet there is no information to
that effect in the log?!
> 3. Now enable the capability in the device and reboot for device to
> enable the capability. Firmware does not get reset just by setting the
> param to true.
>
> $ devlink dev param set pci/0000:3b:00.1 name allow_fw_live_reset value true cmode permanent
>
> 4. After reboot, values of param.
Is the reboot required here?
> $ devlink dev param show pci/0000:3b:00.1 name allow_fw_live_reset
> pci/0000:3b:00.1:
> name allow_fw_live_reset type generic
> values:
> cmode runtime value true
Why is runtime value true now?
> cmode permanent value true
> $ devlink dev param show pci/0000:3b:00.0 name allow_fw_live_reset
> pci/0000:3b:00.0:
> name allow_fw_live_reset type generic
> values:
> cmode runtime value true
> cmode permanent value true
>
> 5. Now issue the "ethtool --reset p1p1 all" and device will undergo
> the "live reset". Reloading the driver is not required.
>
> $ ethtool --reset p1p1 all
> ETHTOOL_RESET 0xffffffff
> Components reset: 0xff0000
> Components not reset: 0xff00ffff
> $ dmesg
> [ 117.432013] bnxt_en 0000:3b:00.0 p1p1: Firmware non-fatal reset event received, max wait time 4200 msec
> [ 117.432015] bnxt_en 0000:3b:00.0 p1p1: Firmware reset request successful.
> [ 117.432032] bnxt_en 0000:3b:00.1 p1p2: Firmware non-fatal reset event received, max wait time 4200 msec
> $ devlink health show pci/0000:3b:00.0 reporter fw_reset
> pci/0000:3b:00.0:
> reporter fw_reset
> state healthy error 1 recover 1 grace_period 0 auto_recover true
>
> 6. If one of the host/PF turns off runtime param to false, "ethtool
> --reset p1p1 all" behaves similar to step 2, until it turns it back
> on.
>
> $ devlink dev param set pci/0000:3b:00.1 name allow_fw_live_reset value false cmode runtime
> $ ethtool --reset p1p1 all
> ETHTOOL_RESET 0xffffffff
> Components reset: 0xff0000
> Components not reset: 0xff00ffff
> $ dmesg
> [ 327.610814] bnxt_en 0000:3b:00.0 p1p1: Firmware reset request successful.
> [ 327.610828] bnxt_en 0000:3b:00.0 p1p1: Reload driver to complete reset