Message-ID: <CAACQVJqs9=PJ5UBrW9R9UmVYX1jqkJvZWj3j6FmVB9S5mOn+mg@mail.gmail.com>
Date: Thu, 28 May 2020 07:20:00 +0530
From: Vasundhara Volam <vasundhara-v.volam@...adcom.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Michael Chan <michael.chan@...adcom.com>,
Jiri Pirko <jiri@...nulli.us>,
David Miller <davem@...emloft.net>,
Netdev <netdev@...r.kernel.org>, Jiri Pirko <jiri@...lanox.com>
Subject: Re: [PATCH v2 net-next 1/4] devlink: Add new "allow_fw_live_reset" generic device parameter.

On Thu, May 28, 2020 at 2:46 AM Jakub Kicinski <kuba@...nel.org> wrote:
>
> On Wed, 27 May 2020 13:57:11 -0700 Michael Chan wrote:
> > On Wed, May 27, 2020 at 1:14 PM Jakub Kicinski <kuba@...nel.org> wrote:
> > > On Wed, 27 May 2020 09:07:09 +0530 Vasundhara Volam wrote:
> > > > Here is a sample sequence of commands to do a "live reset", to give a
> > > > clearer idea.
> > > > Note that I am providing the examples based on the current patchset.
> > > >
> > > > 1. FW live reset is disabled in the device/adapter. Here adapter has 2
> > > > physical ports.
> > > >
> > > > $ devlink dev
> > > > pci/0000:3b:00.0
> > > > pci/0000:3b:00.1
> > > > pci/0000:af:00.0
> > > > $ devlink dev param show pci/0000:3b:00.0 name allow_fw_live_reset
> > > > pci/0000:3b:00.0:
> > > > name allow_fw_live_reset type generic
> > > > values:
> > > > cmode runtime value false
> > > > cmode permanent value false
> > > > $ devlink dev param show pci/0000:3b:00.1 name allow_fw_live_reset
> > > > pci/0000:3b:00.1:
> > > > name allow_fw_live_reset type generic
> > > > values:
> > > > cmode runtime value false
> > > > cmode permanent value false
> > >
> > > What's the permanent value? What if after reboot the driver is too old
> > > to change this, is the reset still allowed?
> >
> > The permanent value should be the NVRAM value. If the NVRAM value is
> > false, the feature is always and unconditionally disabled. If the
> > permanent value is true, the feature will only be available when all
> > loaded drivers indicate support for it and set the runtime value to
> > true. If an old driver is loaded afterwards, it wouldn't indicate
> > support for this feature and it wouldn't set the runtime value to
> > true. So the feature will not be available until the old driver is
> > unloaded or upgraded.
>
> Setting this permanent value to false makes the FW's life easier?

It just disables the feature.

> Otherwise why not always have it enabled and just depend on hosts
> not opting in?

We are providing the permanent value to give the user flexibility. We can
remove it if that makes things simpler and clearer.
>
> > > > 2. If a user issues "ethtool --reset p1p1 all", the device cannot
> > > > perform "live reset" as capability is not enabled.
> > > >
> > > > User needs to do a driver reload, for firmware to undergo reset.
> > >
> > > Why does driver reload have anything to do with resetting a potentially
> > > MH device?
> >
> > I think she meant that all drivers have to be unloaded before the
> > reset would take place in case it's a MH device since live reset is
> > not supported. If it's a single function device, unloading this
> > driver is sufficient.
Yes.
>
> I see.
>
> > > > $ ethtool --reset p1p1 all
> > >
> > > Reset probably needs to be done via devlink. In any case you need a new
> > > reset level for resetting MH devices and smartnics, because the current
> > > reset mask covers port local, and host local cases, not any form of MH.
> >
> > Right. This reset could be just a single function reset in this example.
>
> Well, for the single host scenario the parameter dance is not at all
> needed, since there is only one domain of control. If user can issue a
> reset they can as well change the value of the param or even reload the
> driver. The runtime parameter only makes sense in MH/SmartNIC scenario,
> so IMHO the param and devlink reset are strongly dependent.
>
> > > > ETHTOOL_RESET 0xffffffff
> > > > Components reset: 0xff0000
> > > > Components not reset: 0xff00ffff
> > > > $ dmesg
> > > > [ 198.745822] bnxt_en 0000:3b:00.0 p1p1: Firmware reset request successful.
> > > > [ 198.745836] bnxt_en 0000:3b:00.0 p1p1: Reload driver to complete reset
> > >
> > > You said the reset was not performed, yet there is no information to
> > > that effect in the log?!
> >
> > The firmware has been requested to reset, but the reset hasn't taken
> > place yet because live reset cannot be done. We can make the logs
> > more clear.
>
> Thanks
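
As a side note on the ethtool output quoted above: the three masks are
consistent with each other if "Components not reset" is simply the requested
mask with the reset bits cleared. A minimal sketch of that arithmetic follows.
The function name not_reset is hypothetical and only for illustration; the
two constants are copied from the quoted output.

```python
# Values copied from the ethtool output quoted in step 2.
REQUESTED = 0xFFFFFFFF   # "ETHTOOL_RESET 0xffffffff" (the "all" request)
RESET_DONE = 0x00FF0000  # "Components reset: 0xff0000"


def not_reset(requested: int, reset_done: int) -> int:
    """Return the requested component bits that were not reset.

    Hypothetical helper: assumes "not reset" is the requested mask
    with the bits that were reset cleared, limited to 32 bits.
    """
    return requested & ~reset_done & 0xFFFFFFFF


print(hex(not_reset(REQUESTED, RESET_DONE)))  # 0xff00ffff
```

This matches the "Components not reset: 0xff00ffff" line in the output.
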
>
> > > > 3. Now enable the capability in the device and reboot for device to
> > > > enable the capability. Firmware does not get reset just by setting the
> > > > param to true.
> > > >
> > > > $ devlink dev param set pci/0000:3b:00.1 name allow_fw_live_reset
> > > > value true cmode permanent
> > > >
> > > > 4. After reboot, values of param.
> > >
> > > Is the reboot required here?
> >
> > In general, our new NVRAM permanent parameters will take effect after
> > reset (or reboot).
> >
> > > > $ devlink dev param show pci/0000:3b:00.1 name allow_fw_live_reset
> > > > pci/0000:3b:00.1:
> > > > name allow_fw_live_reset type generic
> > > > values:
> > > > cmode runtime value true
> > >
> > > Why is runtime value true now?
> > >
> >
> > If the permanent (NVRAM) parameter is true, all loaded new drivers
> > will indicate support for this feature and set the runtime value to
> > true by default. The runtime value would not be true if any loaded
> > driver is too old or has set the runtime value to false.
>
> Okay, the parameter has a bit of a dual role as it controls whether the
> feature is available (false -> true transition requiring a reset/reboot)
> and the default setting of the runtime parameter. Let's document that
> more clearly.

Please see patch 3/4, which documents this in the bnxt.rst file. We can
extend that documentation if needed.
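
To summarize the semantics discussed in this thread, here is a rough model of
when live reset is available. It is only a sketch of the rules as described
above, not the driver or firmware implementation. The function name
live_reset_available and the use of None for an old driver are assumptions
made for illustration.

```python
from typing import Iterable, Optional


def live_reset_available(nvram_permanent: bool,
                         driver_runtime_values: Iterable[Optional[bool]]) -> bool:
    """Hypothetical model of whether FW live reset is currently available.

    driver_runtime_values has one entry per loaded driver:
      True  -> driver supports the feature and its runtime value is true
      False -> driver set its runtime value to false
      None  -> old driver that does not indicate support
    """
    # Permanent (NVRAM) value false: feature is unconditionally disabled.
    if not nvram_permanent:
        return False
    # Otherwise every loaded driver must have opted in.
    # Note: behavior with no loaded drivers is not covered in the thread;
    # this sketch returns True in that case only because all([]) is True.
    return all(value is True for value in driver_runtime_values)


# Two new drivers, both opted in.
print(live_reset_available(True, [True, True]))   # True
# One old driver loaded alongside a new one.
print(live_reset_available(True, [True, None]))   # False
# NVRAM value false overrides everything.
print(live_reset_available(False, [True, True]))  # False
```
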
Thanks.