[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200601100511.GM2282@nanopsycho>
Date: Mon, 1 Jun 2020 12:05:11 +0200
From: Jiri Pirko <jiri@...nulli.us>
To: Vasundhara Volam <vasundhara-v.volam@...adcom.com>
Cc: Michael Chan <michael.chan@...adcom.com>,
Jakub Kicinski <kuba@...nel.org>,
David Miller <davem@...emloft.net>,
Netdev <netdev@...r.kernel.org>, Jiri Pirko <jiri@...lanox.com>
Subject: Re: [PATCH v2 net-next 1/4] devlink: Add new "allow_fw_live_reset"
generic device parameter.
Mon, Jun 01, 2020 at 11:57:28AM CEST, vasundhara-v.volam@...adcom.com wrote:
>On Mon, Jun 1, 2020 at 3:19 PM Jiri Pirko <jiri@...nulli.us> wrote:
>>
>> Mon, Jun 01, 2020 at 10:50:50AM CEST, vasundhara-v.volam@...adcom.com wrote:
>> >On Mon, Jun 1, 2020 at 12:09 PM Jiri Pirko <jiri@...nulli.us> wrote:
>> >>
>> >> Wed, May 27, 2020 at 10:57:11PM CEST, michael.chan@...adcom.com wrote:
>> >> >On Wed, May 27, 2020 at 1:14 PM Jakub Kicinski <kuba@...nel.org> wrote:
>> >> >>
>> >> >> On Wed, 27 May 2020 09:07:09 +0530 Vasundhara Volam wrote:
>> >> >> > Here is a sample sequence of commands to do a "live reset" to get some
>> >> >> > clear idea.
>> >> >> > Note that I am providing the examples based on the current patchset.
>> >> >> >
>> >> >> > 1. FW live reset is disabled in the device/adapter. Here adapter has 2
>> >> >> > physical ports.
>> >> >> >
>> >> >> > $ devlink dev
>> >> >> > pci/0000:3b:00.0
>> >> >> > pci/0000:3b:00.1
>> >> >> > pci/0000:af:00.0
>> >> >> > $ devlink dev param show pci/0000:3b:00.0 name allow_fw_live_reset
>> >> >> > pci/0000:3b:00.0:
>> >> >> > name allow_fw_live_reset type generic
>> >> >> > values:
>> >> >> > cmode runtime value false
>> >> >> > cmode permanent value false
>> >> >> > $ devlink dev param show pci/0000:3b:00.1 name allow_fw_live_reset
>> >> >> > pci/0000:3b:00.1:
>> >> >> > name allow_fw_live_reset type generic
>> >> >> > values:
>> >> >> > cmode runtime value false
>> >> >> > cmode permanent value false
>> >> >>
>> >> >> What's the permanent value? What if after reboot the driver is too old
>> >> >> to change this, is the reset still allowed?
>> >> >
>> >> >The permanent value should be the NVRAM value. If the NVRAM value is
>> >> >false, the feature is always and unconditionally disabled. If the
>> >> >permanent value is true, the feature will only be available when all
>> >> >loaded drivers indicate support for it and set the runtime value to
>> >> >true. If an old driver is loaded afterwards, it wouldn't indicate
>> >> >support for this feature and it wouldn't set the runtime value to
>> >> >true. So the feature will not be available until the old driver is
>> >> >unloaded or upgraded.
>> >> >
>> >> >>
>> >> >> > 2. If a user issues "ethtool --reset p1p1 all", the device cannot
>> >> >> > perform "live reset" as capability is not enabled.
>> >> >> >
>> >> >> > User needs to do a driver reload, for firmware to undergo reset.
>> >> >>
>> >> >> Why does driver reload have anything to do with resetting a potentially
>> >> >> MH device?
>> >> >
>> >> >I think she meant that all drivers have to be unloaded before the
>> >> >reset would take place in case it's a MH device since live reset is
>> >> >not supported. If it's a single function device, unloading this
>> >> >driver is sufficient.
>> >> >
>> >> >>
>> >> >> > $ ethtool --reset p1p1 all
>> >> >>
>> >> >> Reset probably needs to be done via devlink. In any case you need a new
>> >> >> reset level for resetting MH devices and smartnics, because the current
>> >> >> reset mask covers port local, and host local cases, not any form of MH.
>> >> >
>> >> >RIght. This reset could be just a single function reset in this example.
>> >> >
>> >> >>
>> >> >> > ETHTOOL_RESET 0xffffffff
>> >> >> > Components reset: 0xff0000
>> >> >> > Components not reset: 0xff00ffff
>> >> >> > $ dmesg
>> >> >> > [ 198.745822] bnxt_en 0000:3b:00.0 p1p1: Firmware reset request successful.
>> >> >> > [ 198.745836] bnxt_en 0000:3b:00.0 p1p1: Reload driver to complete reset
>> >> >>
>> >> >> You said the reset was not performed, yet there is no information to
>> >> >> that effect in the log?!
>> >> >
>> >> >The firmware has been requested to reset, but the reset hasn't taken
>> >> >place yet because live reset cannot be done. We can make the logs
>> >> >more clear.
>> >> >
>> >> >>
>> >> >> > 3. Now enable the capability in the device and reboot for device to
>> >> >> > enable the capability. Firmware does not get reset just by setting the
>> >> >> > param to true.
>> >> >> >
>> >> >> > $ devlink dev param set pci/0000:3b:00.1 name allow_fw_live_reset
>> >> >> > value true cmode permanent
>> >> >> >
>> >> >> > 4. After reboot, values of param.
>> >> >>
>> >> >> Is the reboot required here?
>> >> >>
>> >> >
>> >> >In general, our new NVRAM permanent parameters will take effect after
>> >> >reset (or reboot).
>> >>
>> >> Ah, you need a reboot. I was not expecting that :/ So the "devlink dev
>> >> reload" attr would not work for you. MLNX hardware can change this on
>> >> runtime.
>> >
>> >NVRAM parameter configuration will take effect only on reboot or on
>> >"live reset" (except few). But to enable "live reset", system needs a
>> >reboot.
>> >>
>> >>
>> >> >
>> >> >> > $ devlink dev param show pci/0000:3b:00.1 name allow_fw_live_reset
>> >> >> > pci/0000:3b:00.1:
>> >> >> > name allow_fw_live_reset type generic
>> >> >> > values:
>> >> >> > cmode runtime value true
>> >> >>
>> >> >> Why is runtime value true now?
>> >> >>
>> >> >
>> >> >If the permanent (NVRAM) parameter is true, all loaded new drivers
>> >> >will indicate support for this feature and set the runtime value to
>> >> >true by default. The runtime value would not be true if any loaded
>> >> >driver is too old or has set the runtime value to false.
>> >>
>> >> This is a bit odd. It is a configuration, not an indication. When you
>> >> want to indicate what you support something, I think it should be done
>> >> in a different place. I think that "devlink dev info" is the place to
>> >> put it, I think that we need "capabilities" there.
>> >>
>> >Indication can be shown in 'devlink dev info', but users can configure
>> >this parameter also to control the 'live reset' at runtime.
>>
>> I'm totally confused. You wrote above that you need reboot (cmode
>> permanent) for that. What is the usecase for cmode runtime?
>>
>
>Okay let me brief it completely, probably some repetition.
>
>Permanent cmode, is to enable the "live reset" feature in the device,
>which takes effect after reboot, as it is configuration the parameter
>in NVRAM configuration. After reboot, if the parameter is enabled,
>user can initiate the live reset from a separate command.
>
>Runtime cmode, default value is true and available only when permanent
>cmode is true. This allows the user to temporarily disallow the "live
>reset", if the host is running any mission critical application.
Okay. I understand now. Thanks.
Powered by blists - more mailing lists