lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6bd0fa45-68ce-b82d-98e6-327c6cd50e80@nvidia.com>
Date:   Mon, 7 Sep 2020 16:46:01 +0300
From:   Moshe Shemesh <moshe@...dia.com>
To:     Jakub Kicinski <kuba@...nel.org>, Jiri Pirko <jiri@...nulli.us>
CC:     Moshe Shemesh <moshe@...lanox.com>,
        "David S. Miller" <davem@...emloft.net>,
        Jiri Pirko <jiri@...lanox.com>, <netdev@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net-next RFC v3 01/14] devlink: Add reload action option
 to devlink reload command


On 9/4/2020 10:56 PM, Jakub Kicinski wrote:
> External email: Use caution opening links or attachments
>
>
> On Fri, 4 Sep 2020 11:04:50 +0200 Jiri Pirko wrote:
>> Thu, Sep 03, 2020 at 09:47:19PM CEST, kuba@...nel.org wrote:
>>> On Thu, 3 Sep 2020 07:57:29 +0200 Jiri Pirko wrote:
>>>> Wed, Sep 02, 2020 at 05:30:25PM CEST, kuba@...nel.org wrote:
>>>>> On Wed, 2 Sep 2020 11:46:27 +0200 Jiri Pirko wrote:
>>>>>>> ? Do we need such change there too or keep it as is, each action by itself
>>>>>>> and return what was performed ?
>>>>>> Well, I don't know. User asks for X, X should be performed, not Y or Z.
>>>>>> So perhaps the return value is not needed.
>>>>>> Just driver advertizes it supports X, Y, Z and the users says:
>>>>>> 1) do X, driver does X
>>>>>> 2) do Y, driver does Y
>>>>>> 3) do Z, driver does Z
>>>>>> [
>>>>>> I think this kindof circles back to the original proposal...
>>>>> Why? User does not care if you activate new devlink params when
>>>>> activating new firmware. Trust me. So why make the user figure out
>>>>> which of all possible reset option they should select? If there is
>>>>> a legitimate use case to limit what is reset - it should be handled
>>>>> by a separate negative attribute, like --live which says don't reset
>>>>> anything.
>>>> I see. Okay. Could you please sum-up the interface as you propose it?
>>> What I proposed on v1, pass requested actions as a bitfield, driver may
>>> perform more actions, we can return performed actions in the response.
>> Okay. So for example for mlxsw, user might say:
>> 1) I want driver reinit
>>      kernel reports: fw reset and driver reinit was done
>> 2) I want fw reset
>>      kernel reports: fw reset and driver reinit was done
>> 3) I want fw reset and driver reinit
>>      kernel reports: fw reset and driver reinit was done
> Yup.
>
>>> Then separate attribute to carry constraints for the request, like
>>> --live.
>> Hmm, this is a bit unclear how it is supposed to work. The constraints
>> apply for all? I mean, the actions are requested by a bitfield.
>> So the user can say:
>> I want fw reset and driver reinit --live. "--live" applies to both fw
>> reset and driver reinit? That is odd.
> The way I was thinking about it - the constraint expresses what sort of
> downtime the user can accept. So yes, it'd apply to all. If any of the
> reset actions does not meet the constraint then error should be
> returned.
>
> In that sense I don't like --live because it doesn't really say much.
> AFAIU it means 1) no link flap; 2) < 2 sec datapath downtime; 3) no
> configuration is lost in kernel or device (including netdev config,
> link config, flow rules, counters etc.). I was hoping at least the
> documentation in patch 14 would be more precise.


Actually, while writing "no-reset" or "live-patching" I meant also no 
downtime at all and nothing resets (config, rules ... anything), that 
fits mlx5 live-patching.

However, to make it more generic,  I can allow few seconds downtime and 
add similar constrains as you mentioned here to "no-reset". I will add 
that to the documentation patch.

> I think you're saying that it's strange to express that as a constraint
> because internally it maps to one of two fw reset types. And there is
> only one driver reinit procedure. But I don't think that the
> distinction of reset types is valuable to the user. What matters is if
> application SLAs will be met or not.
>
> I assume that deeper/longer reset is always less risky and would be
> preferred unless constraint is specified.
>
>>> I'd think the supported actions in devlink_ops would be fine as a
>>> bitfield, too. Combinations are often hard to capture in static data.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ