Message-ID: <2cd57697-a1e1-cab8-6a7d-f139b5af1420@nvidia.com>
Date:   Wed, 7 Oct 2020 08:41:28 +0300
From:   Moshe Shemesh <moshe@...dia.com>
To:     Jacob Keller <jacob.e.keller@...el.com>,
        Jiri Pirko <jiri@...nulli.us>,
        Moshe Shemesh <moshe@...lanox.com>
CC:     "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>, Jiri Pirko <jiri@...dia.com>,
        <netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net-next 05/16] devlink: Add remote reload stats


On 10/5/2020 10:12 PM, Jacob Keller wrote:
>
> On 10/4/2020 12:09 AM, Moshe Shemesh wrote:
>> On 10/3/2020 12:05 PM, Jiri Pirko wrote:
>>> Thu, Oct 01, 2020 at 03:59:08PM CEST, moshe@...lanox.com wrote:
>>>> Add remote reload stats to hold the history of actions performed due to
>>>> devlink reload commands initiated by a remote host. For example, in case
>>>> firmware activation with reset finished successfully but was initiated
>>>> by a remote host.
>>>>
>>>> The function devlink_remote_reload_actions_performed() is exported so
>>>> that drivers can report remote reload actions performed, since these
>>>> were not initiated by their own devlink instance.
>>>>
>>>> Expose devlink remote reload stats to the user through devlink dev get
>>>> command.
>>>>
>>>> Examples:
>>>> $ devlink dev show
>>>> pci/0000:82:00.0:
>>>>    stats:
>>>>        reload_stats:
>>>>          driver_reinit 2
>>>>          fw_activate 1
>>>>          fw_activate_no_reset 0
>>>>        remote_reload_stats:
>>>>          driver_reinit 0
>>>>          fw_activate 0
>>>>          fw_activate_no_reset 0
>>>> pci/0000:82:00.1:
>>>>    stats:
>>>>        reload_stats:
>>>>          driver_reinit 1
>>>>          fw_activate 0
>>>>          fw_activate_no_reset 0
>>>>        remote_reload_stats:
>>>>          driver_reinit 1
>>>>          fw_activate 1
>>>>          fw_activate_no_reset 0
>>>>
>>>> $ devlink dev show -jp
>>>> {
>>>>      "dev": {
>>>>          "pci/0000:82:00.0": {
>>>>              "stats": {
>>>>                  "reload_stats": [ {
>>>>                          "driver_reinit": 2
>>>>                      },{
>>>>                          "fw_activate": 1
>>>>                      },{
>>>>                          "fw_activate_no_reset": 0
>>>>                      } ],
>>>>                  "remote_reload_stats": [ {
>>>>                          "driver_reinit": 0
>>>>                      },{
>>>>                          "fw_activate": 0
>>>>                      },{
>>>>                          "fw_activate_no_reset": 0
>>>>                      } ]
>>>>              }
>>>>          },
>>>>          "pci/0000:82:00.1": {
>>>>              "stats": {
>>>>                  "reload_stats": [ {
>>>>                          "driver_reinit": 1
>>>>                      },{
>>>>                          "fw_activate": 0
>>>>                      },{
>>>>                          "fw_activate_no_reset": 0
>>>>                      } ],
>>>>                  "remote_reload_stats": [ {
>>>>                          "driver_reinit": 1
>>>>                      },{
>>>>                          "fw_activate": 1
>>>>                      },{
>>>>                          "fw_activate_no_reset": 0
>>>>                      } ]
>>>>              }
>>>>          }
>>>>      }
>>>> }
>>>>
>>>> Signed-off-by: Moshe Shemesh <moshe@...lanox.com>
>>>> ---
>>>> RFCv5 -> v1:
>>>> - Resplit this patch and the previous one by remote/local reload stats
>>>> instead of set/get reload stats
>>>> - Rename reload_action_stats to reload_stats
>>>> RFCv4 -> RFCv5:
>>>> - Add remote actions stats
>>>> - If devlink reload is not supported, show only remote_stats
>>>> RFCv3 -> RFCv4:
>>>> - Renamed DEVLINK_ATTR_RELOAD_ACTION_CNT to
>>>>    DEVLINK_ATTR_RELOAD_ACTION_STAT
>>>> - Add stats per action per limit level
>>>> RFCv2 -> RFCv3:
>>>> - Add reload actions counters instead of supported reload actions
>>>>    (reload actions counters are only for supported action so no need for
>>>>     both)
>>>> RFCv1 -> RFCv2:
>>>> - Removed DEVLINK_ATTR_RELOAD_DEFAULT_LEVEL
>>>> - Removed DEVLINK_ATTR_RELOAD_LEVELS_INFO
>>>> - Have actions instead of levels
>>>> ---
>>>> include/net/devlink.h        |  1 +
>>>> include/uapi/linux/devlink.h |  1 +
>>>> net/core/devlink.c           | 49 +++++++++++++++++++++++++++++++-----
>>>> 3 files changed, 45 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/include/net/devlink.h b/include/net/devlink.h
>>>> index 0f3bd23b6c04..a4ccb83bbd2c 100644
>>>> --- a/include/net/devlink.h
>>>> +++ b/include/net/devlink.h
>>>> @@ -42,6 +42,7 @@ struct devlink {
>>>>      const struct devlink_ops *ops;
>>>>      struct xarray snapshot_ids;
>>>>      u32 reload_stats[DEVLINK_RELOAD_STATS_ARRAY_SIZE];
>>>> +   u32 remote_reload_stats[DEVLINK_RELOAD_STATS_ARRAY_SIZE];
>>> Perhaps a nested struct  {} stats?
>> I guess you mean struct that holds these two arrays.
>>>>      struct device *dev;
>>>>      possible_net_t _net;
>>>>      struct mutex lock; /* Serializes access to devlink instance specific objects such as
>>>> diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
>>>> index 97e0137f6201..f9887d8afdc7 100644
>>>> --- a/include/uapi/linux/devlink.h
>>>> +++ b/include/uapi/linux/devlink.h
>>>> @@ -530,6 +530,7 @@ enum devlink_attr {
>>>>      DEVLINK_ATTR_RELOAD_STATS,              /* nested */
>>>>      DEVLINK_ATTR_RELOAD_STATS_ENTRY,        /* nested */
>>>>      DEVLINK_ATTR_RELOAD_STATS_VALUE,        /* u32 */
>>>> +   DEVLINK_ATTR_REMOTE_RELOAD_STATS,       /* nested */
>>>>
>>>>      /* add new attributes above here, update the policy in devlink.c */
>>>>
>>>> diff --git a/net/core/devlink.c b/net/core/devlink.c
>>>> index 05516f1e4c3e..3b6bd3b4d346 100644
>>>> --- a/net/core/devlink.c
>>>> +++ b/net/core/devlink.c
>>>> @@ -523,28 +523,35 @@ static int devlink_reload_stat_put(struct sk_buff *msg, enum devlink_reload_acti
>>>>      return -EMSGSIZE;
>>>> }
>>>>
>>>> -static int devlink_reload_stats_put(struct sk_buff *msg, struct devlink *devlink)
>>>> +static int devlink_reload_stats_put(struct sk_buff *msg, struct devlink *devlink, bool is_remote)
>>>> {
>>>>      struct nlattr *reload_stats_attr;
>>>>      int i, j, stat_idx;
>>>>      u32 value;
>>>>
>>>> -   reload_stats_attr = nla_nest_start(msg, DEVLINK_ATTR_RELOAD_STATS);
>>>> +   if (!is_remote)
>>>> +           reload_stats_attr = nla_nest_start(msg, DEVLINK_ATTR_RELOAD_STATS);
>>>> +   else
>>>> +           reload_stats_attr = nla_nest_start(msg, DEVLINK_ATTR_REMOTE_RELOAD_STATS);
>>>>
>>>>      if (!reload_stats_attr)
>>>>              return -EMSGSIZE;
>>>>
>>>>      for (j = 0; j <= DEVLINK_RELOAD_LIMIT_MAX; j++) {
>>>> -           if (j != DEVLINK_RELOAD_LIMIT_UNSPEC &&
>>>> +           if (!is_remote && j != DEVLINK_RELOAD_LIMIT_UNSPEC &&
>>> I don't follow the check "!is_remote" here,
>>
>> We agreed that remote stats should also be shown for unsupported
>> actions and limits, because they are remote. That makes this condition
>> different for remote stats. Rethinking it, maybe that's wrong. I
>> mean, if reload actions here were performed by a remote driver, the two
>> share a common device, so it has to be the same type of driver and
>> support the same actions/limits, right?
>>
> Obviously it runs the same device but.. technically, couldn't the remote
> device be running a different version of the driver? i.e. what if it
> supports some new mode that this host doesn't yet understand? (or does
> understand but has a driver which doesn't yet?)


Yes, there is also the possibility that one host function has the
privilege to perform an action that the other does not. I see there are
reasons to keep this difference between remote and local stats, so I
will keep it. Thanks.
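
To make the agreed behavior concrete, here is a minimal userspace sketch of the filtering rule, not the kernel code itself. All names in it (dev_model, reload_action, stat_is_reported, count_reported) are hypothetical stand-ins, the limit dimension is left out for brevity, and grouping the two counter arrays in a nested struct is only one possible reading of the "nested struct {} stats" suggestion. Local counters are reported only for actions this instance supports, while remote counters are always reported, since the remote host may support or be privileged for actions the local one is not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
enum reload_action {
	ACTION_DRIVER_REINIT,
	ACTION_FW_ACTIVATE,
	ACTION_COUNT
};

struct dev_model {
	uint32_t supported_actions;	/* bitmask of locally supported actions */
	struct {
		uint32_t reload_stats[ACTION_COUNT];
		uint32_t remote_reload_stats[ACTION_COUNT];
	} stats;
};

static bool action_is_supported(const struct dev_model *dev,
				enum reload_action action)
{
	return (dev->supported_actions & (1u << action)) != 0;
}

/*
 * Decide whether a counter is reported to the user.
 * Remote counters skip the local "is supported" check.
 */
static bool stat_is_reported(const struct dev_model *dev,
			     enum reload_action action, bool is_remote)
{
	assert(action < ACTION_COUNT);

	if (is_remote)
		return true;
	return action_is_supported(dev, action);
}

/* Count how many per-action counters would be reported. */
static int count_reported(const struct dev_model *dev, bool is_remote)
{
	int n = 0;

	for (int a = 0; a < ACTION_COUNT; a++)
		if (stat_is_reported(dev, (enum reload_action)a, is_remote))
			n++;
	return n;
}
```

With a model that supports only ACTION_DRIVER_REINIT locally, the local view reports one counter and the remote view reports both, which is the asymmetry the "!is_remote" check preserves.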
