lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 1 Aug 2018 17:36:41 -0700
From:   Alexander Duyck <alexander.duyck@...il.com>
To:     Saeed Mahameed <saeedm@....mellanox.co.il>
Cc:     Saeed Mahameed <saeedm@...lanox.com>,
        "David S. Miller" <davem@...emloft.net>,
        Netdev <netdev@...r.kernel.org>, Jiri Pirko <jiri@...lanox.com>,
        Jakub Kicinski <jakub.kicinski@...ronome.com>,
        Bjorn Helgaas <helgaas@...nel.org>
Subject: Re: [pull request][net-next 00/10] Mellanox, mlx5 and devlink updates 2018-07-31

On Wed, Aug 1, 2018 at 4:13 PM, Saeed Mahameed
<saeedm@....mellanox.co.il> wrote:
> On Wed, Aug 1, 2018 at 3:34 PM, Alexander Duyck
> <alexander.duyck@...il.com> wrote:
>> On Wed, Aug 1, 2018 at 2:52 PM, Saeed Mahameed <saeedm@...lanox.com> wrote:
>>> Hi Dave,
>>>
>>> This series provides devlink parameters updates to both devlink API and
>>> mlx5 driver, it is a 2nd iteration of the dropped patches sent in a previous
>>> mlx5 submission "net/mlx5: Support PCIe buffer congestion handling via
>>> Devlink" to address review comments [1].
>>>
>>> Changes from the original series:
>>> - According to the discussion outcome, we are keeping the congestion control
>>>   setting as mlx5 device specific for the current HW generation.
>>> - Changed the congestion_mode and congestion action param type to string
>>> - Added patches to fix devlink handling of param type string
>>> - Added a patch which adds extack messages support for param set.
>>> - At the end of this series, I've added yet another mlx5 devlink related
>>>  feature, firmware snapshot support.
>>>
>>> For more information please see tag log below.
>>>
>>> Please pull and let me know if there's any problem.
>>>
>>> [1] https://patchwork.ozlabs.org/patch/945996/
>>>
>>> Thanks,
>>> Saeed.
>>>
>>> ---
>>>
>>> The following changes since commit e6476c21447c4b17c47e476aade6facf050f31e8:
>>>
>>>   net: remove bogus RCU annotations on socket.wq (2018-07-31 12:40:22 -0700)
>>>
>>> are available in the Git repository at:
>>>
>>>   git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-updates-2018-08-01
>>>
>>> for you to fetch changes up to 2ac6108c65ffcb1e5eab1fba1fd59272604d1c32:
>>>
>>>   net/mlx5: Use devlink region_snapshot parameter (2018-08-01 14:49:09 -0700)
>>>
>>> ----------------------------------------------------------------
>>> mlx5-updates-2018-08-01
>>>
>>> This series provides devlink parameters updates to both devlink API and
>>> mlx5 driver,
>>>
>>> 1) Devlink changes: (Moshe Shemesh)
>>> The first two patches fix devlink param infrastructure for string type
>>> params.
>>> The third patch adds a devlink helper function to safely copy string from
>>> driver to devlink.
>>> The forth patch adds extack support for param set.
>>>
>>> 2) mlx5 specific congestion parameters: (Eran Ben Elisha)
>>> Next three patches add new devlink driver specific params for controlling
>>> congestion action and mode, using string type params and extack messages support.
>>>
>>> This congestion mode enables hw workaround in specific devices which is
>>> controlled by devlink driver-specific params. The workaround is device
>>> specific for this NIC generation, the next NIC will not need it.
>>>
>>> Congestion parameters:
>>>  - Congestion action
>>>             HW W/A mechanism in the PCIe buffer which monitors the amount of
>>>             consumed PCIe buffer per host.  This mechanism supports the
>>>             following actions in case of threshold overflow:
>>>             - Disabled - NOP (Default)
>>>             - Drop
>>>             - Mark - Mark CE bit in the CQE of received packet
>>>     - Congestion mode
>>>             - Aggressive - Aggressive static trigger threshold (Default)
>>>             - Dynamic - Dynamically change the trigger threshold
>>>
>>> 3) mlx5 firmware snapshot support via devlink: (Alex Vesker)
>>> Last three patches, add the support for capturing region snapshot of the
>>> firmware crspace during critical errors, using devlink region_snapshot
>>> parameter.
>>>
>>> -Saeed.
>>>
>>> ----------------------------------------------------------------
>>> Alex Vesker (3):
>>>       net/mlx5: Add Vendor Specific Capability access gateway
>>>       net/mlx5: Add Crdump FW snapshot support
>>>       net/mlx5: Use devlink region_snapshot parameter
>>>
>>> Eran Ben Elisha (3):
>>>       net/mlx5: Move all devlink related functions calls to devlink.c
>>>       net/mlx5: Add MPEGC register configuration functionality
>>>       net/mlx5: Enable PCIe buffer congestion handling workaround via devlink
>>>
>>> Moshe Shemesh (4):
>>>       devlink: Fix param set handling for string type
>>>       devlink: Fix param cmode driverinit for string type
>>>       devlink: Add helper function for safely copy string param
>>>       devlink: Add extack messages support to param set
>>>
>>>  drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c  |   3 +-
>>>  drivers/net/ethernet/mellanox/mlx4/main.c          |   6 +-
>>>  drivers/net/ethernet/mellanox/mlx5/core/Makefile   |   3 +-
>>>  drivers/net/ethernet/mellanox/mlx5/core/devlink.c  | 388 +++++++++++++++++++++
>>>  drivers/net/ethernet/mellanox/mlx5/core/devlink.h  |  13 +
>>>  .../net/ethernet/mellanox/mlx5/core/diag/crdump.c  | 223 ++++++++++++
>>>  drivers/net/ethernet/mellanox/mlx5/core/health.c   |   3 +
>>>  drivers/net/ethernet/mellanox/mlx5/core/lib/mlx5.h |   4 +
>>>  .../net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c  | 320 +++++++++++++++++
>>>  .../net/ethernet/mellanox/mlx5/core/lib/pci_vsc.h  |  56 +++
>>>  drivers/net/ethernet/mellanox/mlx5/core/main.c     |  10 +-
>>>  include/linux/mlx5/driver.h                        |   5 +
>>>  include/net/devlink.h                              |  15 +-
>>>  net/core/devlink.c                                 |  44 ++-
>>>  14 files changed, 1076 insertions(+), 17 deletions(-)
>>>  create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/devlink.c
>>>  create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/devlink.h
>>>  create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/diag/crdump.c
>>>  create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c
>>>  create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.h
>>
>> So after looking over the patch set the one thing I would ask for in
>> this is some sort of documentation at a minimum. As a user I don't see
>> how you can expect someone to be able to use this when the naming of
>> things are pretty cryptic and there is no real explanation anywhere if
>> you don't go through and read the patch description itself. When you
>> start adding driver specific interfaces, you should at least start
>> adding vendor specific documentation.
>>
>
> Sure, sounds like a great idea, something like:
> Documentation/networking/mlx5.txt and have a devlink section ?
> or have a generic devlink doc and a mlx5 section in it ?

Either would work for me.

>> Also I don't see how using a vendor specific configuration space
>> section can be done without adding some tie-ins to the PCI core files
>> because it should be possible to race with someone poking at the
>> register space via something like setpci/lspci. Also one of the things
>> that came up was that drivers are not supposed to be banging on the
>> PCI configuration space at will, and it seems like this patch set is
>> doing exactly that through the VSC block.
>>
>
> this is a whole different feature than the device specific parameters.
> The whole vendor specific configuration space access is needed only
> for diagnostic/dump
> purposes when something really bad happens and the command interface
> with FW is down,
> and when the FW is un-responsive, we want to dump the crspace into the
> already existing devlink
> crdump buffer, how do you expect us to read it if we are not allowed
> to access it ?
>
> What do you mean by tie-ins to the PCI core files ? can you please elaborate ?

You have added a vendor specific config section and you are using it
to access several of the pieces of metadata. The setup isn't too
different than the VPD setup and approach. However I don't see many of
the protections that exist for VPD in place for this vendor specific
configuration. As such I have concerns. For example what is to keep
requests to the various devlink interfaces from racing with each other
when they both end up operating through the VCS?

- Alex

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ