Date:   Tue, 02 Feb 2021 20:13:48 -0800
From:   Saeed Mahameed <saeed@...nel.org>
To:     Jakub Kicinski <kuba@...nel.org>, Yishai Hadas <yishaih@...dia.com>
Cc:     netdev@...r.kernel.org, davem@...emloft.net, parav@...dia.com
Subject: Re: [PATCH net-next 0/2] devlink: Add port function attribute to
 enable/disable roce

On Tue, 2021-02-02 at 18:14 -0800, Jakub Kicinski wrote:
> On Mon, 1 Feb 2021 19:51:50 +0200 Yishai Hadas wrote:
> > Currently mlx5 PCI VF and SF are enabled by default for RoCE
> > functionality.
> > 
> > Currently a user does not have the ability to disable RoCE for a PCI
> > VF/SF device before such device is enumerated by the driver.
> > 
> > User is also incapable to do such setting from smartnic scenario for a
> > VF from the smartnic.
> > 
> > Current 'enable_roce' device knob is limited to do setting only at
> > driverinit time. By this time device is already created and firmware has
> > already allocated necessary system memory for supporting RoCE.
> > 
> > When a RoCE is disabled for the PCI VF/SF device, it saves 1 Mbyte of
> > system memory per function. Such saving is helpful when running on low
> > memory embedded platform with many VFs or SFs.
> > 
> > Therefore, it is desired to empower user to disable RoCE functionality
> > before a PCI SF/VF device is enumerated.
> 
> You say that the user on the VF/SF side wants to save memory, yet
> the control knob is on the eswitch instance side, correct?
> 

Yes, the user in this case is the admin who controls the provisioned
network function SFs/VFs. Turning this knob off lets the admin create
more of those functions when memory is the limiting factor.
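
To put a rough number on that, here is a minimal sketch of the arithmetic.
It assumes only the 1 Mbyte per function figure quoted above; the function
counts and the helper name roce_memory_saved_mbytes are hypothetical and
not taken from the patch set.

```python
# Assumption: disabling RoCE saves 1 Mbyte of system memory per VF/SF,
# as stated in the quoted cover letter.
MBYTE_PER_FUNCTION = 1


def roce_memory_saved_mbytes(num_functions: int) -> int:
    """Estimate system memory saved (in Mbytes) with RoCE off on every function."""
    if num_functions < 0:
        raise ValueError("num_functions must be non-negative")
    return num_functions * MBYTE_PER_FUNCTION


if __name__ == "__main__":
    # Hypothetical VF/SF counts, chosen only for illustration.
    for count in (16, 128, 512):
        saved = roce_memory_saved_mbytes(count)
        print(f"{count} functions: about {saved} Mbyte saved")
```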

> > This is achieved by extending existing 'port function' object to control
> > capabilities of a function. This enables users to control capability of
> > the device before enumeration.
> > 
> > Examples when user prefers to disable RoCE for a VF when using switchdev
> > mode:
> > 
> > $ devlink port show pci/0000:06:00.0/1
> > pci/0000:06:00.0/1: type eth netdev pf0vf0 flavour pcivf controller 0
> > pfnum 0 vfnum 0 external false splittable false
> >   function:
> >     hw_addr 00:00:00:00:00:00 roce on
> > 
> > $ devlink port function set pci/0000:06:00.0/1 roce off
> >   
> > $ devlink port show pci/0000:06:00.0/1
> > pci/0000:06:00.0/1: type eth netdev pf0vf0 flavour pcivf controller 0
> > pfnum 0 vfnum 0 external false splittable false
> >   function:
> >     hw_addr 00:00:00:00:00:00 roce off
> > 
> > FAQs:
> > -----
> > 1. What does roce on/off do?
> > Ans: It disables RoCE capability of the function before its enumerated,
> > so when driver reads the capability from the device firmware, it is
> > disabled.
> > At this point RDMA stack will not be able to create UD, QP1, RC, XRC
> > type of QPs. When RoCE is disabled, the GID table of all ports of the
> > device is disabled in the device and software stack.
> > 
> > 2. How is the roce 'port function' option different from existing
> > devlink param?
> > Ans: RoCE attribute at the port function level disables the RoCE
> > capability at the specific function level; while enable_roce only does
> > at the software level.
> > 
> > 3. Why is this option for disabling only RoCE and not the whole RDMA
> > device?
> > Ans: Because user still wants to use the RDMA device for non RoCE
> > commands in more memory efficient way.
> 
> What are those "non-RoCE commands" that user may want to use "in a more
> efficient way"?

RAW eth QP. I think you already know this one; it is a very thin layer
that doesn't require the whole RDMA stack.
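
For anyone scripting around the knob, the 'devlink port show' output quoted
above could be checked along these lines. This is only a sketch: it parses
the text format shown in the cover letter (which may change before merge),
it does not invoke devlink itself, and the helper name roce_state is
hypothetical.

```python
import re
from typing import Optional


def roce_state(port_show_output: str) -> Optional[str]:
    """Return "on" or "off" from `devlink port show` text, or None if absent.

    Assumes the "roce on" / "roce off" form shown in the cover letter.
    """
    match = re.search(r"\broce\s+(on|off)\b", port_show_output)
    return match.group(1) if match else None


# Sample text copied from the example in the cover letter.
SAMPLE = """\
pci/0000:06:00.0/1: type eth netdev pf0vf0 flavour pcivf controller 0
pfnum 0 vfnum 0 external false splittable false
  function:
    hw_addr 00:00:00:00:00:00 roce off
"""

if __name__ == "__main__":
    print(roce_state(SAMPLE))
```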

