Message-ID: <20210203105102.71e6fa2d@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date:   Wed, 3 Feb 2021 10:51:02 -0800
From:   Jakub Kicinski <kuba@...nel.org>
To:     Saeed Mahameed <saeed@...nel.org>
Cc:     Yishai Hadas <yishaih@...dia.com>, netdev@...r.kernel.org,
        davem@...emloft.net, parav@...dia.com
Subject: Re: [PATCH net-next 0/2] devlink: Add port function attribute to
 enable/disable roce

On Tue, 02 Feb 2021 20:13:48 -0800 Saeed Mahameed wrote:
> On Tue, 2021-02-02 at 18:14 -0800, Jakub Kicinski wrote:
> > On Mon, 1 Feb 2021 19:51:50 +0200 Yishai Hadas wrote:  
> > > Currently mlx5 PCI VF and SF are enabled by default for RoCE
> > > functionality.
> > > 
> > > Currently a user does not have the ability to disable RoCE for a PCI
> > > VF/SF device before such device is enumerated by the driver.
> > > 
> > > User is also incapable to do such setting from smartnic scenario for a
> > > VF from the smartnic.
> > > 
> > > Current 'enable_roce' device knob is limited to do setting only at
> > > driverinit time. By this time device is already created and firmware
> > > has already allocated necessary system memory for supporting RoCE.
> > > 
> > > When a RoCE is disabled for the PCI VF/SF device, it saves 1 Mbyte of
> > > system memory per function. Such saving is helpful when running on low
> > > memory embedded platform with many VFs or SFs.
> > > 
> > > Therefore, it is desired to empower user to disable RoCE functionality
> > > before a PCI SF/VF device is enumerated.
> > 
> > You say that the user on the VF/SF side wants to save memory, yet
> > the control knob is on the eswitch instance side, correct?
> >   
> 
> yes, user in this case is the admin, who controls the provisioned
> network function SF/VFs.. by turning off this knob it allows to create
> more of that resource in case the user/admin is limited by memory.

Ah, so in case of the SmartNIC this extra memory is allocated on the
control system, not where the function resides?

My next question is regarding the behavior on the target system - what
does "that user" see? Can we expect they will understand that the
limitation was imposed by the admin and not due to some initialization
failure or SW incompatibility?

> > > This is achieved by extending existing 'port function' object to
> > > control capabilities of a function. This enables users to control
> > > capability of the device before enumeration.
> > > 
> > > Examples when user prefers to disable RoCE for a VF when using
> > > switchdev mode:
> > > 
> > > $ devlink port show pci/0000:06:00.0/1
> > > pci/0000:06:00.0/1: type eth netdev pf0vf0 flavour pcivf controller 0
> > > pfnum 0 vfnum 0 external false splittable false
> > >   function:
> > >     hw_addr 00:00:00:00:00:00 roce on
> > > 
> > > $ devlink port function set pci/0000:06:00.0/1 roce off
> > >   
> > > $ devlink port show pci/0000:06:00.0/1
> > > pci/0000:06:00.0/1: type eth netdev pf0vf0 flavour pcivf controller 0
> > > pfnum 0 vfnum 0 external false splittable false
> > >   function:
> > >     hw_addr 00:00:00:00:00:00 roce off
> > > 
> > > FAQs:
> > > -----
> > > 1. What does roce on/off do?
> > > Ans: It disables RoCE capability of the function before it is
> > > enumerated, so when driver reads the capability from the device
> > > firmware, it is disabled.
> > > At this point RDMA stack will not be able to create UD, QP1, RC, XRC
> > > type of QPs. When RoCE is disabled, the GID table of all ports of the
> > > device is disabled in the device and software stack.
> > > 
> > > 2. How is the roce 'port function' option different from existing
> > > devlink param?
> > > Ans: RoCE attribute at the port function level disables the RoCE
> > > capability at the specific function level; while enable_roce only
> > > does so at the software level.
> > > 
> > > 3. Why is this option for disabling only RoCE and not the whole RDMA
> > > device?
> > > Ans: Because user still wants to use the RDMA device for non RoCE
> > > commands in more memory efficient way.
> > 
> > > What are those "non-RoCE commands" that user may want to use "in a
> > > more efficient way"?
> 
> RAW eth QP, i think you already know this one, it is a very thin layer
> that doesn't require the whole rdma stack.

Sorry for asking a leading question. You know how we'll feel about
that one, do we need to talk this out or can we save ourselves the
battle? :S
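
A minimal sketch of how an admin-side script might read the proposed
attribute back, assuming `devlink port show` keeps the layout from the
cover letter example (port header line, then an indented `function:`
block of "key value" pairs). The helper name `parse_port_functions` is
hypothetical, the sample text is copied from the example above with the
wrapped lines rejoined, and the `roce` attribute exists only as proposed
in this series.

```python
# Sample text taken from the cover letter's example output
# (wrapped lines rejoined). The "roce" attribute is the one proposed
# in this series, not an existing devlink field.
SAMPLE = """\
pci/0000:06:00.0/1: type eth netdev pf0vf0 flavour pcivf controller 0 pfnum 0 vfnum 0 external false splittable false
  function:
    hw_addr 00:00:00:00:00:00 roce off
"""


def parse_port_functions(text):
    """Map each port handle to its 'function:' attributes (hypothetical helper)."""
    ports = {}
    current = None
    in_function = False
    for line in text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # Port header line, e.g. "pci/0000:06:00.0/1: type eth ..."
            current = line.split(": ", 1)[0]
            ports[current] = {}
            in_function = False
        elif line.strip() == "function:":
            in_function = True
        elif in_function and current is not None:
            # Attributes are assumed to be printed as "key value" pairs.
            tokens = line.split()
            ports[current].update(zip(tokens[0::2], tokens[1::2]))
    return ports


if __name__ == "__main__":
    for handle, attrs in parse_port_functions(SAMPLE).items():
        print(handle, "roce", attrs.get("roce", "unknown"))
```

This only covers the eswitch-side view. It says nothing about what the
user on the target system sees once RoCE has been turned off, which is
the open question above.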
