lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <PH0PR12MB5481AD558AD7A17928D78081DC6D9@PH0PR12MB5481.namprd12.prod.outlook.com>
Date:   Thu, 18 Aug 2022 16:44:00 +0000
From:   Parav Pandit <parav@...dia.com>
To:     "ecree@...inx.com" <ecree@...inx.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "linux-net-drivers@....com" <linux-net-drivers@....com>
CC:     "davem@...emloft.net" <davem@...emloft.net>,
        "kuba@...nel.org" <kuba@...nel.org>,
        "pabeni@...hat.com" <pabeni@...hat.com>,
        "edumazet@...gle.com" <edumazet@...gle.com>,
        "corbet@....net" <corbet@....net>,
        "linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
        "jacob.e.keller@...el.com" <jacob.e.keller@...el.com>,
        "jesse.brandeburg@...el.com" <jesse.brandeburg@...el.com>,
        "michael.chan@...adcom.com" <michael.chan@...adcom.com>,
        "andy@...yhouse.net" <andy@...yhouse.net>,
        "saeed@...nel.org" <saeed@...nel.org>,
        "jiri@...nulli.us" <jiri@...nulli.us>,
        "snelson@...sando.io" <snelson@...sando.io>,
        "simon.horman@...igine.com" <simon.horman@...igine.com>,
        "alexander.duyck@...il.com" <alexander.duyck@...il.com>,
        "rdunlap@...radead.org" <rdunlap@...radead.org>,
        Edward Cree <ecree.xilinx@...il.com>,
        Parav Pandit <parav@...dia.com>
Subject: RE: [RFC PATCH v2 net-next] docs: net: add an explanation of VF (and
 other) Representors


> From: ecree@...inx.com <ecree@...inx.com>
> Sent: Monday, August 15, 2022 10:23 AM
> 
> +
> +=============================
> +Network Function Representors
Given current documentation of the patch:

s/Network Function Representors/Network Function Port Representors.

> +=============================
Most of the text below describes "netdev" device of the switchdev device.

A switchdev device' netdevice only represents the port of the representee.

Hence the document title is not truly justified what it wants to say.

A _whole_ network function is represented today using 
a. netdevice represents representee's network port

b. devlink port function for function management

This good documentation is probably best placed in switchdev.rst may be with a NIC specific section.
This will connect the two dots nicely.

[..]
> +Motivation
> +----------
> +Network function representors bring the standard Linux networking stack
Network functions port representors; here and at other places ...

> +to virtual switches and IOV devices.  Just as each port of a
> +Linux-controlled switch has a separate netdev, so each virtual function
> +has one.  When the system boots, and before any offload is configured,
> +all packets from the virtual functions appear in the networking stack of the
> PF via the representors.
> +The PF can thus always communicate freely with the virtual functions.
> +The PF can configure standard Linux forwarding between representors,
> +the uplink or any other netdev (routing, bridging, TC classifiers).
> +
> +Thus, a representor is both a control plane object (representing the
> +function in administrative commands) 
Certain part of the function administrative commands that doesn't directly fit to netdev.
They are through devlink port + function.

> and a data plane object (one end of a
> virtual pipe).
> +As a virtual link endpoint, the representor can be configured like any
> +other netdevice; in some cases (e.g. link state) the representee will
> +follow the representor's configuration, while in others there are
> +separate APIs to configure the representee.
> +
> +Definitions
> +-----------
> +
> +This document uses the term "master PF" to refer to the PCIe function
s/master PF/switchdev function
or
s/master PF/switchdev

Especially when you have below text of granting these privileges to a VF/SF...
It also aligns to devlink switchdev device notion.
Less new terms for readers and admins to learn. :)

[..]
> Packets transmitted on the
> +   representor netdevice should be delivered to the representee; packets
> +   transmitted to the representee which fail to match any switching rule
> should
> +   be received on the representor netdevice.  (That is, there is a virtual pipe
> +   connecting the representor to the representee, similar in concept to a
> veth
> +   pair.)
Please add text that,
Packets transmitted by the representee and when they are not offloaded, such packets are delivered to the port representor netdevice.

> +What functions should have a representor?
> +-----------------------------------------
> +
> +Essentially, for each virtual port on the device's internal switch,
                                                                            ^^^^
You probably wanted to say master PF internal switch here.

Better to word it as, each virtual port of a switchdev, there should be...

> +there should be a representor.
> +Some vendors have chosen to omit representors for the uplink and the
> +physical network port, which can simplify usage (the uplink netdev
> +becomes in effect the physical port's representor) but does not
> +generalise to devices with multiple ports or uplinks.
> +
> +Thus, the following should all have representors:
> +
> + - VFs belonging to the master PF.
> + - Other PFs on the local PCIe controller, and any VFs belonging to them.
Local and/or external PCIe controllers.

> + - PFs and VFs on other PCIe controllers on the device (e.g. for any
> embedded
> +   System-on-Chip within the SmartNIC).
> + - PFs and VFs with other personalities, including network block devices
> (such
> +   as a vDPA virtio-blk PF backed by remote/distributed storage), if their
> +   network access is implemented through a virtual switch port.
> +   Note that such functions can require a representor despite the
> representee
> +   not having a netdev.
This looks a big undertaking to represent them via "netdevice".
Mostly they cannot be well represented by the netdevice.

Even a network function was not fully fitting into the existing port representor netdevice.
In some cases, such vDPA devices are affiliated to the switchdev, but they use one or multiple of its ports.
In second scenario, today vdpa devices are used without switchdev device too.

Configuring vDPA device's block size, capacity, iops just doesn't make sense to control via netdevice representation.

vdpa has its own subsystem [1] to manage its backend.

So please remove above vdpa related documentation that doesn't reflect the current kernel.

[1] https://man7.org/linux/man-pages/man8/vdpa.8.html

> + - Subfunctions (SFs) belonging to any of the above PFs or VFs, if they have
> +   their own port on the switch (as opposed to using their parent PF's port).
Not sure why the text has _if_ for SF and not for the VF.
Do you see a SF device in the kernel that doesn't have their own port, due to which there is _if_ added?

> + - Any accelerators or plugins on the device whose interface to the network
> is
> +   through a virtual switch port, even if they do not have a corresponding
> PCIe
> +   PF or VF.
> +
Above vdpa block is good example that cannot be represent by the netdev port representor.
I would imagine other accelerators won't be able to fit in the netdev representation.

> +How are representors created?
> +-----------------------------
> +
> +The driver instance attached to the master PF should enumerate the
> +virtual ports on the switch, and for each representee, create a
> +pure-software netdevice which has some form of in-kernel reference to
> +the PF's own netdevice or driver private data (``netdev_priv()``).
Today a user can create new virtual ports. Hence, these port represnetors and function representors are created dynamically without enumeration.
Please add text describing both ways.

For mlx5 case a representor netdevice has real queue from which tx/rx DMA happens from the device to/from network.
It is not entirely pure software per say.
Hence, "pure-software" is misleading. Please drop that word.

> +If switch ports can dynamically appear/disappear, the PF driver should
> +create and destroy representors appropriately.
> +The operations of the representor netdevice will generally involve
> +acting through the master PF.  For example, ``ndo_start_xmit()`` might
> +send the packet through a hardware TX queue attached to the master PF,
> +with either packet metadata or queue configuration marking it for delivery
> to the representee.
Sharing/not sharing TX and RX queue among representor netdevices is not yet well established.
Mostly my knowledge is ancient on sharing model.

In mlx5 driver queues are per representor netdevice. For others may be different.

Is there a way for the netdev to tell that it shares resources/queues with other netdevices?

So better to say:
The operations of the representor netdevice will generally involve using resources/queues attached to the netdevice.

> +
> +How are representors identified?
> +--------------------------------
> +
> +The representor netdevice should *not* directly refer to a PCIe device (e.g.
> +through ``net_dev->dev.parent`` / ``SET_NETDEV_DEV()``), either of the
> +representee or of the master PF.
This isn't true.
Representor netdevices are connected to the switchdev device PCI function.
Without linking to PCI device, udev scriptology needs to grep among thousands of netdevices and its very inefficient.

May be a better code in future to take the dev.parent directly from the devlink port a netdevice is attached to.

> +There are as yet no established conventions for naming representors
> +which do not correspond to PCIe functions (e.g. accelerators and plugins).
Netdevice represents the networking port of the function.
So, this is not applicable text here.

> +Configuring the representee's MAC
> +---------------------------------
> +
> +
> +Currently there is no way to use the representor to set the station
> +permanent MAC address of the representee; other methods available to
> do this include:
Just rewrite above line as:
Representee's MAC can be configured using ..

Overall looks good doc to add apart from above comments to fix.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ