Message-ID: <DM8PR12MB5480BE54D27770DEB39EA009DC369@DM8PR12MB5480.namprd12.prod.outlook.com>
Date:   Wed, 9 Jun 2021 09:24:03 +0000
From:   Parav Pandit <parav@...dia.com>
To:     Yunsheng Lin <linyunsheng@...wei.com>,
        "dsahern@...il.com" <dsahern@...il.com>,
        "stephen@...workplumber.org" <stephen@...workplumber.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
CC:     Jiri Pirko <jiri@...dia.com>,
        "moyufeng@...wei.com" <moyufeng@...wei.com>,
        "linuxarm@...neuler.org" <linuxarm@...neuler.org>
Subject: RE: Re: [PATCH RESEND iproute2-next] devlink: Add optional controller
 user input



> From: Yunsheng Lin <linyunsheng@...wei.com>
> Sent: Tuesday, June 8, 2021 3:02 PM
> 
> On 2021/6/8 16:47, Parav Pandit wrote:
> >> From: Yunsheng Lin <linyunsheng@...wei.com>
> >> Sent: Tuesday, June 8, 2021 1:06 PM
> >>
> >> On 2021/6/8 13:26, Parav Pandit wrote:
> >>>> From: Yunsheng Lin <linyunsheng@...wei.com>
> >>>> Sent: Tuesday, June 8, 2021 8:58 AM
> >>>>
> >>>> On 2021/6/7 19:12, Parav Pandit wrote:
> >>>>>> From: Yunsheng Lin <linyunsheng@...wei.com>
> >>>>>> Sent: Monday, June 7, 2021 4:27 PM
> >>>>>>
> >>>>
> >>>> [..]
> >>>>
> >>>>>>>
> >>>>>>>> 2. each PF's devlink instance has three types of port, which is
> >>>>>>>>    FLAVOUR_PHYSICAL, FLAVOUR_PCI_PF and FLAVOUR_PCI_VF (supposing I
> >>>>>>>>    understand port flavour correctly).
> >>>>>>>>
> >>>>>>> FLAVOUR_PCI_{PF,VF,SF} belong to the eswitch (representor) side on a
> >>>>>>> switchdev device.
> >>>>>>
> >>>>>> If devlink instance or eswitch is in DEVLINK_ESWITCH_MODE_LEGACY
> >>>>>> mode, the FLAVOUR_PCI_{PF,VF,SF} port instance does not need to be
> >>>>>> created?
> >>>>> No. In eswitch legacy mode, there are no representor netdevices or
> >>>>> devlink ports.
> >>>>
> >>>> It seems each devlink port instance corresponds to a netdevice.
> >>>> More specifically, the devlink instance is created in the struct
> >>>> pci_driver's probe function of a pci function, and a devlink port
> >>>> instance is created and registered to that devlink instance when a
> >>>> netdev of that pci function is created?
> >>>>
> >>> Yes.
> >>>
> >>>> As in diagram [1], the devlink port instance (flavour
> >>>> FLAVOUR_PHYSICAL) for ctrl-0-pf0 is created when the netdev of
> >>>> ctrl-0-pf0 is created in the host of smartNIC, the devlink port
> >>>> instance (flavour FLAVOUR_VIRTUAL) for ctrl-0-pf0vfN is created when
> >>>> the netdev of ctrl-0-pf0vfN is created in the host of smartNIC, right?
> >>>>
> >>> Ctrl-0-pf0vfN, ctrl-0-pf0 ports are eswitch ports. They are created
> >>> where there is an eswitch.
> >>> Usually in the smartnic, where the eswitch is located.
> >>
> >> Does the diagram in [1] correspond to the multi-host (two host) setup as
> >> mentioned previously?
> >> H1.pf0.physical_port = p0.
> >> H1.pf1.physical_port = p1.
> >> H2.pf0.physical_port = p0.
> >> H2.pf1.physical_port = p1.
> >>
> > Yes.
> >
> >> Let's say H1 = server and H2 = smartNIC as the pci rc connected to below:
> >>                  ---------------------------------------------------------
> >>                  |                                                       |
> >>                  |           --------- ---------         ------- ------- |
> >>     -----------  |           | vf(s) | | sf(s) |         |vf(s)| |sf(s)| |
> >>     | server  |  | -------   ----/---- ---/----- ------- ---/--- ---/--- |
> >>     | pci rc  |=== | pf0 |______/________/       | pf1 |___/_______/     |
> >>     | connect |  | -------                       -------                 |
> >>     -----------  |     | controller_num=1 (no eswitch)                   |
> >>                  ------|--------------------------------------------------
> >>                  (internal wire)
> >>                        |
> >>                  ---------------------------------------------------------
> >>                  | devlink eswitch ports and reps                        |
> >>                  | ----------------------------------------------------- |
> >>                  | |ctrl-0 | ctrl-0 | ctrl-0 | ctrl-0 | ctrl-0 |ctrl-0 | |
> >>                  | |pf0    | pf0vfN | pf0sfN | pf1    | pf1vfN |pf1sfN | |
> >>                  | ----------------------------------------------------- |
> >>                  | |ctrl-1 | ctrl-1 | ctrl-1 | ctrl-1 | ctrl-1 |ctrl-1 | |
> >>                  | |pf0    | pf0vfN | pf0sfN | pf1    | pf1vfN |pf1sfN | |
> >>                  | ----------------------------------------------------- |
> >>                  |                                                       |
> >>                  |                                                       |
> >>     -----------  |           --------- ---------         ------- ------- |
> >>     | smartNIC|  |           | vf(s) | | sf(s) |         |vf(s)| |sf(s)| |
> >>     | pci rc  |==| -------   ----/---- ---/----- ------- ---/--- ---/--- |
> >>     | connect |  | | pf0 |______/________/       | pf1 |___/_______/     |
> >>     -----------  | -------                       -------                 |
> >>                  |                                                       |
> >>                  |  local controller_num=0 (eswitch)                     |
> >>                  ---------------------------------------------------------
> >>
> >> A vanilla kernel can run on the smartNIC host, right?
> > Right.
> >
> >> What the smartNIC host sees is two PFs corresponding to ctrl-0-pf0 and
> >> ctrl-0-pf1 when the kernel has just booted and the mlx driver is not
> >> loaded yet, right?
> >>
> >> I am not sure it is ok to leave out the VF and SF, but let's leave
> >> them out for simplicity now.
> >> When the mlx driver is loaded, two devlink instances are created, which
> >> correspond to ctrl-0-pf0 and ctrl-0-pf1, and two devlink port
> >> instances (flavour FLAVOUR_PHYSICAL) are created and registered to the
> >> corresponding devlink instances just created, right?
> >>
> >> As the eswitch mode is based on devlink instance, let's only set the
> >> mode of ctrl-0-pf0's devlink instance to
> >> DEVLINK_ESWITCH_MODE_SWITCHDEV, the representor netdev of ctrl-1-pf0
> >> is created and devlink port instance of that representor netdev is
> >> created and registered to the devlink instance corresponding to ctrl-0-pf0?
> >>
> >> I think I miss something here, the above does not seem right,
> >> because:
> >> 1. For the single host case: the PF is not passed through to the VM, and
> >>    the devlink port instance of a VF's representor netdev can be
> >>    registered to the devlink instance corresponding to its PF, right?
> > Yes, if I understand your question right.
> >
> >> 2. But for the two-host case as above, do we need to create a devlink
> >>    instance for the PF corresponding to ctrl-1-pf0 in the smartNIC host?
> > You can choose not to create a devlink instance in external controller PF.
> > It may not even be a Linux OS running there.
> >
> > I read the questions a few more times, but I find it hard to understand
> > what you really want to ask.
> > Not sure I understood you.
> >
> > Trying again,
> >
> > The model is really very straightforward, as visible in the diagram.
> >
> > There is one PF that has the eswitch. Eswitch contains representor ports.
> 
> I thought the representor ports of a PF's eswitch are decided by the functions
> under a specific PF (for example, the PF itself and the VFs under this PF)?

The eswitch is not per PF in the context of smartnic/multi-host.
A PF _has_ the eswitch, which contains the representor ports for PF, VF, SF.

> 
> > Each representor port represents either a PF, VF or SF.
> > This PF, VF or SF can be of the local controller residing on the eswitch
> > device, or it can be of an external controller(s).
> > Here external controller = 1.
> 
> If I understood the above correctly:
> The fw/hw decides which PF has the eswitch, and how many
> devlink/representor ports this eswitch has?
The number of ports is dynamic. When new SFs/VFs are created, ports get added to the switch.

> Suppose PF0 of controller_num=0 has the eswitch, and the eswitch may
> have devlink/representor ports representing other PFs, like PF1 in
> controller_num=0, and even PF0/PF1 in controller_num=1?
Yes. Correct.
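
To make the model above concrete, here is a rough sketch in Python. It is
only an illustration of the relationships discussed in this thread, not
kernel or driver code: the RepPort and Eswitch classes and their fields are
hypothetical names, and the port names simply follow the diagram
(ctrl-<controller>-pf<N>[vf<M>|sf<M>]).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class RepPort:
    """One representor port on the eswitch (hypothetical model, not a kernel type)."""
    controller: int              # 0 = local controller, >0 = external controller
    pf: int
    kind: str = "pf"             # "pf", "vf" or "sf"
    index: Optional[int] = None  # VF/SF number; None for the PF itself

    @property
    def name(self) -> str:
        # Naming follows the diagram in this thread, e.g. "ctrl-1-pf0vf2".
        suffix = "" if self.kind == "pf" else f"{self.kind}{self.index}"
        return f"ctrl-{self.controller}-pf{self.pf}{suffix}"

    @property
    def external(self) -> bool:
        return self.controller != 0


@dataclass
class Eswitch:
    """The single eswitch, owned by one PF of the local controller."""
    owner_pf: int = 0
    ports: dict = field(default_factory=dict)

    def add(self, port: RepPort) -> RepPort:
        self.ports[port.name] = port
        return port

    def remove(self, name: str) -> None:
        del self.ports[name]


esw = Eswitch(owner_pf=0)

# PF representors for both controllers, as in the diagram.
for ctrl in (0, 1):
    for pf in (0, 1):
        esw.add(RepPort(controller=ctrl, pf=pf))

# Ports are dynamic: creating a VF on the external host adds one port.
esw.add(RepPort(controller=1, pf=0, kind="vf", index=2))

print(sorted(esw.ports))
```

The point of the sketch is that there is one port table, held by the PF that
has the eswitch, and that it covers functions of controller 0 and controller 1
alike. The external host does not need a devlink instance of its own for its
functions to show up here.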
