[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190305110601.GC2314@nanopsycho>
Date: Tue, 5 Mar 2019 12:06:01 +0100
From: Jiri Pirko <jiri@...nulli.us>
To: Jakub Kicinski <jakub.kicinski@...ronome.com>
Cc: davem@...emloft.net, netdev@...r.kernel.org,
oss-drivers@...ronome.com
Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI
ports
Tue, Mar 05, 2019 at 01:33:02AM CET, jakub.kicinski@...ronome.com wrote:
>On Mon, 4 Mar 2019 08:56:09 +0100, Jiri Pirko wrote:
>> Sat, Mar 02, 2019 at 08:48:47PM CET, jakub.kicinski@...ronome.com wrote:
>> >On Sat, 2 Mar 2019 10:41:16 +0100, Jiri Pirko wrote:
>> >> Fri, Mar 01, 2019 at 07:04:50PM CET, jakub.kicinski@...ronome.com wrote:
>> >> >PCI endpoint corresponds to a PCI device, but such device
>> >> >can have one more more logical device ports associated with it.
>> >> >We need a way to distinguish those. Add a PCI subport in the
>> >> >dumps and print the info in phys_port_name appropriately.
>> >> >
>> >> >This is not equivalent to port splitting, there is no split
>> >> >group. It's just a way of representing multiple netdevs on
>> >> >a single PCI function.
>> >> >
>> >> >Note that the quality of being multiport pertains only to
>> >> >the PCI function itself. A PF having multiple netdevs does
>> >> >not mean that its VFs will also have multiple, or that VFs
>> >> >are associated with any particular port of a multiport VF.
>> >> >
>> >> >Example (bus 05 device has subports, bus 82 has only one port per
>> >> >function):
>> >> >
>> >> >$ devlink port
>> >> >pci/0000:05:00.0/0: type eth netdev enp5s0np0 flavour physical
>> >> >pci/0000:05:00.0/10000: type eth netdev enp5s0npf0s0 flavour pci_pf pf 0 subport 0
>> >> >pci/0000:05:00.0/4: type eth netdev enp5s0np1 flavour physical
>> >> >pci/0000:05:00.0/11000: type eth netdev enp5s0npf0s1 flavour pci_pf pf 0 subport 1
>> >>
>> >> So these subport devlink ports are eswitch ports for subports, right?
>> >>
>> >> Please see the following drawing:
>> >>
>> >> +---+ +---+ +---+
>> >> pfsub| 5 | vf| 6 | | 7 |pfsub
>> >> +-+-+ +-+-+ +-+-+
>> >> physical link <---------+ | | |
>> >> | | | |
>> >> | | | |
>> >> | | | |
>> >> +-+-+ +-+-+ +-+-+ +-+-+
>> >> | 1 | | 2 | | 3 | | 4 |
>> >> +--+---+------+---+------+---+------+---+--+
>> >> | physical pfsub vf pfsub |
>> >> | port port port port |
>> >> | |
>> >> | eswitch |
>> >> | |
>> >> | |
>> >> +------------------------------------------+
>> >>
>> >> 1) pci/0000:05:00.0/0: type eth netdev enp5s0np0 flavour physical switch_id 00154d130d2f
>> >> 2) pci/0000:05:00.0/10000: type eth netdev enp5s0npf0s0 flavour pci_pf pf 0 subport 0 switch_id 00154d130d2f
>> >> 3) pci/0000:05:00.0/10001: type eth netdev enp5s0npf0vf0 flavour pci_vf pf 0 vf 0 switch_id 00154d130d2f
>> >> 4) pci/0000:05:00.0/10001: type eth netdev enp5s0npf0s1 flavour pci_pf pf 0 subport 1 switch_id 00154d130d2f
>> >>
>> >> This is basically what you have and I think we are in sync with that.
>> >> But what about 5,6,7? Should they have devlink port instances too?
>> >>
>> >> 5) pci/0000:05:00.0/1: type eth netdev enp5s0f0?? flavour ???? pf 0 subport 0
>> >> 6) pci/0000:05:10.1/0: type eth netdev enp5s10f0 flavour ???? pf 0 vf 0
>> >> 7) pci/0000:05:00.0/1: type eth netdev enp5s0f0?? flavour ???? pf 0 subport 1
>> >>
>> >> These are the "peers".
>> >> I think that there could be flavours "pci_pf" and "pci_vf". Then the
>> >> "representors" (switch ports) could have flavours "pci_pf_port" and
>> >> "pci_vf_port" or something like that. User can see right away
>> >> that is not "PF" of "VF" but rather something "on the other end".
>> >> Note there is no "switch_id" for these devlink ports that tells the user
>> >> these devlink ports are not part of any switch.
>> >> What do you think?
>> >
>> >Hmmm.. Hm. Hm.
>> >
>> >To me its neat if the devlink instance matches an ASIC. I think it's
>> >kind of clear for people to understand what it stands for then. So if
>> >we wanted to do the above we'd have to make the switch_id the first
>> >class identifier for devlink instances, rather than the bus? But then
>>
>> What do you mean by "first class identifier"? Like "a handle"?
>
>Yes, a handle.
Odd.
>
>> >VF instances don't have a switch ID so that doesn't work...
>>
>> Wait a sec. VF-ports do have. VFs them selves don't.
>
>Looking at your example this one:
>
>6) pci/0000:05:10.1/0: type eth netdev enp5s10f0 flavour ???? pf 0 vf 0
>
>that uses VF's DBDF in the devlink instance handle, so I presume this
>is a VF's devlink instance that will get passed to the VM together with
>the VF device?
Yes. Correct. That this does not have switch_id.
>
>> But that is the same for PF. PF would also not have switch id.
>
>Yes :( You'd have to mark what constitutes a devlink instance on your
>drawing. The semantics for devlink instances seem to be the focal point
>of the discussion.
>
>Right now it seems a little bit that folks on the NIC side see a devlink
>instance as a PCI function and on switch side it's the whole ASIC.
I think it should be the whole ASIC for both. I don't see why not. It's
one entity, one set parameter, one flash function, one info report etc.
>
>> >I need to think about it.
>> >
>> >It's also kind of strange that we have to add the noun *port* to the
>> >flavour of... a port... So I would prefer not to have those showing up
>>
>> Yeah.
>>
>> >as ports. Can we invent a new command (say "partition"?) that'd take
>> >the bus info where the partition is to be spawned?
>>
>> Got it. But the question is how different this object would be from the
>> existing "port" we have today.
>
>They'd be where "the other side of a PCI link" is represented,
>restricting ports to only ASIC's forwarding plane ports.
Basically a "host port", right? It can still be the same port object,
only with different flavour and attributes. So we would have:
1) pci/0000:05:00.0/0: type eth netdev enp5s0np0
flavour physical switch_id 00154d130d2f
2) pci/0000:05:00.0/10000: type eth netdev enp5s0npf0s0
flavour pci_pf pf 0 subport 0
switch_id 00154d130d2f
peer pci/0000:05:00.0/1
3) pci/0000:05:00.0/10001: type eth netdev enp5s0npf0vf0
flavour pci_vf pf 0 vf 0
switch_id 00154d130d2f
peer pci/0000:05:10.1/0
4) pci/0000:05:00.0/10001: type eth netdev enp5s0npf0s1
flavour pci_pf pf 0 subport 1
switch_id 00154d130d2f
peer pci/0000:05:00.0/2
5) pci/0000:05:00.0/1: type eth netdev enp5s0f0??
flavour host <----------------
peer pci/0000:05:00.0/10000
6) pci/0000:05:10.1/0: type eth netdev enp5s10f0
flavour host <----------------
peer pci/0000:05:00.0/10001
7) pci/0000:05:00.0/2: type eth netdev enp5s0f0??
flavour host <----------------
peer pci/0000:05:00.0/10001
I think it looks quite clear, it gives complete topology view.
>
>> >My next goal is to find a way of grouping multiple bus devices under one
>> >"ASIC" (which is a devlink instance to me) so it can be understood
>> >easily how things are laid out when there is more than one PF connected
>> >to one host.
>>
>> These are the "aliases" you mentioned before right? Makes sense.
>
>Yes.
Powered by blists - more mailing lists