[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190304111902.GX2314@nanopsycho>
Date: Mon, 4 Mar 2019 12:19:02 +0100
From: Jiri Pirko <jiri@...nulli.us>
To: Jakub Kicinski <jakub.kicinski@...ronome.com>
Cc: davem@...emloft.net, netdev@...r.kernel.org,
oss-drivers@...ronome.com
Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI
ports
Sat, Mar 02, 2019 at 08:48:47PM CET, jakub.kicinski@...ronome.com wrote:
>On Sat, 2 Mar 2019 10:41:16 +0100, Jiri Pirko wrote:
>> Fri, Mar 01, 2019 at 07:04:50PM CET, jakub.kicinski@...ronome.com wrote:
>> >PCI endpoint corresponds to a PCI device, but such device
>> >can have one more more logical device ports associated with it.
>> >We need a way to distinguish those. Add a PCI subport in the
>> >dumps and print the info in phys_port_name appropriately.
>> >
>> >This is not equivalent to port splitting, there is no split
>> >group. It's just a way of representing multiple netdevs on
>> >a single PCI function.
>> >
>> >Note that the quality of being multiport pertains only to
>> >the PCI function itself. A PF having multiple netdevs does
>> >not mean that its VFs will also have multiple, or that VFs
>> >are associated with any particular port of a multiport VF.
>> >
>> >Example (bus 05 device has subports, bus 82 has only one port per
>> >function):
>> >
>> >$ devlink port
>> >pci/0000:05:00.0/0: type eth netdev enp5s0np0 flavour physical
>> >pci/0000:05:00.0/10000: type eth netdev enp5s0npf0s0 flavour pci_pf pf 0 subport 0
>> >pci/0000:05:00.0/4: type eth netdev enp5s0np1 flavour physical
>> >pci/0000:05:00.0/11000: type eth netdev enp5s0npf0s1 flavour pci_pf pf 0 subport 1
>>
>> So these subport devlink ports are eswitch ports for subports, right?
>>
>> Please see the following drawing:
>>
>> +---+ +---+ +---+
>> pfsub| 5 | vf| 6 | | 7 |pfsub
>> +-+-+ +-+-+ +-+-+
>> physical link <---------+ | | |
>> | | | |
>> | | | |
>> | | | |
>> +-+-+ +-+-+ +-+-+ +-+-+
>> | 1 | | 2 | | 3 | | 4 |
>> +--+---+------+---+------+---+------+---+--+
>> | physical pfsub vf pfsub |
>> | port port port port |
>> | |
>> | eswitch |
>> | |
>> | |
>> +------------------------------------------+
>>
>> 1) pci/0000:05:00.0/0: type eth netdev enp5s0np0 flavour physical switch_id 00154d130d2f
>> 2) pci/0000:05:00.0/10000: type eth netdev enp5s0npf0s0 flavour pci_pf pf 0 subport 0 switch_id 00154d130d2f
>> 3) pci/0000:05:00.0/10001: type eth netdev enp5s0npf0vf0 flavour pci_vf pf 0 vf 0 switch_id 00154d130d2f
>> 4) pci/0000:05:00.0/10001: type eth netdev enp5s0npf0s1 flavour pci_pf pf 0 subport 1 switch_id 00154d130d2f
>>
>> This is basically what you have and I think we are in sync with that.
>> But what about 5,6,7? Should they have devlink port instances too?
>>
>> 5) pci/0000:05:00.0/1: type eth netdev enp5s0f0?? flavour ???? pf 0 subport 0
>> 6) pci/0000:05:10.1/0: type eth netdev enp5s10f0 flavour ???? pf 0 vf 0
>> 7) pci/0000:05:00.0/1: type eth netdev enp5s0f0?? flavour ???? pf 0 subport 1
>>
>> These are the "peers".
>> I think that there could be flavours "pci_pf" and "pci_vf". Then the
>> "representors" (switch ports) could have flavours "pci_pf_port" and
>> "pci_vf_port" or something like that. User can see right away
>> that is not "PF" of "VF" but rather something "on the other end".
>> Note there is no "switch_id" for these devlink ports that tells the user
>> these devlink ports are not part of any switch.
>> What do you think?
>
>Hmmm.. Hm. Hm.
>
>To me its neat if the devlink instance matches an ASIC. I think it's
>kind of clear for people to understand what it stands for then. So if
>we wanted to do the above we'd have to make the switch_id the first
>class identifier for devlink instances, rather than the bus? But then
>VF instances don't have a switch ID so that doesn't work...
>
>I need to think about it.
>
>It's also kind of strange that we have to add the noun *port* to the
>flavour of... a port... So I would prefer not to have those showing up
>as ports. Can we invent a new command (say "partition"?) that'd take
>the bus info where the partition is to be spawned?
Devlink does not supposed to be only there for switches. From the
beginning the design was to handle cases where the netdev/ib_dev is not
the correct handle. Not only in case you have multiple instances (ports)
for one ASIC, but also in case you have only one. Example use case is
port-type-change (eth->ib,ib->eth).
I chose word "port" as the parent devlink instance is "dev" and if you
partition the ASIC you basically got "ports", each of a different flavour.
And as you said, devlink instance matches one ASIC. Therefore the
devlink instance should contain all bits there are part of that ASIC,
not only switch/eswitch ports. That would be very limitting.
Powered by blists - more mailing lists