Message-ID: <20190307094816.GA2190@nanopsycho>
Date: Thu, 7 Mar 2019 10:48:16 +0100
From: Jiri Pirko <jiri@...nulli.us>
To: Jakub Kicinski <jakub.kicinski@...ronome.com>
Cc: davem@...emloft.net, netdev@...r.kernel.org,
oss-drivers@...ronome.com
Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI ports
Wed, Mar 06, 2019 at 06:56:38PM CET, jakub.kicinski@...ronome.com wrote:
>On Wed, 6 Mar 2019 13:20:37 +0100, Jiri Pirko wrote:
>> Tue, Mar 05, 2019 at 06:15:34PM CET, jakub.kicinski@...ronome.com wrote:
>> >On Tue, 5 Mar 2019 12:06:01 +0100, Jiri Pirko wrote:
>> >> >> >as ports. Can we invent a new command (say "partition"?) that'd take
>> >> >> >the bus info where the partition is to be spawned?
>> >> >>
>> >> >> Got it. But the question is how different this object would be from the
>> >> >> existing "port" we have today.
>> >> >
>> >> >They'd be where "the other side of a PCI link" is represented,
>> >> >restricting ports to only ASIC's forwarding plane ports.
>> >>
>> >> Basically a "host port", right? It can still be the same port object,
>> >> only with different flavour and attributes. So we would have:
>> >>
>> >> 1) pci/0000:05:00.0/0: type eth netdev enp5s0np0
>> >> flavour physical switch_id 00154d130d2f
>> >> 2) pci/0000:05:00.0/10000: type eth netdev enp5s0npf0s0
>> >> flavour pci_pf pf 0 subport 0
>> >> switch_id 00154d130d2f
>> >> peer pci/0000:05:00.0/1
>> >> 3) pci/0000:05:00.0/10001: type eth netdev enp5s0npf0vf0
>> >> flavour pci_vf pf 0 vf 0
>> >> switch_id 00154d130d2f
>> >> peer pci/0000:05:10.1/0
>> >> 4) pci/0000:05:00.0/10002: type eth netdev enp5s0npf0s1
>> >> flavour pci_pf pf 0 subport 1
>> >> switch_id 00154d130d2f
>> >> peer pci/0000:05:00.0/2
>> >> 5) pci/0000:05:00.0/1: type eth netdev enp5s0f0??
>> >> flavour host <----------------
>> >> peer pci/0000:05:00.0/10000
>> >> 6) pci/0000:05:10.1/0: type eth netdev enp5s10f0
>> >> flavour host <----------------
>> >> peer pci/0000:05:00.0/10001
>> >> 7) pci/0000:05:00.0/2: type eth netdev enp5s0f0??
>> >> flavour host <----------------
>> >> peer pci/0000:05:00.0/10002
>> >>
>> >> I think it looks quite clear, it gives complete topology view.
>> >
>> >Okay, I have some questions :)
>> >
>> >What do we use for port_index?
>>
>> That is just a number totally in control of the driver. Driver can
>> assign it in any way.
>>
>> >
>> >What are the operations one can perform on "host ports"?
>>
>> That is a good question. I would start with *none* and extend it as
>> needed.
>>
>>
>> >
>> >If we have PCI parameters, do they get set on the ASIC side of the port
>> >or the host side of the port?
>>
>> Could you give me an example?
>
>Let's take msix_vec_per_pf_min as an example.
>
>> But I believe they would be set on the switch-port side.
>
>Ok.
>
>> >How do those behave when device is passed to VM?
>>
>> In case of VF? VF will have separate devlink instance (separate handle,
>> probably "aliased" to the PF handle). So it would disappear from
>> baremetal and appear in VM:
>> $ devlink dev
>> pci/0000:00:10.0
>> $ devlink dev port
>> pci/0000:00:10.1/0: type eth netdev enp5s10f0
>> flavour host
>> That's it for the VM.
>>
>> There's no linkage (peer, alias) between this and the instances on
>> baremetal.
>
>Ok, I guess this is the main advantage from your perspective?
>The fact that "host ports" are visible inside a VM?

Yep. Also on baremetal.

>Or do you believe that having both ends of a pipe as ports makes the
>topology easier to understand?

That as well.

>
>For creating subdevices, I don't think the handle should ever be port.
>We create new ports on a devlink instance, and configure its forwarding

Okay, I agree. Something like:
$ devlink port add pci/0000:00:10.0 .....
That is a bit confusing, though, because "set" accepts a port handle
(like pci/0000:00:10.0/1) while "add" would take a device handle.
Probably better would be:
$ devlink dev port add pci/0000:00:10.0 .....

>with offloads of well established Linux SW constructs. New devices are
>not logically associated with other ports (see how in my patches there
>are 2 "subports" but no main port on that PF - a split not a hierarchy).

Right, basically you have 2 equal objects. Makes sense.

>
>How we want to model forwarding inside a VM (who configures the
>underlying switching) remains unclear.

I don't understand. Could you elaborate a bit?

>
>> >You have a VF devlink instance there - what ports does it show?
>>
>> See above.
>>
>>
>> >
>> >How do those look when the PF is connected to another host? Do they
>> >get spawned at all?
>>
>> What do you mean by "PF is connected to another host"?
>
>Either "SmartNIC":
>
>http://www.mellanox.com/products/smartnic/?ls=gppc&lsd=SmartNIC-gen-smartnic&gclid=EAIaIQobChMIxIrGmYju4AIVy5yzCh2SFwQJEAAYASAAEgIui_D_BwE
>
>or
>
>Multi-host NIC: http://www.mellanox.com/page/multihost

Right. So in this case, I think each host should see its own devlink
instance and its hostport. The switchports, however, should be visible
only on one selected host (I don't see how to do that differently).

>
>> >Will this not be confusing to DSA folks who have a CPU port?
>>
>> Why do you think so?
>
>Host and CPU sound quite similar, it is unclear how they differ, and
>why we have a need for both (from user's perspective).

Hmm, the DSA CPU port is something different. It does not have a netdev
associated with it. It is just a port that is physically used to send
and receive packets on the switch ports.

A hostport, however, has a user-facing netdev associated with it, and
the user actually uses it to send and receive packets directly (assigns
an IP address etc.).
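
For illustration, the peer relationships in the port listing quoted
above can be checked mechanically. Below is a minimal Python sketch, not
devlink code: the Port class and check_peers() are hypothetical names,
the "host" flavour and the "peer" attribute are only the proposal from
this thread, and the data is a subset of the listing (the netdev of the
first host port is left unset because the listing leaves it open).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Port:
    """Hypothetical model of one devlink port entry from the listing."""
    handle: str                 # port handle: bus/device/port_index
    flavour: str                # "physical", "pci_pf", "pci_vf" or proposed "host"
    netdev: Optional[str] = None
    peer: Optional[str] = None  # handle of the port on the other side of the PCI link


def check_peers(ports: List[Port]) -> List[str]:
    """Report duplicate handles and dangling or asymmetric peer links."""
    problems: List[str] = []
    by_handle: Dict[str, Port] = {}
    for p in ports:
        if p.handle in by_handle:
            problems.append(f"duplicate handle {p.handle}")
        by_handle[p.handle] = p
    for p in ports:
        if p.peer is None:
            continue  # e.g. a physical port has no PCI peer
        other = by_handle.get(p.peer)
        if other is None:
            problems.append(f"{p.handle}: peer {p.peer} not found")
        elif other.peer != p.handle:
            problems.append(f"{p.handle}: peer {p.peer} does not point back")
    return problems


# Subset of the listing: one physical port, two switch-side PCI ports,
# and the two proposed host ports on the other side of the PCI link.
PORTS = [
    Port("pci/0000:05:00.0/0", "physical", "enp5s0np0"),
    Port("pci/0000:05:00.0/10000", "pci_pf", "enp5s0npf0s0",
         peer="pci/0000:05:00.0/1"),
    Port("pci/0000:05:00.0/10001", "pci_vf", "enp5s0npf0vf0",
         peer="pci/0000:05:10.1/0"),
    Port("pci/0000:05:00.0/1", "host", None,
         peer="pci/0000:05:00.0/10000"),
    Port("pci/0000:05:10.1/0", "host", "enp5s10f0",
         peer="pci/0000:05:00.0/10001"),
]

if __name__ == "__main__":
    print(check_peers(PORTS) or "peer links are consistent")
```

The check only relies on every peer link being symmetric and every
handle being unique, which is what makes both ends of the pipe
discoverable from either side.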