Message-ID: <b34d8427-51c0-0bbd-471e-1af30375c702@gmail.com>
Date: Wed, 18 Nov 2020 11:03:24 -0700
From: David Ahern <dsahern@...il.com>
To: Parav Pandit <parav@...dia.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>
Cc: Jiri Pirko <jiri@...dia.com>, Jason Gunthorpe <jgg@...dia.com>,
"dledford@...hat.com" <dledford@...hat.com>,
Leon Romanovsky <leonro@...dia.com>,
Saeed Mahameed <saeedm@...dia.com>,
"kuba@...nel.org" <kuba@...nel.org>,
"davem@...emloft.net" <davem@...emloft.net>,
Vu Pham <vuhuong@...dia.com>
Subject: Re: [PATCH net-next 03/13] devlink: Support add and delete devlink
port

On 11/18/20 10:02 AM, Parav Pandit wrote:
>
>> From: David Ahern <dsahern@...il.com>
>> Sent: Wednesday, November 18, 2020 9:51 PM
>>
>> On 11/12/20 12:24 PM, Parav Pandit wrote:
>>> Extend the devlink interface so the user can add and delete a port.
>>> Extend devlink to connect user requests to the driver to add/delete
>>> such a port in the device.
>>>
>>> When the driver routines are invoked, the devlink instance lock is not
>>> held. This enables the driver to register and unregister several
>>> devlink objects (port, health reporter, resource, etc.) using existing
>>> devlink APIs.
>>> This also allows the same code to be used for port unregistration
>>> during driver unload and during user-initiated port deletion.
>>>
>>> Examples of add, show and delete commands:
>>> $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev
>>>
>>> $ devlink port show
>>> pci/0000:06:00.0/65535: type eth netdev ens2f0np0 flavour physical
>>> port 0 splittable false
>>>
>>> $ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88
>>>
>>> $ devlink port show pci/0000:06:00.0/32768
>>> pci/0000:06:00.0/32768: type eth netdev eth0 flavour pcisf controller 0
>>>   pfnum 0 sfnum 88 external false splittable false
>>> function:
>>> hw_addr 00:00:00:00:88:88 state inactive opstate detached
>>>
>>
>> There have to be limits on the number of subfunctions that can be created
>> for a device. How does a user find that limit?
> Yes, this came up internally, but it didn't really converge.
> The devlink resource looked too verbose for average or simple use cases.
> But it may be fine.
> The hurdle I faced with devlink resource was defining the granularity.
>
> For example, one devlink instance deploys subfunctions on multiple PCI functions.
> So how should they be named? Currently we have the controller and PF in the port annotation.
> So the resource names would be:
> c0pf0_subfunctions -> for controller 0, pf 0
> c1pf2_subfunctions -> for controller 1, pf 2
>
> I couldn't convince myself to name it this way.
>
> The example below looked simpler to use, but the plumbing doesn't exist for it.
>
> $ devlink resource show pci/0000:03:00.0
> pci/0000:03:00.0/1: name max_sfs count 256 controller 0 pf 0
> pci/0000:03:00.0/2: name max_sfs count 100 controller 1 pf 0
> pci/0000:03:00.0/3: name max_sfs count 64 controller 1 pf 1
>
> $ devlink resource set pci/0000:03:00.0/1 max_sfs 100
>
> The second option I was considering was to use port params, which doesn't seem as right as a resource.
>
>>
>> Also, it seems like there are hardware constraints at play. e.g., can a user
>> reduce the number of queues used by the physical function to support more
>> subfunctions? If so, how does a user programmatically learn about this
>> limitation? e.g., devlink could support showing resource sizing and
>> configuring constraints, similar to what mlxsw has.
> Yes, we need to figure out the naming. For mlx5 the number of queues has no relation to subfunctions.
> But the PCI resource does have a relation, and that is something we want to do in the future, as you said, maybe using devlink resource.
>
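For reference, the mlxsw resource interface I had in mind looks roughly like
this (output trimmed, sizes purely illustrative):

$ devlink resource show pci/0000:03:00.0
pci/0000:03:00.0:
  name kvd size 245760 unit entry
    resources:
      name linear size 98304 occ 0 unit entry size_min 0 size_max 147456 size_gran 128

$ devlink resource set pci/0000:03:00.0 path /kvd/linear size 98304
$ devlink dev reload pci/0000:03:00.0

Something similar, scoped per controller/PF, could express a max_sfs limit
without inventing resource names like c0pf0_subfunctions.
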
With ConnectX-4 Lx, for example, the netdev can have at most 63 queues,
leaving 96-cpu servers a bit short - an example of the limited number of
queues that a nic can handle (or currently exposes to the OS, not sure
which). If I create a subfunction for ethernet traffic, how many queues
are allocated to it by default? Is it managed via ethtool like the PF,
and is there an impact on the resources used by / available to the
primary device?
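
If the SF shows up as a regular netdev (eth0 in the example above), I would
expect the usual ethtool channel interface to apply - purely as a sketch,
assuming the SF driver wires up the channel ops:

$ ethtool -l eth0              # show pre-set maximum and current channel counts
$ ethtool -L eth0 combined 4   # request 4 combined channels, if supported

but it would help to have that spelled out, along with how those queues are
accounted against the parent device.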