Message-ID: <f04da4a9-df6d-3002-ea10-12eaf2637331@intel.com>
Date: Wed, 18 Nov 2020 16:52:42 -0800
From: Jacob Keller <jacob.e.keller@...el.com>
To: Parav Pandit <parav@...dia.com>, David Ahern <dsahern@...il.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>
Cc: Jiri Pirko <jiri@...dia.com>, Jason Gunthorpe <jgg@...dia.com>,
"dledford@...hat.com" <dledford@...hat.com>,
Leon Romanovsky <leonro@...dia.com>,
Saeed Mahameed <saeedm@...dia.com>,
"kuba@...nel.org" <kuba@...nel.org>,
"davem@...emloft.net" <davem@...emloft.net>,
Vu Pham <vuhuong@...dia.com>
Subject: Re: [PATCH net-next 03/13] devlink: Support add and delete devlink port
On 11/18/2020 9:02 AM, Parav Pandit wrote:
>
>> From: David Ahern <dsahern@...il.com>
>> Sent: Wednesday, November 18, 2020 9:51 PM
>>
>> On 11/12/20 12:24 PM, Parav Pandit wrote:
>>> Extended devlink interface for the user to add and delete port.
>>> Extend devlink to connect user requests to driver to add/delete such
>>> port in the device.
>>>
>>> When driver routines are invoked, devlink instance lock is not held.
>>> This enables driver to perform several devlink objects registration,
>>> unregistration such as (port, health reporter, resource etc) by using
>>> existing devlink APIs.
>>> This also helps to uniformly use the code for port unregistration
>>> during driver unload and during port deletion initiated by user.
>>>
>>> Examples of add, show and delete commands:
>>> $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev
>>>
>>> $ devlink port show
>>> pci/0000:06:00.0/65535: type eth netdev ens2f0np0 flavour physical
>>> port 0 splittable false
>>>
>>> $ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88
>>>
>>> $ devlink port show pci/0000:06:00.0/32768
>>> pci/0000:06:00.0/32768: type eth netdev eth0 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false
>>> function:
>>> hw_addr 00:00:00:00:88:88 state inactive opstate detached
>>>
>>
>> There has to be limits on the number of sub functions that can be created for
>> a device. How does a user find that limit?
> Yes, this came up internally, but the discussion didn't really converge.
> The devlink resource looked too verbose for average or simple use cases.
> But it may be fine.
> The hurdle I faced with devlink resource is with defining the granularity.
>
> For example one devlink instance deploys sub functions on multiple pci functions.
> So how to name them? Currently we have controller and PFs in port annotation.
> So resource name as
> c0pf0_subfunctions -> for controller 0, pf 0
> c1pf2_subfunctions -> for controller 1, pf 2
>
> Couldn't convince myself to name it this way.
Yea, I think we need to extend the plumbing of resources to allow
specifying or assigning parent resources to a subfunction.
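As a rough illustration of that idea, the sketch below models per-parent subfunction limits keyed by (controller, pf) instead of encoding the parent in the resource name. It is a plain Python model, not kernel code or existing devlink API; the class and method names (`SfResource`, `DevlinkModel`, `register`, `add_sf`, `set_max_sfs`) are hypothetical, and the limit values are borrowed from the max_sfs example further down in this message.

```python
from dataclasses import dataclass, field


@dataclass
class SfResource:
    """Hypothetical per-parent subfunction limit (not a kernel structure)."""
    controller: int
    pf: int
    max_sfs: int
    sfnums: set = field(default_factory=set)

    def remaining(self) -> int:
        # Number of subfunctions that can still be created under this parent.
        return self.max_sfs - len(self.sfnums)


class DevlinkModel:
    """Toy model of one devlink instance spanning several controllers/PFs."""

    def __init__(self, handle: str):
        self.handle = handle
        self.resources = {}  # (controller, pf) -> SfResource

    def register(self, controller: int, pf: int, max_sfs: int) -> None:
        # Attach a limit to a parent identified by (controller, pf).
        self.resources[(controller, pf)] = SfResource(controller, pf, max_sfs)

    def add_sf(self, controller: int, pf: int, sfnum: int) -> None:
        # Reject duplicates and enforce the parent's limit before adding.
        res = self.resources[(controller, pf)]
        if sfnum in res.sfnums:
            raise ValueError(f"sfnum {sfnum} already in use")
        if res.remaining() == 0:
            raise RuntimeError("subfunction limit reached")
        res.sfnums.add(sfnum)

    def set_max_sfs(self, controller: int, pf: int, max_sfs: int) -> None:
        # Refuse to shrink the limit below what is already allocated.
        res = self.resources[(controller, pf)]
        if max_sfs < len(res.sfnums):
            raise ValueError("limit below current usage")
        res.max_sfs = max_sfs
```

Keying on the parent keeps a single resource name (max_sfs) while still letting a user ask how many subfunctions remain for a given controller and PF.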
>
> The example below looked simpler to use, but the plumbing doesn’t exist for it.
>
> $ devlink resource show pci/0000:03:00.0
> pci/0000:03:00.0/1: name max_sfs count 256 controller 0 pf 0
> pci/0000:03:00.0/2: name max_sfs count 100 controller 1 pf 0
> pci/0000:03:00.0/3: name max_sfs count 64 controller 1 pf 1
>
> $ devlink resource set pci/0000:03:00.0/1 max_sfs 100
>
> The second option I was considering was to use port params, which doesn't sound as right as a resource.
>
I don't think port parameters make sense here. They only encapsulate
single name -> value pairs, and don't really help show the relationships
between the subfunction ports and the parent device.
>>
>> Also, seems like there are hardware constraints at play. e.g., can a user reduce
>> the number of queues used by the physical function to support more
>> sub-functions? If so how does a user programmatically learn about this limitation?
>> e.g., devlink could have support to show resource sizing and configure
>> constraints similar to what mlxsw has.
> Yes, need to figure out its naming. For mlx5, the number of queues has no relation to subfunctions.
> But the PCI resource does have a relation, and this is something we want to do in the future, as you said, maybe using devlink resource.
>
I've been looking into queue management and being able to add and remove
queue groups and queues. I'm leaning towards building on top of devlink
resource for this.
Specifically, I have been looking at picking up the work started by
Magnus last year around creating an interface for better representing
queues to the stack for AF_XDP, but it also has other possible uses.
I'd like to make sure it aligns with the ideas here for partitioning
resources. It seems like that should be best done at the devlink level,
where the main devlink instance knows about all the part limitations and
can then have new commands for allowing assignment of resources to ports.
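To make the partitioning idea concrete, here is a minimal sketch of a device-level pool that hands out a fixed budget of some resource (queues, for instance) to ports and refuses assignments that exceed it. This is only a Python model under assumed semantics; `ResourcePool`, `assign`, and `free` are hypothetical names, not existing or proposed devlink commands, and the totals are made up.

```python
class ResourcePool:
    """Hypothetical device-wide pool of one resource, partitioned among ports."""

    def __init__(self, name: str, total: int):
        self.name = name
        self.total = total
        self.assigned = {}  # port handle -> units currently assigned

    def free(self) -> int:
        # Units not yet assigned to any port.
        return self.total - sum(self.assigned.values())

    def assign(self, port: str, count: int) -> None:
        # Set the port's share to `count`; growing is limited by free units.
        if count < 0:
            raise ValueError("count must be non-negative")
        current = self.assigned.get(port, 0)
        if count - current > self.free():
            raise ValueError(f"not enough free {self.name} in the pool")
        if count == 0:
            self.assigned.pop(port, None)
        else:
            self.assigned[port] = count
```

In this model, shrinking the physical port's share is what frees units for a subfunction port, which is the kind of trade-off the main devlink instance would have to arbitrate.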