Message-ID: <71f2f2cc-44e7-c315-2005-0b23c8f812af@amd.com>
Date: Tue, 21 Feb 2023 14:03:24 -0800
From: Shannon Nelson <shannon.nelson@....com>
To: Leon Romanovsky <leon@...nel.org>
Cc: netdev@...r.kernel.org, davem@...emloft.net, kuba@...nel.org,
drivers@...sando.io, brett.creeley@....com
Subject: Re: [PATCH v3 net-next 00/14] pds_core driver
On 2/20/23 10:53 PM, Leon Romanovsky wrote:
> On Mon, Feb 20, 2023 at 03:53:02PM -0800, Shannon Nelson wrote:
>> On 2/19/23 2:11 AM, Leon Romanovsky wrote:
>>> On Fri, Feb 17, 2023 at 02:55:44PM -0800, Shannon Nelson wrote:
>>>> Summary:
>>>> --------
>>>> This patchset implements new driver for use with the AMD/Pensando
>>>> Distributed Services Card (DSC), intended to provide core configuration
>>>> services through the auxiliary_bus for VFio and vDPA feature specific
>>>> drivers.
>>>
>>> Hi,
>>>
>>> I didn't look very deeply into this series, but three things caught my
>>> attention, and IMHO they need to be changed/redesigned before someone
>>> can consider merging it.
>>>
>>> 1. Use of bus_register_notifier to communicate between auxiliary devices.
>>> This whole concept makes the aux logic in this driver look horrid. The idea
>>> of the auxiliary bus is to separate an existing device into sub-devices, while
>>> every such sub-device is controlled through the relevant subsystem. The current
>>> implementation challenges this assumption by inventing its own logic.
>>> 2. devm_* interfaces. They are bad. Please don't use them in such a complex
>>> driver.
>>> 3. Listening to the PCI BOUND_DRIVER event can't be correct either.
>>>
>>> Thanks
>>
>> Hi Leon,
>>
>> Thanks for your comments. I’ll respond to 1 and 3 together.
>>
>>> 1. Use of bus_register_notifier to communicate between auxiliary devices.
>>> 3. Listening to the PCI BOUND_DRIVER event can't be correct either
>>
>> We’re only using the notifier for the core driver to know when to create and
>> destroy auxiliary devices, not for communication between auxiliary devices or
>> drivers – I agree, that would be ugly.
>>
>> We want to create the auxiliary device after a VF PCI device is bound to a
>> driver (BUS_NOTIFY_BOUND_DRIVER), and remove that auxiliary device just
>> before a VF is unbound from its PCI driver (BUS_NOTIFY_UNBIND_DRIVER);
>> bus_register_notifier() gives us access to these events. I believe this is
>> not too far off from other examples that can be seen in
>> vfio/pci/vfio_pci_core, net/ethernet/ibm/emac, and net/mctp.
>>
>> Early on we were creating and deleting the auxiliary devices at the same
>> time as calling sriov_enable() and sriov_disable(), but found that when the
>> user was doing unbind/bind operations we could end up with inoperative
>> clients and occasional kernel panics because the devices and drivers were
>> out of sync. Listening for the bind/unbind events allows the pds_core
>> driver to make sure the auxiliary devices serving each of the VF drivers are
>> kept in sync with the state of the VF PCI drivers.
>
> Auxiliary devices are intended to statically re-partition existing physical devices.
> You are not supposed to create such devices on the fly. So once you create a VF physical
> device, it should already have an aux device created too.
>
> This is the traditional driver device model, which you should follow rather than
> reinventing your own scheme just to work around bugs.
Not so much a “bug”, I think, as an incomplete model, which we addressed
by using a hotplug model for the aux-devices. We felt we were following
a part of the device model, but perhaps from the wrong angle:
hot-plugging the aux-devices worked nicely to simplify the
aux-driver/pci-driver relationship. However, I think the hole we fell
into was expecting the client to be tied to a pci device on the other
end, and that shouldn’t be a factor in the core’s management of the aux
device.
So, the aux-device built by the core needs to remain the constant, which
is what we originally had: the aux devices created at pci_enable_sriov()
time and torn down at pci_disable_sriov() time. It’s easy enough to go
back to that model, and that makes sense.
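As a rough sketch of that direction (placeholder names like
pdsc_sriov_configure() and a single aux device rather than one per VF,
so not the actual patch code), the create/teardown would sit in the PF
driver's .sriov_configure callback:

#include <linux/pci.h>
#include <linux/auxiliary_bus.h>

struct pdsc {
	struct pci_dev *pdev;
	struct auxiliary_device adev;
	bool adev_added;
};

static void pdsc_adev_release(struct device *dev)
{
	/* adev is embedded in struct pdsc, nothing separate to free here */
}

static int pdsc_sriov_configure(struct pci_dev *pdev, int num_vfs)
{
	struct pdsc *pdsc = pci_get_drvdata(pdev);
	int err;

	if (num_vfs) {
		err = pci_enable_sriov(pdev, num_vfs);
		if (err)
			return err;

		/* aux device exists as soon as the VFs do, regardless of
		 * whether or when a VF driver binds
		 */
		pdsc->adev.name = "vdpa";	/* placeholder aux dev name */
		pdsc->adev.id = 0;
		pdsc->adev.dev.parent = &pdev->dev;
		pdsc->adev.dev.release = pdsc_adev_release;
		err = auxiliary_device_init(&pdsc->adev);
		if (err)
			goto err_sriov;
		err = auxiliary_device_add(&pdsc->adev);
		if (err) {
			auxiliary_device_uninit(&pdsc->adev);
			goto err_sriov;
		}
		pdsc->adev_added = true;

		return num_vfs;
	}

	/* ... and goes away only when SR-IOV is torn down */
	if (pdsc->adev_added) {
		auxiliary_device_delete(&pdsc->adev);
		auxiliary_device_uninit(&pdsc->adev);
		pdsc->adev_added = false;
	}
	pci_disable_sriov(pdev);

	return 0;

err_sriov:
	pci_disable_sriov(pdev);
	return err;
}

with .sriov_configure = pdsc_sriov_configure wired into the PF's
struct pci_driver.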
Meanwhile, for those clients that do rely on the existence of a pci
device, would you see it as a proper use of a bus listener for the
aux-driver to subscribe and listen for that device's bind/unbind events?
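Something along these lines, strictly as a sketch (the my_adrv_* names
and the matching logic are made up, not code from this series):

#include <linux/module.h>
#include <linux/notifier.h>
#include <linux/pci.h>

/* placeholder: recognize the VF this aux driver instance cares about */
static bool my_adrv_owns_vf(struct pci_dev *pdev)
{
	return false;
}

static int my_adrv_bus_event(struct notifier_block *nb,
			     unsigned long action, void *data)
{
	struct device *dev = data;
	struct pci_dev *pdev;

	if (!dev_is_pci(dev))
		return NOTIFY_DONE;

	pdev = to_pci_dev(dev);
	if (!my_adrv_owns_vf(pdev))
		return NOTIFY_DONE;

	switch (action) {
	case BUS_NOTIFY_BOUND_DRIVER:
		/* the VF now has its PCI driver: safe to start using it */
		break;
	case BUS_NOTIFY_UNBIND_DRIVER:
		/* the VF is about to lose its PCI driver: quiesce first */
		break;
	}

	return NOTIFY_OK;
}

static struct notifier_block my_adrv_nb = {
	.notifier_call = my_adrv_bus_event,
};

static int __init my_adrv_init(void)
{
	return bus_register_notifier(&pci_bus_type, &my_adrv_nb);
}
module_init(my_adrv_init);

static void __exit my_adrv_exit(void)
{
	bus_unregister_notifier(&pci_bus_type, &my_adrv_nb);
}
module_exit(my_adrv_exit);

MODULE_LICENSE("GPL");

That keeps the core's aux device static while letting the client that
actually cares track the VF's PCI driver state.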
>
> Additionally, it is unclear how much we should invest in this review
> given that the pds VFIO series was NACKed.
We haven’t given up on that one yet, and we have a vDPA client in mind,
as well as possibly one or two others.
>
>>
>>
>>> 2. devm_* interfaces
>>
>> Can you elaborate a bit on why you see using the devm interfaces as bad? I
>> don’t see the code complexity being any different from using the non-devm
>> interfaces, and these give the added protection of making sure not to leak
>> resources on driver removal, whether clean or on error. We are using these
>> allocations for longer-term driver structures, not for ephemeral short-term
>> buffers or fast-path operations, so the overhead should remain minimal.
>> It seems to me this is the advertised use as described in the devres notes
>> under Documentation.
>
> We had a very interesting talk at the last LPC, and the overall agreement
> across various subsystem maintainers was to stay away from the devm_* interfaces.
> https://lpc.events/event/16/contributions/1227/
>
> We prefer the explicit nature of kzalloc/kfree to the implicit approach
> of devm_kzalloc.
Interesting – thanks, I missed that presentation. Sure, we can pull
back on this.
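For the longer-lived structures, that would look roughly like this
(struct pdsc and the probe/remove pair are placeholder shapes, not the
actual patch code):

#include <linux/pci.h>
#include <linux/slab.h>

struct pdsc {
	struct pci_dev *pdev;
};

static int pdsc_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
	struct pdsc *pdsc;
	int err;

	/* explicit kzalloc() instead of devm_kzalloc(&pdev->dev, ...) */
	pdsc = kzalloc(sizeof(*pdsc), GFP_KERNEL);
	if (!pdsc)
		return -ENOMEM;
	pdsc->pdev = pdev;

	err = pci_enable_device(pdev);
	if (err)
		goto err_free;

	pci_set_drvdata(pdev, pdsc);

	return 0;

err_free:
	kfree(pdsc);	/* explicit unwind on the error path */
	return err;
}

static void pdsc_remove(struct pci_dev *pdev)
{
	struct pdsc *pdsc = pci_get_drvdata(pdev);

	pci_disable_device(pdev);
	kfree(pdsc);	/* explicit release, no devres teardown to rely on */
}

The error unwinds get a little longer, but the lifetime of each
allocation is spelled out in the driver itself.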
Thanks for your comments.
sln