Message-ID: <Yz758wQfWAXADcpl@nanopsycho>
Date: Thu, 6 Oct 2022 17:53:23 +0200
From: Jiri Pirko <jiri@...nulli.us>
To: "Lucero Palau, Alejandro" <alejandro.lucero-palau@....com>
Cc: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"davem@...emloft.net" <davem@...emloft.net>,
"kuba@...nel.org" <kuba@...nel.org>,
"pabeni@...hat.com" <pabeni@...hat.com>,
"edumazet@...gle.com" <edumazet@...gle.com>,
"dmichail@...gible.com" <dmichail@...gible.com>,
"jesse.brandeburg@...el.com" <jesse.brandeburg@...el.com>,
"anthony.l.nguyen@...el.com" <anthony.l.nguyen@...el.com>,
"snelson@...sando.io" <snelson@...sando.io>,
"drivers@...sando.io" <drivers@...sando.io>,
"f.fainelli@...il.com" <f.fainelli@...il.com>,
"yangyingliang@...wei.com" <yangyingliang@...wei.com>
Subject: Re: [patch net-next 0/3] devlink: fix order of port and netdev
register in drivers
Thu, Oct 06, 2022 at 03:45:48PM CEST, alejandro.lucero-palau@....com wrote:
>
>On 10/6/22 14:44, Jiri Pirko wrote:
>> Wed, Oct 05, 2022 at 10:18:29AM CEST, alejandro.lucero-palau@....com wrote:
>>> On 10/5/22 09:49, Jiri Pirko wrote:
>>>> Tue, Oct 04, 2022 at 05:31:10PM CEST, alejandro.lucero-palau@....com wrote:
>>>>> Hi Jiri,
>>>> I don't understand why you send this as a reply to this patchset. I
>>>> don't see the relation to it.
>>> I thought there was a relationship with ordering being the issue.
>>>
>>> Apologies if this is not the right way of raising my concern.
>>>
>>>
>>>>> I think we have another issue with devlink_unregister and related
>>>>> devlink_port_unregister. It is likely not an issue with current drivers
>>>>> because the devlink ports are managed by netdev register/unregister
>>>>> code, and with your patch that will be fine.
>>>>>
>>>>> But by definition, devlink exists for those things that do not map
>>>>> smoothly to netdevs, so devlink ports unrelated to any existing netdev
>>>>> are to be expected. That is the case in a patch I'm working on for
>>>>> sfc ef100, where devlink ports are created at PF initialization, so
>>>>> the related netdevs will not be there at that point, and they may not
>>>>> exist either when the devlink ports are removed on driver removal.
>>>>>
>>>>> So the question in this case is, should the devlink ports unregister
>>>>> before or after their devlink unregisters?
>>>> Before. The devlink instance should be unregistered only after all other
>>>> related instances are gone.
>>>>
>>>> Also, the devlink ports come and go during the devlink lifetime, for
>>>> example when you add a VF or split a port. There are many other cases.
>>>>
>>>>
>>>>> Since the ports are in a list owned by the devlink struct, I think it
>>>>> seems logical to unregister the ports first, and that is what I did. It
>>>>> works but there exists a potential concurrency issue with devlink user
>>>> What concurrency issue are you talking about?
>>>>
>>> 1) devlink port function set ...
>>>
>>> 2) pre_doit inside devlink obtains the devlink, then the reference to the
>>> devlink port. The code does a put on the devlink but not on the devlink port.
>> devl_lock is taken here.
>
>This is embarrassing.
>
>Somehow I misread the code assuming the protection was only based on the
>get operation, that the devlink lock was released there and not in the
>post_doit.
>
>That goto unlock confused me, I guess, along with a bias looking for
>ordering issues.
>
>Apologies.
Np :) Happy to help.
>
>Happy to see all is fine.
>
>Thank you.
>
>>
>>> 3) driver is removed. devlink port is removed. devlink is not because
>> devl_lock taken before port is removed and will block there.
>>
>> I don't see any problem. Did you actually encounter any problem?
>>
>>
>>> the put.
>>>
>>> 4) devlink port reference is wrong.
>>>
>>>
>>>>> space operations. The devlink code takes care of race conditions involving the
>>>>> devlink struct with rcu plus get/put operations, but that is not the
>>>>> case for devlink ports.
>>>>>
>>>>> Interestingly, unregistering the devlink first, and doing so with the
>>>>> ports without touching/releasing the devlink struct would solve the
>>>>> problem, but not sure this is the right approach here. It does not seem
>>>> It is not. As I wrote above, the devlink ports come and go.
>>>>
>>>>
>>>>> clean, and it would require documenting the right unwinding order and
>>>>> to add a check for DEVLINK_REGISTERED in devlink_port_unregister.
>>>>>
>>>>> I think the right solution would be to add protection to devlink ports
>>>>> and likely other devlink objects with similar concurrency issues.
>>>>>
>>>>>
>>>>> Let me know what you think about it.
>>>>>
>>>>>
>>>>>
>>>>> On 9/26/22 13:09, Jiri Pirko wrote:
>>>>>>
>>>>>> From: Jiri Pirko <jiri@...dia.com>
>>>>>>
>>>>>> Some of the drivers use the wrong order in registering devlink port and
>>>>>> netdev, registering netdev first. That was not intended as the devlink
>>>>>> port is some sort of parent for the netdev. Fix the ordering.
>>>>>>
>>>>>> Note that the follow-up patchset is going to make this ordering
>>>>>> mandatory.
>>>>>>
>>>>>> Jiri Pirko (3):
>>>>>> funeth: unregister devlink port after netdevice unregister
>>>>>> ice: reorder PF/representor devlink port register/unregister flows
>>>>>> ionic: change order of devlink port register and netdev register
>>>>>>
>>>>>> .../net/ethernet/fungible/funeth/funeth_main.c | 2 +-
>>>>>> drivers/net/ethernet/intel/ice/ice_lib.c | 6 +++---
>>>>>> drivers/net/ethernet/intel/ice/ice_main.c | 12 ++++++------
>>>>>> drivers/net/ethernet/intel/ice/ice_repr.c | 2 +-
>>>>>> .../net/ethernet/pensando/ionic/ionic_bus_pci.c | 16 ++++++++--------
>>>>>> 5 files changed, 19 insertions(+), 19 deletions(-)
>>>>>>
>>>>>> --
>>>>>> 2.37.1
>>>>>>
>
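
For readers following the thread, the point settled above (the devlink
instance lock is taken in pre_doit and only released in post_doit, so a
concurrent port unregister blocks until the command has finished) can be
modelled in user space. This is only a rough sketch, not kernel code: the
Devlink class, the port dictionary and the function bodies are hypothetical
stand-ins, and a threading.Lock plays the role of the lock taken by
devl_lock.

```python
import threading


class Devlink:
    """Toy stand-in for a devlink instance; not kernel code."""

    def __init__(self):
        self.lock = threading.Lock()  # models the devlink instance lock
        self.ports = {}               # port index -> port state (hypothetical)


def pre_doit(devlink, port_index):
    """Take the instance lock, then look up the port. The lock stays held."""
    devlink.lock.acquire()
    port = devlink.ports.get(port_index)
    if port is None:
        devlink.lock.release()  # error path: drop the lock before bailing out
    return port


def post_doit(devlink):
    """Release the instance lock once the command handler has finished."""
    devlink.lock.release()


def port_unregister(devlink, port_index):
    """Driver removal path: needs the same lock, so it waits for post_doit."""
    with devlink.lock:
        devlink.ports.pop(port_index)["registered"] = False


def run_demo():
    devlink = Devlink()
    devlink.ports[0] = {"registered": True}

    port = pre_doit(devlink, 0)  # user-space command starts, lock held
    remover = threading.Thread(target=port_unregister, args=(devlink, 0))
    remover.start()
    remover.join(timeout=0.2)    # give the remover a chance to run

    blocked = remover.is_alive()           # remover is waiting on the lock
    valid_in_handler = port["registered"]  # port untouched during the command

    post_doit(devlink)           # command ends, lock released
    remover.join()               # now the unregister can complete
    return blocked, valid_in_handler, port["registered"]


if __name__ == "__main__":
    print(run_demo())
```

In this model the remover thread cannot touch the port while the command
handler is between pre_doit and post_doit, which is why the port reference
stays valid without a separate get/put on the port.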