[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <230d76ef-5ca8-400d-8f59-406292025da3@intel.com>
Date: Wed, 11 Dec 2024 11:23:03 -0800
From: Jacob Keller <jacob.e.keller@...el.com>
To: "Nelson, Shannon" <shannon.nelson@....com>, <netdev@...r.kernel.org>,
<davem@...emloft.net>, <kuba@...nel.org>, <edumazet@...gle.com>,
<pabeni@...hat.com>, <andrew+netdev@...n.ch>
CC: <brett.creeley@....com>
Subject: Re: [PATCH net 2/3] ionic: no double destroy workqueue
On 12/10/2024 1:44 PM, Nelson, Shannon wrote:
> On 12/10/2024 1:02 PM, Jacob Keller wrote:
>> On 12/10/2024 9:48 AM, Shannon Nelson wrote:
>>> There are some FW error handling paths that can cause us to
>>> try to destroy the workqueue more than once, so let's be sure
>>> we're checking for that.
>>>
>>> The case where this popped up was in an AER event where the
>>> handlers got called in such a way that ionic_reset_prepare()
>>> and thus ionic_dev_teardown() got called twice in a row.
>>> The second time through the workqueue was already destroyed,
>>> and destroy_workqueue() choked on the bad wq pointer.
>>>
>>> We didn't hit this in AER handler testing before because at
>>> that time we weren't using a private workqueue. Later we
>>> replaced the use of the system workqueue with our own private
>>> workqueue but hadn't rerun the AER handler testing since then.
>>>
>>> Fixes: 9e25450da700 ("ionic: add private workqueue per-device")
>>> Signed-off-by: Shannon Nelson <shannon.nelson@....com>
>>> ---
>>> drivers/net/ethernet/pensando/ionic/ionic_dev.c | 5 ++++-
>>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_dev.c b/drivers/net/ethernet/pensando/ionic/ionic_dev.c
>>> index 9e42d599840d..57edcde9e6f8 100644
>>> --- a/drivers/net/ethernet/pensando/ionic/ionic_dev.c
>>> +++ b/drivers/net/ethernet/pensando/ionic/ionic_dev.c
>>> @@ -277,7 +277,10 @@ void ionic_dev_teardown(struct ionic *ionic)
>>> idev->phy_cmb_pages = 0;
>>> idev->cmb_npages = 0;
>>>
>>> - destroy_workqueue(ionic->wq);
>>> + if (ionic->wq) {
>>> + destroy_workqueue(ionic->wq);
>>> + ionic->wq = NULL;
>>> + }
>>
>> This seems like you still could race if two threads call
>> ionic_dev_teardown twice. Is that not possible due to some other
>> synchronization mechanism?
>
> Good question. Thanks for looking at this and the other patches.
>
> This is not a race thing so much as an already-been-here thing. This
> function is only called by the probe, remove, and reset_prepare threads,
> all driven as PCI calls. I'm reasonably sure that they won't be called
> my simultaneous threads, so we just need to be sure that we don't break
> if reset_prepare and remove get called one after the other because some
> PCI bus element got removed by surprise.
>
> sln
Ok. This is all serialized by the device/PCI layer then?
Makes sense.
Reviewed-by: Jacob Keller <jacob.e.keller@...el.com>
>
>
>>
>> Thanks,
>> Jake
>>
>>> mutex_destroy(&idev->cmb_inuse_lock);
>>> }
>>>
>>
>
Powered by blists - more mailing lists