netdev - Re: [PATCH net 2/3] ionic: no double destroy workqueue

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <a8faf111-281a-450e-b595-ba35a7ccc66d@amd.com>
Date: Tue, 10 Dec 2024 13:44:30 -0800
From: "Nelson, Shannon" <shannon.nelson@....com>
To: Jacob Keller <jacob.e.keller@...el.com>, netdev@...r.kernel.org,
 davem@...emloft.net, kuba@...nel.org, edumazet@...gle.com,
 pabeni@...hat.com, andrew+netdev@...n.ch
Cc: brett.creeley@....com
Subject: Re: [PATCH net 2/3] ionic: no double destroy workqueue

On 12/10/2024 1:02 PM, Jacob Keller wrote:
> On 12/10/2024 9:48 AM, Shannon Nelson wrote:
>> There are some FW error handling paths that can cause us to
>> try to destroy the workqueue more than once, so let's be sure
>> we're checking for that.
>>
>> The case where this popped up was in an AER event where the
>> handlers got called in such a way that ionic_reset_prepare()
>> and thus ionic_dev_teardown() got called twice in a row.
>> The second time through the workqueue was already destroyed,
>> and destroy_workqueue() choked on the bad wq pointer.
>>
>> We didn't hit this in AER handler testing before because at
>> that time we weren't using a private workqueue.  Later we
>> replaced the use of the system workqueue with our own private
>> workqueue but hadn't rerun the AER handler testing since then.
>>
>> Fixes: 9e25450da700 ("ionic: add private workqueue per-device")
>> Signed-off-by: Shannon Nelson <shannon.nelson@....com>
>> ---
>>   drivers/net/ethernet/pensando/ionic/ionic_dev.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_dev.c b/drivers/net/ethernet/pensando/ionic/ionic_dev.c
>> index 9e42d599840d..57edcde9e6f8 100644
>> --- a/drivers/net/ethernet/pensando/ionic/ionic_dev.c
>> +++ b/drivers/net/ethernet/pensando/ionic/ionic_dev.c
>> @@ -277,7 +277,10 @@ void ionic_dev_teardown(struct ionic *ionic)
>>        idev->phy_cmb_pages = 0;
>>        idev->cmb_npages = 0;
>>
>> -     destroy_workqueue(ionic->wq);
>> +     if (ionic->wq) {
>> +             destroy_workqueue(ionic->wq);
>> +             ionic->wq = NULL;
>> +     }
> 
> This seems like you still could race if two threads call
> ionic_dev_teardown twice. Is that not possible due to some other
> synchronization mechanism?

Good question.  Thanks for looking at this and the other patches.

This is not a race thing so much as an already-been-here thing.  This 
function is only called by the probe, remove, and reset_prepare threads, 
all driven as PCI calls.  I'm reasonably sure that they won't be called 
my simultaneous threads, so we just need to be sure that we don't break 
if reset_prepare and remove get called one after the other because some 
PCI bus element got removed by surprise.

sln


> 
> Thanks,
> Jake
> 
>>        mutex_destroy(&idev->cmb_inuse_lock);
>>   }
>>
>