netdev - Re: [PATCH v2] net/tg3: fix race condition in tg3_reset


Message-ID: <528e0a13-1b34-403e-9470-531c3fe677fa@linux.vnet.ibm.com>
Date: Thu, 16 Nov 2023 08:41:55 -0600
From: Thinh Tran <thinhtr@...ux.vnet.ibm.com>
To: Michael Chan <michael.chan@...adcom.com>
Cc: netdev@...r.kernel.org, siva.kallam@...adcom.com, prashant@...adcom.com,
        mchan@...adcom.com, pavan.chebbi@...adcom.com, drc@...ux.vnet.ibm.com,
        venkata.sai.duggi@....com
Subject: Re: [PATCH v2] net/tg3: fix race condition in tg3_reset_task()

I'll re-post the V2 patch shortly.
Thanks for the review.

Thinh Tran

On 11/15/2023 12:56 PM, Michael Chan wrote:
> On Wed, Nov 15, 2023 at 10:23 AM Thinh Tran <thinhtr@...ux.vnet.ibm.com> wrote:
>>
>>
>> On 11/14/2023 3:03 PM, Michael Chan wrote:
>>>
>>> Could you provide more information about the crashes?  The
>>> dev_watchdog() code already checks for netif_device_present() and
>>> netif_running() and netif_carrier_ok() before proceeding to check for
>>> TX timeout.  Why would adding some additional checks for PCI errors
>>> cause problems?  Of course the additional checks should be done
>>> on PCI devices only.  Thanks.
>>
>> Checking for PCI errors is not the problem; skipping the call to the
>> driver's ->ndo_tx_timeout() function is what causes some issues.
> 
> I see.  By skipping TX timeout during PCI errors, bnx2x crashes in
> .ndo_start_xmit() after EEH error recovery.
> 
> I think it should be fine to fix the original EEH issue in tg3 then.
> Please re-post the tg3 patch.  Thanks.