Message-ID: <874jm6nsd0.fsf@intel.com>
Date: Fri, 14 Jul 2023 13:35:55 -0700
From: Vinicius Costa Gomes <vinicius.gomes@...el.com>
To: Bjorn Helgaas <helgaas@...nel.org>, Kai-Heng Feng
<kai.heng.feng@...onical.com>
Cc: jesse.brandeburg@...el.com, anthony.l.nguyen@...el.com,
linux-pci@...r.kernel.org, "Guilherme G . Piccoli" <gpiccoli@...lia.com>,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet
<edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni
<pabeni@...hat.com>, Kees Cook <keescook@...omium.org>, Tony Luck
<tony.luck@...el.com>, intel-wired-lan@...ts.osuosl.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-hardening@...r.kernel.org
Subject: Re: [PATCH v2] igc: Ignore AER reset when device is suspended

Bjorn Helgaas <helgaas@...nel.org> writes:
> On Fri, Jul 14, 2023 at 01:05:41PM +0800, Kai-Heng Feng wrote:
>> When a system is connected to a Thunderbolt dock equipped with an I225,
>> such as the HP Thunderbolt Dock G4, the I225 stops working after S3 resume:
>> ...
>
>> The issue is that PTM requests are sent before the driver resumes the
>> device. Since the issue can also be observed on Windows, it's quite
>> likely a firmware/hardware limitation.
>
> Does this mean we didn't disable PTM correctly on suspend? Or is the
> device defective and sending PTM requests even though PTM is disabled?
>

As I understand the hardware bug, the device is defective, as you said:
it sends PTM messages even when "busmastering" is disabled.

> If the latter, I vote for a quirk that just disables PTM completely
> for this device.
>

I would suggest treating such a quirk as a last resort. There are
users/customers who depend on the increased time synchronization
accuracy that PTM provides.

> This check in .error_detected() looks out of place to me because
> there's no connection between AER and PTM, there's no connection
> between PTM and the device being enabled, and the connection between
> the device being enabled and being fully resumed is a little tenuous.
>
> If we must do it this way, maybe add a comment about *why* we're
> checking pci_is_enabled(). Otherwise this will be copied to other
> drivers that don't need it.
Makes total sense, from my side.
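
For illustration only (this is not part of the patch), a minimal sketch of
how a commented version of the check could read. To keep it compilable
outside the kernel tree, every type and name below is a hypothetical
stand-in: `enabled` models what pci_is_enabled() reports, `perm_failure`
models state == pci_channel_io_perm_failure, and the final "need reset"
return is an assumption about the part of the handler the diff does not
show.

```c
#include <stdbool.h>

/* Stand-in types for illustration only; NOT the kernel's definitions. */
enum sketch_ers_result {
	SKETCH_ERS_IGNORED = 0,	/* mirrors the literal 0 returned in the patch */
	SKETCH_ERS_NEED_RESET,	/* assumed outcome for the path not shown in the diff */
	SKETCH_ERS_DISCONNECT,
};

struct sketch_dev {
	bool enabled;		/* models pci_is_enabled(pdev) */
	bool perm_failure;	/* models state == pci_channel_io_perm_failure */
	bool detached;		/* set where netif_device_detach() would run */
};

enum sketch_ers_result sketch_error_detected(struct sketch_dev *dev)
{
	/*
	 * This device can raise errors (triggered by PTM requests) before
	 * the driver has resumed it. The device is not enabled at that
	 * point, so skip the reset and let the normal resume path bring it
	 * back. This works around one device's behavior and should not be
	 * copied to drivers that do not need it.
	 */
	if (!dev->enabled)
		return SKETCH_ERS_IGNORED;

	dev->detached = true;

	if (dev->perm_failure)
		return SKETCH_ERS_DISCONNECT;

	return SKETCH_ERS_NEED_RESET;
}
```

The point of the comment is to tie the early return to this specific
hardware behavior, so the check is not mistaken for a general pattern.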
>
>> So avoid resetting the device if it's not resumed. Once the device is
>> fully resumed, the device can work normally.
>>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216850
>> Reviewed-by: Guilherme G. Piccoli <gpiccoli@...lia.com>
>> Acked-by: Vinicius Costa Gomes <vinicius.gomes@...el.com>
>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@...onical.com>
>>
>> ---
>> v2:
>> - Fix typo.
>> - Mention the product name.
>>
>> drivers/net/ethernet/intel/igc/igc_main.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
>> index 9f93f0f4f752..8c36bbe5e428 100644
>> --- a/drivers/net/ethernet/intel/igc/igc_main.c
>> +++ b/drivers/net/ethernet/intel/igc/igc_main.c
>> @@ -7115,6 +7115,9 @@ static pci_ers_result_t igc_io_error_detected(struct pci_dev *pdev,
>> struct net_device *netdev = pci_get_drvdata(pdev);
>> struct igc_adapter *adapter = netdev_priv(netdev);
>>
>> + if (!pci_is_enabled(pdev))
>> + return 0;
>> +
>> netif_device_detach(netdev);
>>
>> if (state == pci_channel_io_perm_failure)
>> --
>> 2.34.1
>>
>
--
Vinicius