Message-ID: <CAAd53p4Owt_ygt2f=38M0X2MxnPsXv=BHzSLRbprwW208MUVdQ@mail.gmail.com>
Date: Mon, 17 Jul 2023 15:38:09 +0800
From: Kai-Heng Feng <kai.heng.feng@...onical.com>
To: Bjorn Helgaas <helgaas@...nel.org>
Cc: jesse.brandeburg@...el.com, anthony.l.nguyen@...el.com,
linux-pci@...r.kernel.org,
"Guilherme G . Piccoli" <gpiccoli@...lia.com>,
Vinicius Costa Gomes <vinicius.gomes@...el.com>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Kees Cook <keescook@...omium.org>,
Tony Luck <tony.luck@...el.com>,
intel-wired-lan@...ts.osuosl.org, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-hardening@...r.kernel.org,
Aaron Ma <aaron.ma@...onical.com>
Subject: Re: [PATCH v2] igc: Ignore AER reset when device is suspended
[+Cc Aaron]
On Fri, Jul 14, 2023 at 10:54 PM Bjorn Helgaas <helgaas@...nel.org> wrote:
>
> On Fri, Jul 14, 2023 at 01:05:41PM +0800, Kai-Heng Feng wrote:
> > When a system that connects to a Thunderbolt dock equipped with I225,
> > like HP Thunderbolt Dock G4, I225 stops working after S3 resume:
> > ...
>
> > The issue is that the PTM requests are sending before driver resumes the
> > device. Since the issue can also be observed on Windows, it's quite
> > likely a firmware/hardware limitation.
>
> Does this mean we didn't disable PTM correctly on suspend? Or is the
PTM is disabled correctly during suspend, by commit c01163dbd1b8
("PCI/PM: Always disable PTM for all devices during suspend").
Before that commit, suspend would fail.
> device defective and sending PTM requests even though PTM is disabled?
Yes. On S3 resume, I guess the firmware resets the dock and/or the I225,
so PTM requests start even before the OS has resumed.
AFAIK the issue doesn't happen when s2idle is used.
>
> If the latter, I vote for a quirk that just disables PTM completely
> for this device.
S3 resume enables PTM regardless of OS involvement, so I don't
think this will work.
>
> This check in .error_detected() looks out of place to me because
> there's no connection between AER and PTM, there's no connection
> between PTM and the device being enabled, and the connection between
> the device being enabled and being fully resumed is a little tenuous.
True. This patch is just a workaround.
Have you considered my other proposed approaches, such as disabling AER
completely during suspend, or even deferring the resume of PCIe services
until after the entire hierarchy is resumed?
>
> If we must do it this way, maybe add a comment about *why* we're
> checking pci_is_enabled(). Otherwise this will be copied to other
> drivers that don't need it.
Sure.
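As a rough illustration of what a commented check could look like, here
is a user-space sketch, not the actual driver code. All types, enum
values and the fake_pci_is_enabled() helper are hypothetical stand-ins
for the kernel definitions, and SKETCH_RESULT_IGNORED only mirrors the
literal "return 0" in the patch:

```c
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the kernel's pci_ers_result_t values. */
typedef enum {
	SKETCH_RESULT_IGNORED = 0,	/* mirrors the "return 0" in the patch */
	SKETCH_RESULT_NEED_RESET,
	SKETCH_RESULT_DISCONNECT,
} sketch_ers_result_t;

/* Hypothetical stand-ins for the PCI channel states. */
typedef enum {
	SKETCH_IO_NORMAL,
	SKETCH_IO_FROZEN,
	SKETCH_IO_PERM_FAILURE,
} sketch_channel_state_t;

/* Minimal fake device: just enough state to exercise the check. */
struct fake_pci_dev {
	int enable_cnt;		/* > 0 means the driver has enabled the device */
	bool detached;		/* set where the driver would detach the netdev */
};

/* Assumed analogue of pci_is_enabled(): true once the device is enabled. */
static bool fake_pci_is_enabled(const struct fake_pci_dev *pdev)
{
	return pdev->enable_cnt > 0;
}

static sketch_ers_result_t
sketch_io_error_detected(struct fake_pci_dev *pdev, sketch_channel_state_t state)
{
	/*
	 * On S3 resume the device may send PTM requests before the driver
	 * has resumed it, which triggers AER. The device is not enabled at
	 * that point, so skip the reset and let the normal resume path
	 * bring it back. This is a workaround specific to this device and
	 * should not be copied to drivers that don't need it.
	 */
	if (!fake_pci_is_enabled(pdev))
		return SKETCH_RESULT_IGNORED;

	/* Stand-in for netif_device_detach(). */
	pdev->detached = true;

	if (state == SKETCH_IO_PERM_FAILURE)
		return SKETCH_RESULT_DISCONNECT;

	return SKETCH_RESULT_NEED_RESET;
}
```

The point is only the placement and wording of the comment; the real
handler does more work after the check than this sketch shows.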
Kai-Heng
>
> > So avoid resetting the device if it's not resumed. Once the device is
> > fully resumed, the device can work normally.
> >
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=216850
> > Reviewed-by: Guilherme G. Piccoli <gpiccoli@...lia.com>
> > Acked-by: Vinicius Costa Gomes <vinicius.gomes@...el.com>
> > Signed-off-by: Kai-Heng Feng <kai.heng.feng@...onical.com>
> >
> > ---
> > v2:
> > - Fix typo.
> > - Mention the product name.
> >
> > drivers/net/ethernet/intel/igc/igc_main.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
> > index 9f93f0f4f752..8c36bbe5e428 100644
> > --- a/drivers/net/ethernet/intel/igc/igc_main.c
> > +++ b/drivers/net/ethernet/intel/igc/igc_main.c
> > @@ -7115,6 +7115,9 @@ static pci_ers_result_t igc_io_error_detected(struct pci_dev *pdev,
> > struct net_device *netdev = pci_get_drvdata(pdev);
> > struct igc_adapter *adapter = netdev_priv(netdev);
> >
> > + if (!pci_is_enabled(pdev))
> > + return 0;
> > +
> > netif_device_detach(netdev);
> >
> > if (state == pci_channel_io_perm_failure)
> > --
> > 2.34.1
> >