Message-ID: <CAFv23Q=HqyL4EGjG0VdsQH9rP0_DbRdpExbeJy6DAoKQ0OMbkA@mail.gmail.com>
Date: Tue, 22 Oct 2024 21:05:39 +0800
From: AceLan Kao <acelan.kao@...onical.com>
To: Lukas Wunner <lukas@...ner.de>
Cc: Bjorn Helgaas <bhelgaas@...gle.com>, Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>, 
	linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] PCI: pciehp: Fix system hang on resume after hot-unplug
 during suspend

On Thu, Oct 17, 2024 at 10:40 AM AceLan Kao <acelan.kao@...onical.com> wrote:
>
> On Mon, Oct 7, 2024 at 12:34 PM AceLan Kao <acelan.kao@...onical.com> wrote:
> >
> > On Tue, Oct 1, 2024 at 7:03 PM Lukas Wunner <lukas@...ner.de> wrote:
> > >
> > > On Tue, Oct 01, 2024 at 01:02:46PM +0200, Lukas Wunner wrote:
> > > > On Mon, Sep 30, 2024 at 09:31:53AM +0800, AceLan Kao wrote:
> > > > > > On Sep 28, 2024 at 8:51, Lukas Wunner <lukas@...ner.de> wrote:
> > > > > > -       if (pci_get_dsn(pdev) != ctrl->dsn)
> > > > > > +       dsn = pci_get_dsn(pdev);
> > > > > > +       if (!PCI_POSSIBLE_ERROR(dsn) &&
> > > > > > +           dsn != ctrl->dsn)
> > > > > >                 return true;
> > > > >
> > > > > In my case, the pciehp_device_replaced() returns true from this final check.
> > > > > And these are the values I got
> > > > > dsn = 0x00000000, ctrl->dsn = 0x7800AA00
> > > > > dsn = 0x00000000, ctrl->dsn = 0x21B7D000
> > > >
> > > > Ah because pci_get_dsn() returns 0 if the device is gone.
> > > > Below is a modified patch which returns false in that case.
> > >
> > > Sorry, forgot to include the patch:
> > >
> > > -- >8 --
> > >
> > > diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c
> > > index ff458e6..957c320 100644
> > > --- a/drivers/pci/hotplug/pciehp_core.c
> > > +++ b/drivers/pci/hotplug/pciehp_core.c
> > > @@ -287,24 +287,32 @@ static int pciehp_suspend(struct pcie_device *dev)
> > >  static bool pciehp_device_replaced(struct controller *ctrl)
> > >  {
> > >         struct pci_dev *pdev __free(pci_dev_put);
> > > +       u64 dsn;
> > >         u32 reg;
> > >
> > >         pdev = pci_get_slot(ctrl->pcie->port->subordinate, PCI_DEVFN(0, 0));
> > >         if (!pdev)
> > > +               return false;
> > > +
> > > +       if (pci_read_config_dword(pdev, PCI_VENDOR_ID, &reg) == 0 &&
> > > +           !PCI_POSSIBLE_ERROR(reg) &&
> > > +           reg != (pdev->vendor | (pdev->device << 16)))
> > >                 return true;
> > >
> > > -       if (pci_read_config_dword(pdev, PCI_VENDOR_ID, &reg) ||
> > > -           reg != (pdev->vendor | (pdev->device << 16)) ||
> > > -           pci_read_config_dword(pdev, PCI_CLASS_REVISION, &reg) ||
> > > +       if (pci_read_config_dword(pdev, PCI_CLASS_REVISION, &reg) == 0 &&
> > > +           !PCI_POSSIBLE_ERROR(reg) &&
> > >             reg != (pdev->revision | (pdev->class << 8)))
> > >                 return true;
> > >
> > >         if (pdev->hdr_type == PCI_HEADER_TYPE_NORMAL &&
> > > -           (pci_read_config_dword(pdev, PCI_SUBSYSTEM_VENDOR_ID, &reg) ||
> > > -            reg != (pdev->subsystem_vendor | (pdev->subsystem_device << 16))))
> > > +           pci_read_config_dword(pdev, PCI_SUBSYSTEM_VENDOR_ID, &reg) == 0 &&
> > > +           !PCI_POSSIBLE_ERROR(reg) &&
> > > +           reg != (pdev->subsystem_vendor | (pdev->subsystem_device << 16)))
> > >                 return true;
> > >
> > > -       if (pci_get_dsn(pdev) != ctrl->dsn)
> > > +       if ((dsn = pci_get_dsn(pdev)) &&
> > > +           !PCI_POSSIBLE_ERROR(dsn) &&
> > > +           dsn != ctrl->dsn)
> > >                 return true;
> > >
> > >         return false;
> > Hi Lukas,
> >
> > Sorry for the late reply. A strong typhoon hit my area last week, so
> > I couldn't check this in our lab.
> >
> > With this patch, pciehp_device_replaced() returns false at the end of
> > the function in my case.
> > Unplugging the dock, which has a tbt storage device connected to it,
> > is not considered a replacement.
> >
> > But when I replaced the dock with the tbt storage device during
> > suspend, it still returned false at the end of the function, just as
> > in the unplug case.
> >
> > BTW, it's a new model, so I think the ICM is used. And reg is
> > 0xffffffff when unplugged.
> Hi Lukas,
>
> PCI_POSSIBLE_ERROR() always returns true, no matter whether the device
> is replaced or unplugged.
> So it seems difficult to distinguish a replaced device from an
> unplugged one.
>
> Do you have any ideas on how to fix the issue?
> This issue is severe for me, because the system hangs almost every time
> daisy-chained tbt devices are unplugged while the system is suspended.
> Thanks.
Hi Lukas,

I just submitted v2. Please help review it, thanks.
https://lore.kernel.org/linux-kernel/20241022130243.263737-1-acelan.kao@canonical.com/T/#u
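
For reference, here is a minimal userspace sketch of the comparison logic
discussed in the quoted patch. It is not kernel code: possible_error32(),
id_mismatch() and dsn_mismatch() are hypothetical stand-ins that only model
the conditions (successful read, not all-ones, DSN not zero, value differs
from the cached one). It assumes an absent device reads back as all-ones,
which matches the reg = 0xffffffff observation above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Stand-in for PCI_POSSIBLE_ERROR() on a 32-bit config read: an all-ones
 * value is treated as "the device may not have answered".
 */
static bool possible_error32(uint32_t val)
{
	return val == UINT32_MAX;
}

/*
 * Models one ID clause of the patched check (hypothetical helper): report
 * a mismatch only if the read succeeded, the value is not all-ones, and
 * it differs from the cached value.
 */
static bool id_mismatch(int read_err, uint32_t reg, uint32_t cached)
{
	return read_err == 0 && !possible_error32(reg) && reg != cached;
}

/*
 * Models the DSN clause of the patched check (hypothetical helper): a DSN
 * of 0 (as returned when the device is gone) or all-ones is not treated
 * as a replacement.
 */
static bool dsn_mismatch(uint64_t dsn, uint64_t cached)
{
	return dsn != 0 && dsn != UINT64_MAX && dsn != cached;
}
```

With the values reported earlier in the thread (dsn = 0x00000000 against
ctrl->dsn = 0x7800AA00, and reg = 0xffffffff), every clause evaluates to
false. If a replaced device also reads back all-ones at that point, the
result is the same, which is why the two cases look identical here.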
