[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAAd53p4ej6FA0dM4ahngnGAccSLe=-bsUVkt-eGb2jZkms1UYA@mail.gmail.com>
Date: Fri, 21 Jan 2022 20:31:27 +0800
From: Kai-Heng Feng <kai.heng.feng@...onical.com>
To: Mika Westerberg <mika.westerberg@...ux.intel.com>
Cc: Bjorn Helgaas <helgaas@...nel.org>, bhelgaas@...gle.com,
koba.ko@...onical.com, Lukas Wunner <lukas@...ner.de>,
Stuart Hayes <stuart.w.hayes@...il.com>,
Jan Kiszka <jan.kiszka@...mens.com>, linux-pci@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] PCI/portdrv: Skip enabling AER on external facing ports
Hi Mika,
On Fri, Jan 21, 2022 at 6:55 PM Mika Westerberg
<mika.westerberg@...ux.intel.com> wrote:
>
> Hi Kai-Heng,
>
> On Fri, Jan 07, 2022 at 12:09:57PM +0800, Kai-Heng Feng wrote:
> > Only from root ports of thunderbolt devices.
> >
> > The error occurs as soon as the root port is runtime suspended to D3cold.
> >
> > Runtime suspend the AER service can resolve the issue. I wonder if
> > it's the right thing to do here?
>
> I think you are right here. It seems that AER "service driver" is
> completely missing PM hooks. Probably because it is more used in server
> type of systems where power management is not priority.
Here is my previous attempt to suspend AER:
https://lore.kernel.org/linux-pci/20210127173101.446940-1-kai.heng.feng@canonical.com/
>
> > D3cold should also mean the PCI link is gone, disabling AER seems to
> > be a reasonable approach.
>
> Indeed - I think AER might trigger here because the link does "down" /
> low power state if left enabled while the root port enters D3. Something
> like below hack should disable it over low power transitions:
Ubuntu kernel has been carrying the patch for quite some time:
https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/unstable/commit/?id=e82f15f1a26273b004054a81ef45937fb1b632e5
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 9fa1f97e5b27..64138cf82db8 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -1432,6 +1432,22 @@ static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
> return rc ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED;
> }
>
> +static int aer_suspend(struct pcie_device *dev)
> +{
> + struct aer_rpc *rpc = get_service_data(dev);
> +
> + aer_disable_rootport(rpc);
> + return 0;
> +}
> +
> +static int aer_resume(struct pcie_device *dev)
> +{
> + struct aer_rpc *rpc = get_service_data(dev);
> +
> + aer_enable_rootport(rpc);
> + return 0;
> +}
> +
> static struct pcie_port_service_driver aerdriver = {
> .name = "aer",
> .port_type = PCIE_ANY_PORT,
> @@ -1439,6 +1455,10 @@ static struct pcie_port_service_driver aerdriver = {
>
> .probe = aer_probe,
> .remove = aer_remove,
> + .suspend = aer_suspend,
> + .resume = aer_resume,
> + .runtime_suspend = aer_suspend,
> + .runtime_resume = aer_resume,
> };
This patch is exactly what I tested.
Maybe only suspend/runtime_suspend AER when the target PM state is D3cold?
PCIe spec doesn't say how to handle AER in Link L2/L3Ready/L3, but I
think it's reasonable to suspend AER when power is loss.
Let me come up with a patch with that idea.
Kai-Heng
>
> /**
Powered by blists - more mailing lists