[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z7VHuOFxilan4LX_@U-2FWC9VHC-2323.local>
Date: Wed, 19 Feb 2025 10:53:44 +0800
From: Feng Tang <feng.tang@...ux.alibaba.com>
To: Bjorn Helgaas <helgaas@...nel.org>
Cc: Bjorn Helgaas <bhelgaas@...gle.com>, Lukas Wunner <lukas@...ner.de>,
Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@...ux.intel.com>,
Liguang Zhang <zhangliguang@...ux.alibaba.com>,
Guanghui Feng <guanghuifeng@...ux.alibaba.com>, rafael@...nel.org,
Markus Elfring <Markus.Elfring@....de>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>,
ilpo.jarvinen@...ux.intel.com, linux-pci@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/2] PCI/portdrv: Add necessary wait for disabling
hotplug events
Hi Bjorn Helgaas,
Thanks for the review!
On Tue, Feb 18, 2025 at 04:33:54PM -0600, Bjorn Helgaas wrote:
> On Tue, Feb 18, 2025 at 11:48:58AM +0800, Feng Tang wrote:
> > There was problem reported by firmware developers that they received
> > 2 pcie link control commands in very short intervals on an ARM server,
> > which doesn't comply with pcie spec, and broke their state machine and
> > work flow. According to PCIe 6.1 spec, section 6.7.3.2, software needs
> > to wait at least 1 second for the command-complete event, before
> > resending the cmd or sending a new cmd.
>
> s/link control/hotplug/ (also below)
> s/2/two/
> s/pcie/PCIe/ (also below)
Will follow.
> > And the first link control command firmware received is from
> > get_port_device_capability(), which sends cmd to disable pcie hotplug
> > interrupts without waiting for its completion.
> >
> > Fix it by adding the necessary wait to comply with PCIe spec, referring
> > pcie_poll_cmd().
> >
> > Also make the interrupt disabling not dependent on whether pciehp
> > service driver will be loaded as suggested by Lukas.
>
> This sounds like maybe it should be two separate patches.
OK.
[...]
> >
> > +static int pcie_wait_sltctl_cmd_raw(struct pci_dev *dev)
> > +{
> > + u16 slot_status = 0;
> > + int ret, ret1, timeout_us;
> > +
> > + /* 1 second, according to PCIe spec 6.1, section 6.7.3.2 */
> > + timeout_us = 1000000;
> > + ret = read_poll_timeout(pcie_capability_read_word, ret1,
> > + (slot_status & PCI_EXP_SLTSTA_CC), 10000,
> > + timeout_us, true, dev, PCI_EXP_SLTSTA,
> > + &slot_status);
> > + if (!ret)
> > + pcie_capability_write_word(dev, PCI_EXP_SLTSTA,
> > + PCI_EXP_SLTSTA_CC);
> > +
> > + return ret;
>
> Ugh. I really don't like the way this basically duplicates
> pcie_poll_cmd(). I don't have a great suggestion to fix it; maybe we
> need a way to build part of pciehp unconditionally.
Yes, I also thought about this. One idea is to unify the two functions,
and let pcie_poll_cmd() reuse this pcie_wait_sltctl_cmd_raw() here in
portdrv.c. As CONFIG_HOTPLUG_PCI_PCIE depends on CONFIG_PCIEPORTBUS,
there should be no dependency issue. How do you think?
> At the very least
> we need a comment here pointing to pcie_poll_cmd().
Aha, I mentioned 'referring pcie_poll_cmd()' in commit log, but forgot
to add it here.
>
> And IIUC this will add a one second delay for ports that don't need
> command completed events. I don't think that's fair to those ports.
Good catch! So we should add a read of PCI_EXP_SLTCAP register and
check if PCI_EXP_SLTCAP_HPC bit is set.
> > +}
> > +
> > +void pcie_disable_hp_interrupts_early(struct pci_dev *dev)
> > +{
> > + u16 slot_ctrl = 0;
> > +
> > + pcie_capability_read_word(dev, PCI_EXP_SLTCTL, &slot_ctrl);
> > + /* Bail out early if it is already disabled */
> > + if (!(slot_ctrl & (PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE)))
> > + return;
> > +
> > + pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
> > + PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
> > +
> > + if (pcie_wait_sltctl_cmd_raw(dev))
> > + pci_info(dev, "Timeout on disabling PCIE hot-plug interrupt\n");
>
> s/PCIE/PCIe/
Will change.
Thanks,
Feng
Powered by blists - more mailing lists