lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z71aU3ZOQf2xGHp_@U-2FWC9VHC-2323.local>
Date: Tue, 25 Feb 2025 13:51:15 +0800
From: Feng Tang <feng.tang@...ux.alibaba.com>
To: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@...gle.com>, Lukas Wunner <lukas@...ner.de>,
	Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@...ux.intel.com>,
	Liguang Zhang <zhangliguang@...ux.alibaba.com>,
	Guanghui Feng <guanghuifeng@...ux.alibaba.com>,
	"Rafael J. Wysocki" <rafael@...nel.org>,
	Markus Elfring <Markus.Elfring@....de>, lkp@...el.com,
	Jonathan Cameron <Jonathan.Cameron@...wei.com>,
	linux-pci@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3 2/4] PCI/portdrv: Add necessary wait for disabling
 hotplug events

Hi Ilpo, 

On Mon, Feb 24, 2025 at 05:00:03PM +0200, Ilpo Järvinen wrote:
> On Mon, 24 Feb 2025, Feng Tang wrote:
> 
> > There was problem reported by firmware developers that they received
> > two PCIe hotplug commands in very short intervals on an ARM server,
> > which doesn't comply with PCIe spec, and broke their state machine and
> > work flow. According to PCIe 6.1 spec, section 6.7.3.2, software needs
> > to wait at least 1 second for the command-complete event, before
> > resending the command or sending a new command.
> > 
> > In the failure case, the first PCIe hotplug command firmware received
> > is from get_port_device_capability(), which sends command to disable
> > PCIe hotplug interrupts without waiting for its completion, and the
> > second command comes from pcie_enable_notification() of pciehp driver,
> > which enables hotplug interrupts again.
> > 
> > Fix it by adding the necessary wait to comply with PCIe spec.
> > 
> > Fixes: 2bd50dd800b5 ("PCI: PCIe: Disable PCIe port services during port initialization")
> > Originally-by: Liguang Zhang <zhangliguang@...ux.alibaba.com>
> > Signed-off-by: Feng Tang <feng.tang@...ux.alibaba.com>
> > ---
> >  drivers/pci/pci.h          |  2 ++
> >  drivers/pci/pcie/portdrv.c | 19 +++++++++++++++++--
> >  2 files changed, 19 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> > index 4c94a589de4a..a1138ebc2689 100644
> > --- a/drivers/pci/pci.h
> > +++ b/drivers/pci/pci.h
> > @@ -760,6 +760,7 @@ static inline void pcie_ecrc_get_policy(char *str) { }
> >  void pcie_reset_lbms_count(struct pci_dev *port);
> >  int pcie_lbms_count(struct pci_dev *port, unsigned long *val);
> >  int pcie_poll_sltctl_cmd(struct pci_dev *dev, int timeout_ms);
> > +void pcie_disable_hp_interrupts_early(struct pci_dev *dev);
> >  #else
> >  static inline void pcie_reset_lbms_count(struct pci_dev *port) {}
> >  static inline int pcie_lbms_count(struct pci_dev *port, unsigned long *val)
> > @@ -770,6 +771,7 @@ static inline int pcie_poll_sltctl_cmd(struct pci_dev *dev, int timeout_ms)
> >  {
> >  	return 0;
> >  }
> > +static inline void pcie_disable_hp_interrupts_early(struct pci_dev *dev) {}
> >  #endif
> >  
> >  struct pci_dev_reset_methods {
> > diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c
> > index bb00ba45ee51..ca4f21dff486 100644
> > --- a/drivers/pci/pcie/portdrv.c
> > +++ b/drivers/pci/pcie/portdrv.c
> > @@ -230,6 +230,22 @@ int pcie_poll_sltctl_cmd(struct pci_dev *dev, int timeout_ms)
> >  	return  ret;
> >  }
> >  
> > +void pcie_disable_hp_interrupts_early(struct pci_dev *dev)
> > +{
> > +	u16 slot_ctrl = 0;
> 
> Unnecessary initialization
 
The reason I put it to 0 is, very unlikely, the pcie_capability_read_word()
might fail, and 0 can avoid the early return.

> > +
> > +	pcie_capability_read_word(dev, PCI_EXP_SLTCTL, &slot_ctrl);
> > +	/* Bail out early if it is already disabled */
> > +	if (!(slot_ctrl & (PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE)))
> > +		return;
> > +
> > +	pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
> > +		  PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
> 
> Align to (. You might need to put the bits to own lines.

OK.

> > +
> > +	if (pcie_poll_sltctl_cmd(dev, 1000))
> > +		pci_info(dev, "Timeout on disabling PCIe hot-plug interrupt\n");
> > +}
> > +
> >  /**
> >   * get_port_device_capability - discover capabilities of a PCI Express port
> >   * @dev: PCI Express port to examine
> > @@ -255,8 +271,7 @@ static int get_port_device_capability(struct pci_dev *dev)
> >  		 * Disable hot-plug interrupts in case they have been enabled
> >  		 * by the BIOS and the hot-plug service driver is not loaded.
> >  		 */
> > -		pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
> > -			  PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
> > +		pcie_disable_hp_interrupts_early(dev);
> 
> Doesn't calling this here delay setup for all portdrv services, not just 
> hotplug? And the delay can be relatively long.

I don't quite follow, physically there is only one root port, the code
from commit 2bd50dd800b5 just does it once. Also the 1 second is just the
maxim waiting time, the wait will end once the command completed event
bit is set. The exact time depends on platform (x86 and ARM) and the
firmeware implementation, and it is much smaller than 1 second AFAIK.

Thanks,
Feng

> >  	}
> >  
> >  #ifdef CONFIG_PCIEAER
> > 
> 
> -- 
>  i.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ