Message-ID: <41285254-1bc1-3ffe-383e-276dc7193990@gmail.com>
Date: Mon, 20 Jan 2020 10:10:11 -0600
From: Stuart Hayes <stuart.w.hayes@...il.com>
To: Bjorn Helgaas <bhelgaas@...gle.com>
Cc: Austin Bolen <austin_bolen@...l.com>, keith.busch@...el.com,
Alexandru Gagniuc <mr.nuke.me@...il.com>,
"Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
Mika Westerberg <mika.westerberg@...ux.intel.com>,
Andy Shevchenko <andy.shevchenko@...il.com>,
"Gustavo A . R . Silva" <gustavo@...eddedor.com>,
Sinan Kaya <okaya@...nel.org>,
Oza Pawandeep <poza@...eaurora.org>, linux-pci@...r.kernel.org,
linux-kernel@...r.kernel.org, lukas@...ner.de
Subject: Re: [PATCH v2] PCI: pciehp: Make sure pciehp_isr clears interrupt
events
On 11/20/19 4:20 PM, Stuart Hayes wrote:
> Without this patch, a pciehp hotplug port can stop generating interrupts
> on hotplug events, so device adds and removals will not be seen.
>
> The pciehp interrupt handler pciehp_isr() will read the slot status
> register and then write back to it to clear the bits that caused the
> interrupt. If a different interrupt event bit gets set between the read and
> the write, pciehp_isr will exit without having cleared all of the interrupt
> event bits. If this happens, and the port is using an MSI interrupt where
> per-vector masking is not supported, we won't get any more hotplug
> interrupts from that device.
>
> That is expected behavior, according to the PCI Express Base Specification
> Revision 5.0 Version 1.0, section 6.7.3.4, "Software Notification of Hot-
> Plug Events".
>
> Because the "presence detect changed" and "data link layer state changed"
> event bits are both getting set at nearly the same time when a device is
> added or removed, this is more likely to happen than it might seem. The
> issue was found (and can be reproduced rather easily) by connecting and
> disconnecting an NVMe storage device on at least one system model.
>
> This issue was found while adding and removing various NVMe storage devices
> on an AMD PCIe port (PCI device 0x1022/0x1483).
>
> This patch fixes the issue by making pciehp_isr() loop back and re-read
> the slot status register immediately after writing to it, until it sees
> that all of the event status bits have been cleared.
>
> Signed-off-by: Stuart Hayes <stuart.w.hayes@...il.com>
> ---
> v2:
> * fixed ctrl_warn() call
> * improved comments
> * added pvm_capable flag and changed pciehp_isr() to loop back only when
> pvm_capable flag not set (suggested by Lukas Wunner)
>
> drivers/pci/hotplug/pciehp.h | 3 ++
> drivers/pci/hotplug/pciehp_hpc.c | 50 ++++++++++++++++++++++++++++----
> 2 files changed, 47 insertions(+), 6 deletions(-)
>
Bjorn,
Please let me know if there is anything I can do to help get this patch accepted.
Thanks!
Stuart
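
For readers following the thread, the race in the quoted description and the re-read loop that closes it can be sketched as a small user-space simulation. This is only an illustration, not the patch itself: struct fake_slot, slot_read(), slot_write_1_to_clear(), isr_single_pass() and isr_reread_loop() are hypothetical names, the register is simulated in memory, and the sketch ignores the pvm_capable check mentioned in the v2 notes. The two bit values are assumed to match the PCI_EXP_SLTSTA_* definitions in the kernel's pci_regs.h.

```c
#include <stdint.h>

/* Slot Status event bits (assumed to match include/uapi/linux/pci_regs.h). */
#define PCI_EXP_SLTSTA_PDC   0x0008  /* Presence Detect Changed */
#define PCI_EXP_SLTSTA_DLLSC 0x0100  /* Data Link Layer State Changed */
#define EVENT_MASK (PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_DLLSC)

/* Hypothetical stand-in for a hotplug port's Slot Status register. */
struct fake_slot {
    uint16_t status;      /* simulated register contents */
    uint16_t late_event;  /* bit that latches right after the first read */
    int reads;            /* number of reads performed so far */
};

/* Read the register; after the first read, a second event bit gets set,
 * modelling an event that arrives between the ISR's read and its write. */
static uint16_t slot_read(struct fake_slot *s)
{
    uint16_t v = s->status;

    if (s->reads++ == 0)
        s->status |= s->late_event;
    return v;
}

/* Event bits are write-1-to-clear: only the bits written as 1 are cleared. */
static void slot_write_1_to_clear(struct fake_slot *s, uint16_t bits)
{
    s->status &= (uint16_t)~bits;
}

/* Single read, single write: the late bit is never cleared. */
static uint16_t isr_single_pass(struct fake_slot *s)
{
    uint16_t events = slot_read(s) & EVENT_MASK;

    slot_write_1_to_clear(s, events);
    return events;
}

/* Re-read after every write until no event bit remains set. */
static uint16_t isr_reread_loop(struct fake_slot *s)
{
    uint16_t events = 0, status;

    while ((status = slot_read(s) & EVENT_MASK) != 0) {
        events |= status;
        slot_write_1_to_clear(s, status);
    }
    return events;
}
```

With status starting at PCI_EXP_SLTSTA_PDC and late_event set to PCI_EXP_SLTSTA_DLLSC, isr_single_pass() returns with the DLLSC bit still set in the simulated register, which is the state in which a port without per-vector masking would send no further MSI. isr_reread_loop() picks up the late bit on its second read and leaves the register clear.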