[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <5c7aa51ff9294b1aab51a3bda2416094@ausx13mps321.AMER.DELL.COM>
Date: Tue, 18 Sep 2018 13:07:58 +0000
From: <Alex_Gagniuc@...lteam.com>
To: <helgaas@...nel.org>, <mr.nuke.me@...il.com>
Cc: <linux-pci@...r.kernel.org>, <bhelgaas@...gle.com>,
<keith.busch@...el.com>, <Austin.Bolen@...l.com>,
<Shyam.Iyer@...l.com>, <Narendra.K@...l.com>,
<Stuart.Hayes@...l.com>, <lukas@...ner.de>,
<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] PCI/MSI: Don't touch MSI bits when the PCI device is
disconnected
On 9/12/2018 4:28 PM, Bjorn Helgaas wrote:
> On Mon, Jul 30, 2018 at 04:21:44PM -0500, Alexandru Gagniuc wrote:
>> When a PCI device is gone, we don't want to send IO to it if we can
>> avoid it. We expose functionality via the irq_chip structure. As
>> users of that structure may not know about the underlying PCI device,
>> it's our responsibility to guard against removed devices.
>
> I'm pretty ambivalent about pci_dev_is_disconnected() in general, but
> I think I'll take this, given a couple minor changelog clarifications:
>
>> irq_write_msi_msg is already guarded. pci_msi_(un)mask_irq are not.
>> Guard them for completeness.
>
> By the irq_write_msi_msg() guard, I guess you mean this path:
>
> pci_msi_domain_write_msg # irq_chip.irq_write_msi_msg
> __pci_write_msi_msg
> if (dev->current_state != PCI_D0 || pci_dev_is_disconnected(dev))
> /* don't touch */
Yes!
> pci_msi_(un)mask_irq() may be irq_chip.irq_mask, .irq_unmask, etc
> pointers. So these are parallel because they're all irq_chip function
> pointers, but the changelog isn't (yet) parallel because it uses the
> irq_chip pointer name for .irq_write_msi_msg but not for mask/unmask
Good catch! I'll get this corrected.
>> For example, surprise removal of a PCIe device triggers teardown. This
>> touches the irq_chips ops some point to disable the interrupts. I/O
>> generated here can crash the system on machines with buggy firmware.
>> Not triggering the IO in the first place eliminates the problem.
>
> It doesn't eliminate the problem completely because .irq_mask() and
> .irq_unmask() may be called for reasons other than surprise removal,
> and if a surprise removal happens after the pci_dev_is_disconnected()
> check but before the readl(), we will still generate I/O to a device
> that's gone. I'd be OK if you said it "reduces" the problem.
That sounds reasonable.
> One reason I'm ambivalent about pci_dev_is_disconnected() is that in
> cases like this, it turns a reproducible problem into a very
> hard-to-reproduce problem, which reduces the likelihood that the buggy
> firmware will be fixed.
If it manages to turn this into 99.999% territory, I'll be much happier.
I'd love to give you an academically correct solution, but I just don't
see how, given how firmware-first philosophy is written.
> Do you have information about known platforms with this buggy firmware
> and the signature of the crash? If you do, it's always nice to be
> able to connect a patch with the user-visible problem it fixes.
From what I've heard, it won't be fixed. The number of changes needed
would require re-qualifying the firmware. I'm told that's very hard to
do on platforms that are shipping. I can reword this to say
"firmware-first" instead of "buggy" since they are mostly synonymous.
Alex
>> Signed-off-by: Alexandru Gagniuc <mr.nuke.me@...il.com>
>> ---
>>
>> There's another patch by Lukas Wunner that is needed (not yet published)
>> in order to fully block IO on SURPRISE!!! removal. The existing code only
>> sets the PCI_DEV_DISCONNECTED bit in an unreasonably narrow set of
>> circumstances. Lukas' patch fixes that.
>>
>> However, this change is otherwise fully independent, and enjoy!
>>
>> drivers/pci/msi.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
>> index 4d88afdfc843..5f47b5cb0401 100644
>> --- a/drivers/pci/msi.c
>> +++ b/drivers/pci/msi.c
>> @@ -227,6 +227,9 @@ static void msi_set_mask_bit(struct irq_data *data, u32 flag)
>> {
>> struct msi_desc *desc = irq_data_get_msi_desc(data);
>>
>> + if (pci_dev_is_disconnected(msi_desc_to_pci_dev(desc)))
>> + return;
>> +
>> if (desc->msi_attrib.is_msix) {
>> msix_mask_irq(desc, flag);
>> readl(desc->mask_base); /* Flush write to device */
>> --
>> 2.17.1
>>
>
Powered by blists - more mailing lists