Message-ID: <20250521100035.0000544e@huawei.com>
Date: Wed, 21 May 2025 10:00:35 +0100
From: Jonathan Cameron <Jonathan.Cameron@...wei.com>
To: Bjorn Helgaas <helgaas@...nel.org>
CC: <linux-pci@...r.kernel.org>, Jon Pan-Doh <pandoh@...gle.com>, "Karolina
Stolarek" <karolina.stolarek@...cle.com>, Weinan Liu <wnliu@...gle.com>,
Martin Petersen <martin.petersen@...cle.com>, Ben Fuller
<ben.fuller@...cle.com>, Drew Walton <drewwalton@...rosoft.com>, "Anil
Agrawal" <anilagrawal@...a.com>, Tony Luck <tony.luck@...el.com>, Ilpo
Järvinen <ilpo.jarvinen@...ux.intel.com>, "Sathyanarayanan
Kuppuswamy" <sathyanarayanan.kuppuswamy@...ux.intel.com>, Lukas Wunner
<lukas@...ner.de>, Sargun Dhillon <sargun@...a.com>, "Paul E . McKenney"
<paulmck@...nel.org>, Mahesh J Salgaonkar <mahesh@...ux.ibm.com>, "Oliver
O'Halloran" <oohall@...il.com>, Kai-Heng Feng <kaihengf@...dia.com>, "Keith
Busch" <kbusch@...nel.org>, Robert Richter <rrichter@....com>, Terry Bowman
<terry.bowman@....com>, Shiju Jose <shiju.jose@...wei.com>, Dave Jiang
<dave.jiang@...el.com>, <linux-kernel@...r.kernel.org>,
<linuxppc-dev@...ts.ozlabs.org>, Bjorn Helgaas <bhelgaas@...gle.com>,
Krzysztof Wilczyński <kwilczynski@...nel.org>
Subject: Re: [PATCH v7 02/17] PCI/DPC: Log Error Source ID only when valid
On Tue, 20 May 2025 16:50:19 -0500
Bjorn Helgaas <helgaas@...nel.org> wrote:
> From: Bjorn Helgaas <bhelgaas@...gle.com>
>
> DPC Error Source ID is only valid when the DPC Trigger Reason indicates
> that DPC was triggered due to reception of an ERR_NONFATAL or ERR_FATAL
> Message (PCIe r6.0, sec 7.9.14.5).
>
> When DPC was triggered by ERR_NONFATAL (PCI_EXP_DPC_STATUS_TRIGGER_RSN_NFE)
> or ERR_FATAL (PCI_EXP_DPC_STATUS_TRIGGER_RSN_FE) from a downstream device,
> log the Error Source ID (decoded into domain/bus/device/function). Don't
> print the source otherwise, since it's not valid.
>
> For DPC trigger due to reception of ERR_NONFATAL or ERR_FATAL, the dmesg
> logging changes:
>
> - pci 0000:00:01.0: DPC: containment event, status:0x000d source:0x0200
> - pci 0000:00:01.0: DPC: ERR_FATAL detected
> + pci 0000:00:01.0: DPC: containment event, status:0x000d, ERR_FATAL received from 0000:02:00.0
>
> and when DPC triggered for other reasons, where DPC Error Source ID is
> undefined, e.g., unmasked uncorrectable error:
>
> - pci 0000:00:01.0: DPC: containment event, status:0x0009 source:0x0200
> - pci 0000:00:01.0: DPC: unmasked uncorrectable error detected
> + pci 0000:00:01.0: DPC: containment event, status:0x0009: unmasked uncorrectable error detected
>
> Previously the "containment event" message was at KERN_INFO and the
> "%s detected" message was at KERN_WARNING. Now the single message is at
> KERN_WARNING.
>
> Signed-off-by: Bjorn Helgaas <bhelgaas@...gle.com>
> Tested-by: Krzysztof Wilczyński <kwilczynski@...nel.org>
> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>
Matches the spec conditions as far as I can tell.
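For anyone following along, here is a rough userspace sketch of the condition
as I read it: the Error Source ID is only decoded when the trigger reason is
ERR_NONFATAL or ERR_FATAL. The helper names dpc_source_valid() and
dpc_format_event() are made up for illustration and are not what the patch
uses. The macro values are copied from my reading of pci_regs.h and pci.h, the
domain is simply passed in as a parameter, and the extended trigger reasons
are lumped into one generic case here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Values as I recall them from include/uapi/linux/pci_regs.h. */
#define PCI_EXP_DPC_STATUS_TRIGGER_RSN        0x0006 /* Trigger Reason field */
#define PCI_EXP_DPC_STATUS_TRIGGER_RSN_UNCOR  0x0000 /* unmasked uncorrectable */
#define PCI_EXP_DPC_STATUS_TRIGGER_RSN_NFE    0x0002 /* ERR_NONFATAL received */
#define PCI_EXP_DPC_STATUS_TRIGGER_RSN_FE     0x0004 /* ERR_FATAL received */

/* Requester ID decoding, as in include/linux/pci.h. */
#define PCI_BUS_NUM(x)  (((x) >> 8) & 0xff)
#define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn) ((devfn) & 0x07)

/* Hypothetical helper: is the DPC Error Source ID defined for this status? */
static int dpc_source_valid(uint16_t status)
{
	uint16_t reason = status & PCI_EXP_DPC_STATUS_TRIGGER_RSN;

	return reason == PCI_EXP_DPC_STATUS_TRIGGER_RSN_NFE ||
	       reason == PCI_EXP_DPC_STATUS_TRIGGER_RSN_FE;
}

/* Hypothetical helper: build the single log line; source is used only if valid. */
static int dpc_format_event(char *buf, size_t len, unsigned int domain,
			    uint16_t status, uint16_t source)
{
	uint16_t reason = status & PCI_EXP_DPC_STATUS_TRIGGER_RSN;

	assert(buf != NULL);

	if (dpc_source_valid(status))
		return snprintf(buf, len,
				"containment event, status:%#06x, %s received from %04x:%02x:%02x.%d",
				(unsigned int)status,
				reason == PCI_EXP_DPC_STATUS_TRIGGER_RSN_FE ?
					"ERR_FATAL" : "ERR_NONFATAL",
				domain, (unsigned int)PCI_BUS_NUM(source),
				(unsigned int)PCI_SLOT(source), (int)PCI_FUNC(source));

	if (reason == PCI_EXP_DPC_STATUS_TRIGGER_RSN_UNCOR)
		return snprintf(buf, len,
				"containment event, status:%#06x: unmasked uncorrectable error detected",
				(unsigned int)status);

	/* Extended trigger reasons: simplified here, source still not printed. */
	return snprintf(buf, len,
			"containment event, status:%#06x: extended trigger reason",
			(unsigned int)status);
}
```

With the values from the commit log, status 0x000d has trigger reason bits 10b,
so source 0x0200 gets decoded as bus 02, device 00, function 0. Status 0x0009
has trigger reason 00b, so the source is never looked at.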
I guess there is an interesting debate to be had on whether printing extra
garbage info counts as a bug or not. Maybe add a Fixes tag for this one as well?
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@...wei.com>
I briefly wondered whether it would make sense to have a prefix string,
initialized outside the switch, with "containment event, status:%#06x:",
but it's probably not worth the effort and might make it harder to grep
for the error messages. So in the end I think your code here is the best
option.
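To make the alternative concrete, this is roughly the shape I had in mind,
as a userspace sketch only. dpc_format_with_prefix() is a made-up name and
the reason text is passed in by the caller purely for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical alternative: format the shared prefix once, then append
 * the per-reason text. The full message no longer appears as a single
 * string literal in the source, which is the grep concern.
 */
static int dpc_format_with_prefix(char *buf, size_t len, uint16_t status,
				  const char *reason_text)
{
	static const char prefix_fmt[] = "containment event, status:%#06x:";
	int n = snprintf(buf, len, prefix_fmt, (unsigned int)status);

	/* Stop on error or if the prefix alone already filled the buffer. */
	if (n < 0 || (size_t)n >= len)
		return n;

	return n + snprintf(buf + n, len - (size_t)n, " %s", reason_text);
}
```

Searching the tree for a complete line from dmesg would then only match
pieces of it, which is why keeping the whole format string in each case
of the switch seems preferable.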
Jonathan