[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b6ba76ff-7cbd-2d73-fdc4-41aa8c788bc9@linux.intel.com>
Date: Tue, 20 May 2025 13:28:02 +0300 (EEST)
From: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
To: Bjorn Helgaas <helgaas@...nel.org>
cc: linux-pci@...r.kernel.org, Jon Pan-Doh <pandoh@...gle.com>,
Karolina Stolarek <karolina.stolarek@...cle.com>,
Martin Petersen <martin.petersen@...cle.com>,
Ben Fuller <ben.fuller@...cle.com>, Drew Walton <drewwalton@...rosoft.com>,
Anil Agrawal <anilagrawal@...a.com>, Tony Luck <tony.luck@...el.com>,
Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@...ux.intel.com>,
Lukas Wunner <lukas@...ner.de>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>,
Sargun Dhillon <sargun@...a.com>, "Paul E . McKenney" <paulmck@...nel.org>,
Mahesh J Salgaonkar <mahesh@...ux.ibm.com>,
Oliver O'Halloran <oohall@...il.com>, Kai-Heng Feng <kaihengf@...dia.com>,
Keith Busch <kbusch@...nel.org>, Robert Richter <rrichter@....com>,
Terry Bowman <terry.bowman@....com>, Shiju Jose <shiju.jose@...wei.com>,
Dave Jiang <dave.jiang@...el.com>, LKML <linux-kernel@...r.kernel.org>,
linuxppc-dev@...ts.ozlabs.org, Bjorn Helgaas <bhelgaas@...gle.com>
Subject: Re: [PATCH v6 02/16] PCI/DPC: Log Error Source ID only when valid
On Mon, 19 May 2025, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <bhelgaas@...gle.com>
>
> DPC Error Source ID is only valid when the DPC Trigger Reason indicates
> that DPC was triggered due to reception of an ERR_NONFATAL or ERR_FATAL
> Message (PCIe r6.0, sec 7.9.14.5).
>
> When DPC was triggered by ERR_NONFATAL (PCI_EXP_DPC_STATUS_TRIGGER_RSN_NFE)
> or ERR_FATAL (PCI_EXP_DPC_STATUS_TRIGGER_RSN_FE) from a downstream device,
> log the Error Source ID (decoded into domain/bus/device/function). Don't
> print the source otherwise, since it's not valid.
>
> For DPC trigger due to reception of ERR_NONFATAL or ERR_FATAL, the dmesg
> logging changes:
>
> - pci 0000:00:01.0: DPC: containment event, status:0x000d source:0x0200
> - pci 0000:00:01.0: DPC: ERR_FATAL detected
> + pci 0000:00:01.0: DPC: containment event, status:0x000d, ERR_FATAL received from 0000:02:00.0
>
> and when DPC triggered for other reasons, where DPC Error Source ID is
> undefined, e.g., unmasked uncorrectable error:
>
> - pci 0000:00:01.0: DPC: containment event, status:0x0009 source:0x0200
> - pci 0000:00:01.0: DPC: unmasked uncorrectable error detected
> + pci 0000:00:01.0: DPC: containment event, status:0x0009: unmasked uncorrectable error detected
>
> Previously the "containment event" message was at KERN_INFO and the
> "%s detected" message was at KERN_WARNING. Now the single message is at
> KERN_WARNING.
>
> Signed-off-by: Bjorn Helgaas <bhelgaas@...gle.com>
> ---
> drivers/pci/pcie/dpc.c | 45 ++++++++++++++++++++++++++----------------
> 1 file changed, 28 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c
> index fe7719238456..315bf2bfd570 100644
> --- a/drivers/pci/pcie/dpc.c
> +++ b/drivers/pci/pcie/dpc.c
> @@ -261,25 +261,36 @@ void dpc_process_error(struct pci_dev *pdev)
> struct aer_err_info info = { 0 };
>
> pci_read_config_word(pdev, cap + PCI_EXP_DPC_STATUS, &status);
> - pci_read_config_word(pdev, cap + PCI_EXP_DPC_SOURCE_ID, &source);
> -
> - pci_info(pdev, "containment event, status:%#06x source:%#06x\n",
> - status, source);
>
> reason = status & PCI_EXP_DPC_STATUS_TRIGGER_RSN;
> - ext_reason = status & PCI_EXP_DPC_STATUS_TRIGGER_RSN_EXT;
> - pci_warn(pdev, "%s detected\n",
> - (reason == PCI_EXP_DPC_STATUS_TRIGGER_RSN_UNCOR) ?
> - "unmasked uncorrectable error" :
> - (reason == PCI_EXP_DPC_STATUS_TRIGGER_RSN_NFE) ?
> - "ERR_NONFATAL" :
> - (reason == PCI_EXP_DPC_STATUS_TRIGGER_RSN_FE) ?
> - "ERR_FATAL" :
> - (ext_reason == PCI_EXP_DPC_STATUS_TRIGGER_RSN_RP_PIO) ?
> - "RP PIO error" :
> - (ext_reason == PCI_EXP_DPC_STATUS_TRIGGER_RSN_SW_TRIGGER) ?
> - "software trigger" :
> - "reserved error");
> +
> + switch (reason) {
> + case PCI_EXP_DPC_STATUS_TRIGGER_RSN_UNCOR:
> + pci_warn(pdev, "containment event, status:%#06x: unmasked uncorrectable error detected\n",
> + status);
> + break;
> + case PCI_EXP_DPC_STATUS_TRIGGER_RSN_NFE:
> + case PCI_EXP_DPC_STATUS_TRIGGER_RSN_FE:
> + pci_read_config_word(pdev, cap + PCI_EXP_DPC_SOURCE_ID,
> + &source);
> + pci_warn(pdev, "containment event, status:%#06x, %s received from %04x:%02x:%02x.%d\n",
> + status,
> + (reason == PCI_EXP_DPC_STATUS_TRIGGER_RSN_FE) ?
> + "ERR_FATAL" : "ERR_NONFATAL",
> + pci_domain_nr(pdev->bus), PCI_BUS_NUM(source),
> + PCI_SLOT(source), PCI_FUNC(source));
> + return;
> + case PCI_EXP_DPC_STATUS_TRIGGER_RSN_IN_EXT:
> + ext_reason = status & PCI_EXP_DPC_STATUS_TRIGGER_RSN_EXT;
> + pci_warn(pdev, "containment event, status:%#06x: %s detected\n",
> + status,
> + (ext_reason == PCI_EXP_DPC_STATUS_TRIGGER_RSN_RP_PIO) ?
> + "RP PIO error" :
> + (ext_reason == PCI_EXP_DPC_STATUS_TRIGGER_RSN_SW_TRIGGER) ?
> + "software trigger" :
> + "reserved error");
> + break;
> + }
>
> /* show RP PIO error detail information */
> if (pdev->dpc_rp_extensions &&
>
After adding that switch (reason) there, wouldn't it make sense to move
also the code from the if blocks into the case blocks? That if
conditions check for reason anyway so those if branches would naturally
belong under one of the cases each.
--
i.
Powered by blists - more mailing lists