Message-ID: <737bea6a-2f0e-e573-754e-2e410c34013e@codeaurora.org>
Date: Mon, 21 May 2018 10:27:49 -0400
From: Tyler Baicar <tbaicar@...eaurora.org>
To: Alexandru Gagniuc <mr.nuke.me@...il.com>, bp@...en8.de
Cc: alex_gagniuc@...lteam.com, austin_bolen@...l.com,
shyam_iyer@...l.com, "Rafael J. Wysocki" <rjw@...ysocki.net>,
Len Brown <lenb@...nel.org>, Tony Luck <tony.luck@...el.com>,
Will Deacon <will.deacon@....com>,
James Morse <james.morse@....com>,
Shiju Jose <shiju.jose@...wei.com>,
"Jonathan (Zhixiong) Zhang" <zjzhang@...eaurora.org>,
Dongjiu Geng <gengdongjiu@...wei.com>,
linux-acpi@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v6 2/2] acpi: apei: Do not panic() on PCIe errors reported
through GHES
On 5/21/2018 9:49 AM, Alexandru Gagniuc wrote:
> +/* PCIe errors should not cause a panic. */
> +static int ghes_sec_pcie_severity(struct acpi_hest_generic_data *gdata)
> +{
> + struct cper_sec_pcie *pcie_err = acpi_hest_get_payload(gdata);
> +
> + if (pcie_err->validation_bits & CPER_PCIE_VALID_DEVICE_ID &&
> + pcie_err->validation_bits & CPER_PCIE_VALID_AER_INFO &&
> + IS_ENABLED(CONFIG_ACPI_APEI_PCIEAER))
> + return GHES_SEV_RECOVERABLE;
> +
> + return ghes_cper_severity(gdata->error_severity);
> +}
> +
> +/*
> + * The severity field in the status block is an unreliable metric for the
> + * severity. A more reliable way is to look at each subsection and see how safe
> + * it is to call the appropriate error handler.
> + * We're not concerned with handling the error. We're concerned with being able
> + * to notify an error handler by crossing the NMI/IRQ boundary, being able to
> + * schedule_work, and so forth.
> + * - SEC_PCIE: All PCIe errors can be handled by AER.
> + */
> +static int ghes_severity(struct ghes *ghes)
> +{
> + int worst_sev, sec_sev;
> + struct acpi_hest_generic_data *gdata;
> + const guid_t *section_type;
> + const struct acpi_hest_generic_status *estatus = ghes->estatus;
> +
> + worst_sev = GHES_SEV_NO;
> + apei_estatus_for_each_section(estatus, gdata) {
> + section_type = (guid_t *)gdata->section_type;
> + sec_sev = ghes_cper_severity(gdata->error_severity);
> +
> + if (guid_equal(section_type, &CPER_SEC_PCIE))
> + sec_sev = ghes_sec_pcie_severity(gdata);
> +
> + worst_sev = max(worst_sev, sec_sev);
> + }
> +
> + return worst_sev;
> +}
> +
> static void ghes_do_proc(struct ghes *ghes,
> const struct acpi_hest_generic_status *estatus)
> {
> @@ -944,7 +986,7 @@ static int ghes_notify_nmi(unsigned int cmd, struct pt_regs *regs)
> ret = NMI_HANDLED;
> }
>
> - sev = ghes_cper_severity(ghes->estatus->error_severity);
> + sev = ghes_severity(ghes);

Hello Alex,

There is a compile warning if CONFIG_HAVE_ACPI_APEI_NMI is not selected,
because the caller of ghes_severity(), ghes_notify_nmi(), is not built in
that configuration:

  CC      drivers/acpi/apei/ghes.o
drivers/acpi/apei/ghes.c:483:12: warning: ‘ghes_severity’ defined but not used [-Wunused-function]
 static int ghes_severity(struct ghes *ghes)
            ^~~~~~~~~~~~~

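One possible way to silence this, shown only as a minimal userspace sketch and not as the fix the patch should necessarily take: annotate the helper so the compiler accepts it being unused when its only caller is configured out. Everything here is hypothetical stand-in naming: HAVE_NMI plays the role of CONFIG_HAVE_ACPI_APEI_NMI, worst_severity() the role of ghes_severity(), and the local __maybe_unused macro mirrors the kernel annotation of the same name. Moving the helper inside the same #ifdef as its caller would be the other obvious option.

```c
#include <stddef.h>

/* Local stand-in for the kernel's __maybe_unused annotation (assumption:
 * GCC/Clang attribute syntax is available). */
#define __maybe_unused __attribute__((__unused__))

/* Hypothetical severity scale; a larger value means a worse error. */
enum sev { SEV_NO, SEV_CORRECTED, SEV_RECOVERABLE, SEV_PANIC };

/*
 * Stand-in for ghes_severity(): return the worst per-section severity.
 * Without __maybe_unused, building with -Wall while HAVE_NMI is undefined
 * reports "defined but not used [-Wunused-function]" for this function.
 */
static __maybe_unused int worst_severity(const int *sec_sev, size_t n)
{
	int worst = SEV_NO;
	size_t i;

	for (i = 0; i < n; i++)
		if (sec_sev[i] > worst)
			worst = sec_sev[i];

	return worst;
}

#ifdef HAVE_NMI
/* Stand-in for ghes_notify_nmi(): the only caller, built conditionally. */
int notify_nmi(const int *sec_sev, size_t n)
{
	/* Nonzero when the worst section severity calls for a panic. */
	return worst_severity(sec_sev, n) >= SEV_PANIC;
}
#endif
```

Compiling this with `gcc -Wall -c` and HAVE_NMI left undefined should stay quiet; dropping the annotation should bring back the same kind of warning as above.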
Thanks,
Tyler
--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.