Message-ID: <b2d9284b-10fb-4ec9-921e-c73b0f79f01f@linux.alibaba.com>
Date: Tue, 9 Dec 2025 10:42:22 +0800
From: Shuai Xue <xueshuai@...ux.alibaba.com>
To: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>,
"Rafael J. Wysocki" <rafael@...nel.org>
Cc: Ard Biesheuvel <ardb@...nel.org>, Borislav Petkov <bp@...en8.de>,
Breno Leitao <leitao@...ian.org>, Dave Jiang <dave.jiang@...el.com>,
Fan Ni <fan.ni@...sung.com>, Hanjun Guo <guohanjun@...wei.com>,
Huang Yiwei <quic_hyiwei@...cinc.com>, Ira Weiny <ira.weiny@...el.com>,
Jason Tian <jason@...amperecomputing.com>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>, Len Brown <lenb@...nel.org>,
Mauro Carvalho Chehab <mchehab@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Smita Koralahalli <Smita.KoralahalliChannabasappa@....com>,
Tony Luck <tony.luck@...el.com>, linux-acpi@...r.kernel.org,
linux-efi@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/2] apei/ghes: don't go past the ARM processor CPER
record buffer
On 2025/11/28 18:53, Mauro Carvalho Chehab wrote:
> There is logic inside ghes/cper to detect if the section_length
> is too small, but it doesn't detect if it is too big.
>
> Currently, if the firmware receives an ARM processor CPER record
> stating that a section length is big, the kernel will blindly trust
> section_length, producing a very long dump. For instance, a 67-byte
> record with ERR_INFO_NUM set to 46198 and section length
> set to 854918320 would dump a lot of data, going way past the
> firmware memory-mapped area.
>
> Fix it by adding logic to prevent it from going past the buffer
> if ERR_INFO_NUM is too big, making it report instead:
>
> [Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
> [Hardware Error]: event severity: recoverable
> [Hardware Error]: Error 0, type: recoverable
> [Hardware Error]: section_type: ARM processor error
> [Hardware Error]: MIDR: 0xff304b2f8476870a
> [Hardware Error]: section length: 854918320, CPER size: 67
> [Hardware Error]: section length is too big
> [Hardware Error]: firmware-generated error record is incorrect
> [Hardware Error]: ERR_INFO_NUM is 46198
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@...nel.org>
> ---
> drivers/acpi/apei/ghes.c | 13 +++++++++++++
> drivers/firmware/efi/cper-arm.c | 14 +++++++++-----
> drivers/firmware/efi/cper.c | 3 ++-
> include/linux/cper.h | 3 ++-
> 4 files changed, 26 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 56107aa00274..8b90b6f3e866 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -557,6 +557,7 @@ static bool ghes_handle_arm_hw_error(struct acpi_hest_generic_data *gdata,
>  {
>  	struct cper_sec_proc_arm *err = acpi_hest_get_payload(gdata);
>  	int flags = sync ? MF_ACTION_REQUIRED : 0;
> +	int length = gdata->error_data_length;
>  	char error_type[120];
>  	bool queued = false;
>  	int sec_sev, i;
> @@ -568,7 +569,12 @@ static bool ghes_handle_arm_hw_error(struct acpi_hest_generic_data *gdata,
>  		return false;
>
>  	p = (char *)(err + 1);
> +	length -= sizeof(err);
> +
>  	for (i = 0; i < err->err_info_num; i++) {
> +		if (length <= 0)
> +			break;
> +

Hi, Mauro,

The bounds checking logic is duplicated - it appears both in the cache
error handling branch and after it. This could be simplified. It would
be better to ensure we have enough data for the error info header in one
check.

	/* Ensure we have enough data for the error info header */
	if (length < sizeof(struct cper_arm_err_info))
		break;

And it would be better to validate the claimed length before using it.

	/* Validate the claimed length before using it */
	if (err_info->length < sizeof(struct cper_arm_err_info) ||
	    err_info->length > length)
		break;

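To illustrate how the two checks fit together, here is a minimal userspace
sketch, not the actual kernel code. The names err_info_hdr and
count_valid_entries are hypothetical, and the two-byte header is a
simplified stand-in for struct cper_arm_err_info; only the walking logic
is the point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified stand-in for an error info entry header.
 * Only the fields needed to show the bounds checks are modelled.
 */
struct err_info_hdr {
	uint8_t version;
	uint8_t length;	/* size of this entry in bytes, as claimed by firmware */
};

/*
 * Count how many of the `claimed` entries can be walked without reading
 * past `buf_len` bytes. Stops at the first truncated or malformed entry.
 */
static size_t count_valid_entries(const uint8_t *buf, size_t buf_len,
				  size_t claimed)
{
	size_t remaining = buf_len;	/* unsigned: compared safely with sizeof */
	size_t i;

	assert(buf != NULL);

	for (i = 0; i < claimed; i++) {
		const struct err_info_hdr *hdr;

		/* Ensure there is enough data left for the entry header */
		if (remaining < sizeof(*hdr))
			break;

		hdr = (const struct err_info_hdr *)buf;

		/* Validate the claimed length before using it */
		if (hdr->length < sizeof(*hdr) || hdr->length > remaining)
			break;

		buf += hdr->length;
		remaining -= hdr->length;
	}

	return i;
}
```

In this sketch the remaining size is kept as a size_t on purpose. When an
int is compared against a sizeof expression, the int is converted to an
unsigned type, so a negative value would compare as very large and slip
past the check. Keeping the counter unsigned, and only subtracting a
length that was already checked against it, avoids that case.
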
Thanks.
Shuai