[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <fb2e3c60-1171-4f5c-852d-a5bfdc4f9c2a@arm.com>
Date: Wed, 7 May 2025 11:38:27 -0500
From: Jeremy Linton <jeremy.linton@....com>
To: "Rafael J. Wysocki" <rafael@...nel.org>
Cc: "Heyne, Maximilian" <mheyne@...zon.de>,
"stable@...r.kernel.org" <stable@...r.kernel.org>,
Len Brown <lenb@...nel.org>, Sudeep Holla <sudeep.holla@....com>,
Ard Biesheuvel <ardb@...nel.org>, Catalin Marinas <catalin.marinas@....com>,
"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] ACPI/PPTT: fix off-by-one error
On 5/7/25 11:31 AM, Jeremy Linton wrote:
> On 5/7/25 11:12 AM, Rafael J. Wysocki wrote:
>> On Wed, May 7, 2025 at 5:51 PM Jeremy Linton <jeremy.linton@....com>
>> wrote:
>>>
>>> On 5/7/25 10:42 AM, Rafael J. Wysocki wrote:
>>>> On Wed, May 7, 2025 at 5:25 PM Jeremy Linton <jeremy.linton@....com>
>>>> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> On 5/6/25 8:13 AM, Heyne, Maximilian wrote:
>>>>>> Commit 7ab4f0e37a0f ("ACPI PPTT: Fix coding mistakes in a couple of
>>>>>> sizeof() calls") corrects the processer entry size but unmasked a
>>>>>> longer
>>>>>> standing bug where the last entry in the structure can get skipped
>>>>>> due
>>>>>> to an off-by-one mistake if the last entry ends exactly at the end of
>>>>>> the ACPI subtable.
>>>>>>
>>>>>> The error manifests for instance on EC2 Graviton Metal instances with
>>>>>>
>>>>>> ACPI PPTT: PPTT table found, but unable to locate core 63 (63)
>>>>>> [...]
>>>>>> ACPI: SPE must be homogeneous
>>>>>>
>>>>>> Fixes: 2bd00bcd73e5 ("ACPI/PPTT: Add Processor Properties Topology
>>>>>> Table parsing")
>>>>>> Cc: stable@...r.kernel.org
>>>>>> Signed-off-by: Maximilian Heyne <mheyne@...zon.de>
>>>>>> ---
>>>>>> drivers/acpi/pptt.c | 4 ++--
>>>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
>>>>>> index f73ce6e13065d..4364da90902e5 100644
>>>>>> --- a/drivers/acpi/pptt.c
>>>>>> +++ b/drivers/acpi/pptt.c
>>>>>> @@ -231,7 +231,7 @@ static int acpi_pptt_leaf_node(struct
>>>>>> acpi_table_header *table_hdr,
>>>>>> sizeof(struct acpi_table_pptt));
>>>>>> proc_sz = sizeof(struct acpi_pptt_processor);
>>>>>
>>>>> This isn't really right, it should be struct acpi_subtable_header,
>>>>> then
>>>>> once the header is safe, pull the length from it.
>>>>>
>>>>> But then, really if we are trying to fix the original bug that the
>>>>> table
>>>>> could be shorter than the data in it suggests, the struct
>>>>> acpi_pptt_processor length plus its resources needs to be checked once
>>>>> the subtype is known to be a processor node.
>>>>>
>>>>> Otherwise the original sizeof * change isn't really fixing anything.
>>>>
>>>> Sorry, what sense did it make to do
>>>>
>>>> proc_sz = sizeof(struct acpi_pptt_processor *);
>>>>
>>>> here? As much as proc_sz = 0 I suppose?
>>>
>>> No, I agree, I think the original checks were simplified along the way
>>> to that. It wasn't 'right' either.
>>>
>>> The problem is that there are three subtypes of which processor is only
>>> one, and that struct acpi_pptt_processor doesn't necessarily reflect the
>>> actual size of the processor structure in the table because it has
>>> optional private resources tagged onto the end.
>>
>> Right.
>>
>>> So if the bug being fixed is that the length check is validating that
>>> the table length is less than the data in the table, that's still a
>>> problem because its only validating the processor node without
>>> resources.
>>
>> Admittedly, it is not my code, but I understand this check as a
>> termination condition for the loop: If there's not enough space in the
>> table to hold a thing that I'm looking for, I may as well bail out.
>>
>>> AKA the return is still potentially returning a pointer to a structure
>>> which may not be entirely contained in the table.
>>
>> Right, but this check should be made anyway before comparing
>> cpu_node->parent to node_entry, when it is known to be a CPU entry
>> because otherwise why bother.
>
> Right, but then there is a clarity because really its walking the
> table+subtypes looking for the cpu node. Exiting early because its not
> big enough for a cpu node makes sense but you still need the cpu node
> check to avoid a variation on the original bug.
>
>
>
>>
>> Roughly something like this:
>>
>> proc_sz = sizeof(struct acpi_pptt_processor);
>>
>> while ((unsigned long)entry + entry->length <= table_end) {
>
> Here your reading the entry, without knowing its long enough. For the
> leaf check just using struct acpi_pptt_processor is fine, but for the
> acpi_find_processor_node():
>
> proc_sz = sizeof(struct acpi_subtable_type);
Although, maybe I just wrote code that justifies using
acpi_pptt_processor here because the entry->num_of_priv_resources length
check isn't being made without it. So ok, use proc_sz = sizeof(struct
acpi_subtable_type) and assume that we don't care if the subtable type
is less than proc_sz.
>
> while ((unsigned long)entry + proc_sz <= table_end) {
> if (entry->type == ACPI_PPTT_TYPE_PROCESSOR &&
> entry->length == sizeof(struct acpi_pptt_processor) +
> entry->number_of_priv_resources * sizeof(u32) &&
> entry + entry->length <= table_end &&
> acpi_pptt_leaf_node(...))
> return (...)entry;
>
>
> Although at this point the while loops entry + proc_sz could just be <
> table_end under the assumption that entry->length will be > 0 but
> whichever makes more sense.
>
>
>
Powered by blists - more mailing lists