lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <67ebb7d5-418c-4eaf-aa94-8ecf76bd44eb@redhat.com>
Date: Thu, 28 Aug 2025 07:27:36 -0400
From: David Arcari <darcari@...hat.com>
To: srinivas pandruvada <srinivas.pandruvada@...ux.intel.com>,
 platform-driver-x86@...r.kernel.org
Cc: Hans de Goede <hansg@...nel.org>,
 Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>,
 Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...nel.org>,
 Dan Carpenter <dan.carpenter@...aro.org>,
 Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
 Tero Kristo <tero.kristo@...ux.intel.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] platform/x86/intel: power-domains validate domain in
 tpmi_cpu_online()



On 8/27/25 7:49 AM, srinivas pandruvada wrote:
> On Tue, 2025-08-26 at 21:39 -0400, David Arcari wrote:
>>
>> Hi Srinivas,
>>
>> On 8/26/25 4:26 PM, srinivas pandruvada wrote:
>>> Hi David,
>>>
>>> On Tue, 2025-08-26 at 12:43 -0400, David Arcari wrote:
>>>> Although tpmi_get_power_domain_mask() calls
>>>> tpmi_domain_is_valid()
>>>> prior to indexing tpmi_power_domain_mask[],
>>> Because this an API call so that caller parameter needs to be
>>> sanitized.
>>>
>>>>    tpmi_cpu_online() does
>>>> not.
>>> This is hotplug callback, which should have correct topology
>>> information.
>>>
>>>>    In the case where a VM creates non-contiguous package ids the
>>>> result can be memory corruption. This can be prevented by adding
>>>> the same validation in tpmi_cpu_online().
>>>>
>>>
>>> This driver is getting loaded means MSR 0x54 is virtualised
>>> otherwise
>>> this driver will not load.
>>
>> I don't have direct access to the system, but this appears to be the
>> case.  The driver is reading MSR 0x54:
>>
>> drivers/platform/x86/intel/tpmi_power_domains.c:#define
>> MSR_PM_LOGICAL_ID       0x54
>> drivers/platform/x86/intel/tpmi_power_domains.c:        ret =
>> rdmsrl_safe(MSR_PM_LOGICAL_ID, &data);
>> drivers/platform/x86/intel/tpmi_power_domains.c:        ret =
>> rdmsrl_safe(MSR_PM_LOGICAL_ID, &data);
>>
>>
>>> Not sure this is an upstream kernel or not.
>>
>> This was not an upstream kernel, but I don't see anything in the
>> upstream driver that would have prevented the access that is
>> occurring.
>>
> The issue here the topology_max_packages() is 2 but cpu 1 package ID is
> also 2. So everywhere topology_max_packages() is used there may be
> issue as you have to verify the package ID is fine.

Correct.  There are two logical packages on this system, but they are 0 
and 2.

> 
> 
> Repost the patch by adding the above root cause in the description, so
> we know why we need this change.

I will do that.

Best,
-DA

> 
> Thanks,
> Srinivas
> 
>>>
>>> Some comments below.
>>>
>>>> Fixes: 17ca2780458c ("platform/x86/intel: TPMI domain id and CPU
>>>> mapping")
>>>>
>>> Andy already pointed about new line here.
>>>
>>>> Cc: Hans de Goede <hansg@...nel.org>
>>>> Cc: "Ilpo Järvinen" <ilpo.jarvinen@...ux.intel.com>
>>>> Cc: Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>
>>>> Cc: Peter Zijlstra <peterz@...radead.org>
>>>> Cc: Ingo Molnar <mingo@...nel.org>
>>>> Cc: Dan Carpenter <dan.carpenter@...aro.org>
>>>> Cc: David Arcari <darcari@...hat.com>
>>>> Cc: Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
>>>> Cc: Tero Kristo <tero.kristo@...ux.intel.com>
>>>> Cc: linux-kernel@...r.kernel.org
>>>> Signed-off-by: David Arcari <darcari@...hat.com>
>>>> ---
>>>>    drivers/platform/x86/intel/tpmi_power_domains.c | 3 +++
>>>>    1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/drivers/platform/x86/intel/tpmi_power_domains.c
>>>> b/drivers/platform/x86/intel/tpmi_power_domains.c
>>>> index 9d8247bb9cfa..ae5b58679e29 100644
>>>> --- a/drivers/platform/x86/intel/tpmi_power_domains.c
>>>> +++ b/drivers/platform/x86/intel/tpmi_power_domains.c
>>>> @@ -194,6 +194,9 @@ static int tpmi_cpu_online(unsigned int cpu)
>>>>    	if (ret)
>>>>    		return 0;
>>>>    
>>> Need some more information.
>>>
>>> The only case this check is required if
>>> topology_physical_package_id(cpu) is returning greater or equal to
>>> topology_max_packages(). If this true in this case, please check
>>> the
>>> value of info->pkg_id. If this is bad then then some other places
>>> also
>>> this may have issue. info->punit_domain_id is already checked for
>>> valid
>>> value in tpmi_get_logical_id().
>>
>> That is correct.  In the case of the crash we have:
>>
>> crash> p/x __max_logical_packages
>> $1 = 0x2
>>
>> static inline unsigned int topology_max_packages(void)
>> {
>> 	return __max_logical_packages;
>> }
>>
>>
>> $2 = {
>>     hnode = {
>>       next = 0xffff9651bbc37010,
>>       pprev = 0xffffffffc0b7a640 <tpmi_cpu_hash>
>>     },
>>     linux_cpu = 1,
>>     pkg_id = 2 '\002',
>>     punit_thread_id = 0 '\000',
>>     punit_core_id = 0 '\000',
>>     punit_domain_id = 0 '\000'
>> }
>>
>> The pkg_id of 2 leads to the bad reference.
>>
>> FWIW this change has been tested and resolves the issue.
>>
>> Let me know if there is any other information I can provide.  I will
>> be
>> out of the office on Wednesday, so response may be delayed.
>>
>> Best,
>> -DA
>>
>>>
>>> Thanks,
>>> Srinivas
>>>
>>>> +	if (!tpmi_domain_is_valid(info))
>>>> +		return 0;
>>>> +
>>>>    	index = info->pkg_id * MAX_POWER_DOMAINS + info-
>>>>> punit_domain_id;
>>>>    
>>>>    	guard(mutex)(&tpmi_lock);
>>>
>>>
>>
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ