lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <adbc0e8b-199a-42af-a45e-cb3791923554@redhat.com>
Date: Tue, 26 Aug 2025 21:39:35 -0400
From: David Arcari <darcari@...hat.com>
To: srinivas pandruvada <srinivas.pandruvada@...ux.intel.com>,
 platform-driver-x86@...r.kernel.org
Cc: Hans de Goede <hansg@...nel.org>,
 Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>,
 Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...nel.org>,
 Dan Carpenter <dan.carpenter@...aro.org>,
 Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
 Tero Kristo <tero.kristo@...ux.intel.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] platform/x86/intel: power-domains validate domain in
 tpmi_cpu_online()


Hi Srinivas,

On 8/26/25 4:26 PM, srinivas pandruvada wrote:
> Hi David,
> 
> On Tue, 2025-08-26 at 12:43 -0400, David Arcari wrote:
>> Although tpmi_get_power_domain_mask() calls tpmi_domain_is_valid()
>> prior to indexing tpmi_power_domain_mask[],
> Because this an API call so that caller parameter needs to be
> sanitized.
> 
>>   tpmi_cpu_online() does
>> not.
> This is hotplug callback, which should have correct topology
> information.
> 
>>   In the case where a VM creates non-contiguous package ids the
>> result can be memory corruption. This can be prevented by adding
>> the same validation in tpmi_cpu_online().
>>
> 
> This driver is getting loaded means MSR 0x54 is virtualised otherwise
> this driver will not load.

I don't have direct access to the system, but this appears to be the 
case.  The driver is reading MSR 0x54:

drivers/platform/x86/intel/tpmi_power_domains.c:#define 
MSR_PM_LOGICAL_ID       0x54
drivers/platform/x86/intel/tpmi_power_domains.c:        ret = 
rdmsrl_safe(MSR_PM_LOGICAL_ID, &data);
drivers/platform/x86/intel/tpmi_power_domains.c:        ret = 
rdmsrl_safe(MSR_PM_LOGICAL_ID, &data);


> Not sure this is an upstream kernel or not.

This was not an upstream kernel, but I don't see anything in the 
upstream driver that would have prevented the access that is occurring.

> 
> Some comments below.
> 
>> Fixes: 17ca2780458c ("platform/x86/intel: TPMI domain id and CPU
>> mapping")
>>
> Andy already pointed about new line here.
> 
>> Cc: Hans de Goede <hansg@...nel.org>
>> Cc: "Ilpo Järvinen" <ilpo.jarvinen@...ux.intel.com>
>> Cc: Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>
>> Cc: Peter Zijlstra <peterz@...radead.org>
>> Cc: Ingo Molnar <mingo@...nel.org>
>> Cc: Dan Carpenter <dan.carpenter@...aro.org>
>> Cc: David Arcari <darcari@...hat.com>
>> Cc: Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
>> Cc: Tero Kristo <tero.kristo@...ux.intel.com>
>> Cc: linux-kernel@...r.kernel.org
>> Signed-off-by: David Arcari <darcari@...hat.com>
>> ---
>>   drivers/platform/x86/intel/tpmi_power_domains.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/platform/x86/intel/tpmi_power_domains.c
>> b/drivers/platform/x86/intel/tpmi_power_domains.c
>> index 9d8247bb9cfa..ae5b58679e29 100644
>> --- a/drivers/platform/x86/intel/tpmi_power_domains.c
>> +++ b/drivers/platform/x86/intel/tpmi_power_domains.c
>> @@ -194,6 +194,9 @@ static int tpmi_cpu_online(unsigned int cpu)
>>   	if (ret)
>>   		return 0;
>>   
> Need some more information.
> 
> The only case this check is required if
> topology_physical_package_id(cpu) is returning greater or equal to
> topology_max_packages(). If this true in this case, please check the
> value of info->pkg_id. If this is bad then then some other places also
> this may have issue. info->punit_domain_id is already checked for valid
> value in tpmi_get_logical_id().

That is correct.  In the case of the crash we have:

crash> p/x __max_logical_packages
$1 = 0x2

static inline unsigned int topology_max_packages(void)
{
	return __max_logical_packages;
}


$2 = {
   hnode = {
     next = 0xffff9651bbc37010,
     pprev = 0xffffffffc0b7a640 <tpmi_cpu_hash>
   },
   linux_cpu = 1,
   pkg_id = 2 '\002',
   punit_thread_id = 0 '\000',
   punit_core_id = 0 '\000',
   punit_domain_id = 0 '\000'
}

The pkg_id of 2 leads to the bad reference.

FWIW this change has been tested and resolves the issue.

Let me know if there is any other information I can provide.  I will be 
out of the office on Wednesday, so response may be delayed.

Best,
-DA

> 
> Thanks,
> Srinivas
> 
>> +	if (!tpmi_domain_is_valid(info))
>> +		return 0;
>> +
>>   	index = info->pkg_id * MAX_POWER_DOMAINS + info-
>>> punit_domain_id;
>>   
>>   	guard(mutex)(&tpmi_lock);
> 
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ