[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <0684f47fa5c333b971fd9ccb1f3e86844fb5ddea.camel@linux.intel.com>
Date: Thu, 28 Aug 2025 04:54:55 -0700
From: srinivas pandruvada <srinivas.pandruvada@...ux.intel.com>
To: David Arcari <darcari@...hat.com>, platform-driver-x86@...r.kernel.org
Cc: Hans de Goede <hansg@...nel.org>, Ilpo Järvinen
<ilpo.jarvinen@...ux.intel.com>, Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>, Dan Carpenter <dan.carpenter@...aro.org>,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>, Tero Kristo
<tero.kristo@...ux.intel.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] platform/x86/intel: power-domains validate domain in
tpmi_cpu_online()
On Thu, 2025-08-28 at 07:27 -0400, David Arcari wrote:
>
>
> On 8/27/25 7:49 AM, srinivas pandruvada wrote:
> > On Tue, 2025-08-26 at 21:39 -0400, David Arcari wrote:
> > >
> > > Hi Srinivas,
> > >
> > > On 8/26/25 4:26 PM, srinivas pandruvada wrote:
> > > > Hi David,
> > > >
> > > > On Tue, 2025-08-26 at 12:43 -0400, David Arcari wrote:
> > > > > Although tpmi_get_power_domain_mask() calls
> > > > > tpmi_domain_is_valid()
> > > > > prior to indexing tpmi_power_domain_mask[],
> > > > Because this an API call so that caller parameter needs to be
> > > > sanitized.
> > > >
> > > > > tpmi_cpu_online() does
> > > > > not.
> > > > This is hotplug callback, which should have correct topology
> > > > information.
> > > >
> > > > > In the case where a VM creates non-contiguous package ids
> > > > > the
> > > > > result can be memory corruption. This can be prevented by
> > > > > adding
> > > > > the same validation in tpmi_cpu_online().
> > > > >
> > > >
> > > > This driver is getting loaded means MSR 0x54 is virtualised
> > > > otherwise
> > > > this driver will not load.
> > >
> > > I don't have direct access to the system, but this appears to be
> > > the
> > > case. The driver is reading MSR 0x54:
> > >
> > > drivers/platform/x86/intel/tpmi_power_domains.c:#define
> > > MSR_PM_LOGICAL_ID 0x54
> > > drivers/platform/x86/intel/tpmi_power_domains.c: ret =
> > > rdmsrl_safe(MSR_PM_LOGICAL_ID, &data);
> > > drivers/platform/x86/intel/tpmi_power_domains.c: ret =
> > > rdmsrl_safe(MSR_PM_LOGICAL_ID, &data);
> > >
> > >
> > > > Not sure this is an upstream kernel or not.
> > >
> > > This was not an upstream kernel, but I don't see anything in the
> > > upstream driver that would have prevented the access that is
> > > occurring.
> > >
> > The issue here the topology_max_packages() is 2 but cpu 1 package
> > ID is
> > also 2. So everywhere topology_max_packages() is used there may be
> > issue as you have to verify the package ID is fine.
>
> Correct. There are two logical packages on this system, but they are
> 0
> and 2.
>
> >
> >
> > Repost the patch by adding the above root cause in the description,
> > so
> > we know why we need this change.
>
> I will do that.
>
Also this customer contacted me. In their case they don't GP fault on
the MSR 0x54 so driver is loaded.
Thanks,
Srinivas
> Best,
> -DA
>
> >
> > Thanks,
> > Srinivas
> >
> > > >
> > > > Some comments below.
> > > >
> > > > > Fixes: 17ca2780458c ("platform/x86/intel: TPMI domain id and
> > > > > CPU
> > > > > mapping")
> > > > >
> > > > Andy already pointed about new line here.
> > > >
> > > > > Cc: Hans de Goede <hansg@...nel.org>
> > > > > Cc: "Ilpo Järvinen" <ilpo.jarvinen@...ux.intel.com>
> > > > > Cc: Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>
> > > > > Cc: Peter Zijlstra <peterz@...radead.org>
> > > > > Cc: Ingo Molnar <mingo@...nel.org>
> > > > > Cc: Dan Carpenter <dan.carpenter@...aro.org>
> > > > > Cc: David Arcari <darcari@...hat.com>
> > > > > Cc: Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
> > > > > Cc: Tero Kristo <tero.kristo@...ux.intel.com>
> > > > > Cc: linux-kernel@...r.kernel.org
> > > > > Signed-off-by: David Arcari <darcari@...hat.com>
> > > > > ---
> > > > > drivers/platform/x86/intel/tpmi_power_domains.c | 3 +++
> > > > > 1 file changed, 3 insertions(+)
> > > > >
> > > > > diff --git a/drivers/platform/x86/intel/tpmi_power_domains.c
> > > > > b/drivers/platform/x86/intel/tpmi_power_domains.c
> > > > > index 9d8247bb9cfa..ae5b58679e29 100644
> > > > > --- a/drivers/platform/x86/intel/tpmi_power_domains.c
> > > > > +++ b/drivers/platform/x86/intel/tpmi_power_domains.c
> > > > > @@ -194,6 +194,9 @@ static int tpmi_cpu_online(unsigned int
> > > > > cpu)
> > > > > if (ret)
> > > > > return 0;
> > > > >
> > > > Need some more information.
> > > >
> > > > The only case this check is required if
> > > > topology_physical_package_id(cpu) is returning greater or equal
> > > > to
> > > > topology_max_packages(). If this true in this case, please
> > > > check
> > > > the
> > > > value of info->pkg_id. If this is bad then then some other
> > > > places
> > > > also
> > > > this may have issue. info->punit_domain_id is already checked
> > > > for
> > > > valid
> > > > value in tpmi_get_logical_id().
> > >
> > > That is correct. In the case of the crash we have:
> > >
> > > crash> p/x __max_logical_packages
> > > $1 = 0x2
> > >
> > > static inline unsigned int topology_max_packages(void)
> > > {
> > > return __max_logical_packages;
> > > }
> > >
> > >
> > > $2 = {
> > > hnode = {
> > > next = 0xffff9651bbc37010,
> > > pprev = 0xffffffffc0b7a640 <tpmi_cpu_hash>
> > > },
> > > linux_cpu = 1,
> > > pkg_id = 2 '\002',
> > > punit_thread_id = 0 '\000',
> > > punit_core_id = 0 '\000',
> > > punit_domain_id = 0 '\000'
> > > }
> > >
> > > The pkg_id of 2 leads to the bad reference.
> > >
> > > FWIW this change has been tested and resolves the issue.
> > >
> > > Let me know if there is any other information I can provide. I
> > > will
> > > be
> > > out of the office on Wednesday, so response may be delayed.
> > >
> > > Best,
> > > -DA
> > >
> > > >
> > > > Thanks,
> > > > Srinivas
> > > >
> > > > > + if (!tpmi_domain_is_valid(info))
> > > > > + return 0;
> > > > > +
> > > > > index = info->pkg_id * MAX_POWER_DOMAINS + info-
> > > > > > punit_domain_id;
> > > > >
> > > > > guard(mutex)(&tpmi_lock);
> > > >
> > > >
> > >
> >
>
Powered by blists - more mailing lists