Message-ID: <8b53854e-f407-4c58-badc-01327d2d4be0@linux.intel.com>
Date: Thu, 30 Oct 2025 09:37:55 +0800
From: "Mi, Dapeng" <dapeng1.mi@...ux.intel.com>
To: Zide Chen <zide.chen@...el.com>, Peter Zijlstra <peterz@...radead.org>,
 Ingo Molnar <mingo@...hat.com>, Arnaldo Carvalho de Melo <acme@...nel.org>,
 Namhyung Kim <namhyung@...nel.org>, Ian Rogers <irogers@...gle.com>,
 Adrian Hunter <adrian.hunter@...el.com>,
 Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
 Andi Kleen <ak@...ux.intel.com>, Eranian Stephane <eranian@...gle.com>
Cc: linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
 Xudong Hao <xudong.hao@...el.com>, Falcon Thomas <thomas.falcon@...el.com>,
 Steve Wahl <steve.wahl@....com>
Subject: Re: [PATCH 1/2] perf/x86/intel/uncore: Skip discovery table for
 offline dies
On 10/30/2025 6:07 AM, Zide Chen wrote:
> This warning can be triggered if NUMA is disabled and the system
> boots with fewer CPUs than the number of CPUs in die 0.
>
> WARNING: CPU: 9 PID: 7257 at uncore.c:1157 uncore_pci_pmu_register+0x136/0x160 [intel_uncore]
>
> Currently, the discovery table continues to be parsed even if all CPUs
> in the associated die are offline. This can lead to an array overflow
> at "pmu->boxes[die] = box" in uncore_pci_pmu_register(), which may
> trigger the warning above or cause other issues.
>
> Reported-by: Steve Wahl <steve.wahl@....com>
> Fixes: edae1f06c2cd ("perf/x86/intel/uncore: Parse uncore discovery tables")
> Signed-off-by: Zide Chen <zide.chen@...el.com>
> ---
>  arch/x86/events/intel/uncore.c           | 4 ++++
>  arch/x86/events/intel/uncore_discovery.c | 2 +-
>  2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
> index ee586eb714ec..5c3aeea5c78d 100644
> --- a/arch/x86/events/intel/uncore.c
> +++ b/arch/x86/events/intel/uncore.c
> @@ -1380,6 +1380,10 @@ static void uncore_pci_pmus_register(void)
>  
>  		for (node = rb_first(type->boxes); node; node = rb_next(node)) {
>  			unit = rb_entry(node, struct intel_uncore_discovery_unit, node);
> +
> +			if (WARN_ON(unit->die >= uncore_max_dies()))
Based on my understanding, this seems to be a valid situation that could
happen. If so, we'd better remove the WARN_ON() so it doesn't mislead users.
Thanks.
> +				continue;
> +
>  			pdev = pci_get_domain_bus_and_slot(UNCORE_DISCOVERY_PCI_DOMAIN(unit->addr),
>  							   UNCORE_DISCOVERY_PCI_BUS(unit->addr),
>  							   UNCORE_DISCOVERY_PCI_DEVFN(unit->addr));
> diff --git a/arch/x86/events/intel/uncore_discovery.c b/arch/x86/events/intel/uncore_discovery.c
> index 1bf6e4288577..d6aee12139f1 100644
> --- a/arch/x86/events/intel/uncore_discovery.c
> +++ b/arch/x86/events/intel/uncore_discovery.c
> @@ -388,7 +388,7 @@ static bool intel_uncore_has_discovery_tables_pci(int *ignore)
>  				     (val & UNCORE_DISCOVERY_DVSEC2_BIR_MASK) * UNCORE_DISCOVERY_BIR_STEP;
>  
>  			die = get_device_die_id(dev);
> -			if (die < 0)
> +			if ((die < 0) || (die >= uncore_max_dies()))
>  				continue;
>  
>  			parse_discovery_table(dev, die, bar_offset, &parsed, ignore);
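
For illustration only, the bounds check discussed in this thread can be
sketched as a small user-space C fragment. It is not the kernel code: the
names `struct unit`, `die_in_range()` and `register_units()` are hypothetical,
and `max_dies` merely stands in for the value returned by uncore_max_dies().
It shows a unit on an out-of-range die being skipped silently, without a
warning, so the write into the per-die array never goes out of bounds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a discovery unit; only the die id matters here. */
struct unit {
	int die;
};

/* True when 'die' is a valid index into an array of 'max_dies' entries. */
static bool die_in_range(int die, int max_dies)
{
	return die >= 0 && die < max_dies;
}

/*
 * Store each unit in boxes[unit->die], silently skipping units whose die is
 * outside [0, max_dies). 'boxes' must have room for max_dies pointers.
 * Returns the number of units stored.
 */
static int register_units(const struct unit *units, size_t n,
			  const struct unit **boxes, int max_dies)
{
	int stored = 0;

	assert(max_dies > 0);

	for (size_t i = 0; i < n; i++) {
		/* Assumed-valid case (e.g. an offline die): skip, no warning. */
		if (!die_in_range(units[i].die, max_dies))
			continue;

		boxes[units[i].die] = &units[i];
		stored++;
	}

	return stored;
}
```

With two dies and units on dies 0, 3, -1 and 1, only the units on dies 0 and
1 are stored; the other two are skipped.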