linux-kernel - Re: [PATCH 1/2] perf/x86/intel: Fix IA32_PMC_x_CFG

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <15e4f7a6-7c68-4d89-8813-cd142eb4c416@linux.intel.com>
Date: Tue, 3 Jun 2025 10:14:35 +0800
From: "Mi, Dapeng" <dapeng1.mi@...ux.intel.com>
To: Ingo Molnar <mingo@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
 Arnaldo Carvalho de Melo <acme@...nel.org>,
 Namhyung Kim <namhyung@...nel.org>, Ian Rogers <irogers@...gle.com>,
 Adrian Hunter <adrian.hunter@...el.com>,
 Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
 Kan Liang <kan.liang@...ux.intel.com>, Andi Kleen <ak@...ux.intel.com>,
 Eranian Stephane <eranian@...gle.com>, linux-kernel@...r.kernel.org,
 linux-perf-users@...r.kernel.org, Dapeng Mi <dapeng1.mi@...el.com>
Subject: Re: [PATCH 1/2] perf/x86/intel: Fix IA32_PMC_x_CFG_B MSRs access
 error


On 5/31/2025 4:01 PM, Ingo Molnar wrote:
> * Dapeng Mi <dapeng1.mi@...ux.intel.com> wrote:
>
>> When running perf_fuzzer on PTL, sometimes the below "unchecked MSR
>>  access error" is seen when accessing IA32_PMC_x_CFG_B MSRs.
>>
>> [   55.611268] unchecked MSR access error: WRMSR to 0x1986 (tried to write 0x0000000200000001) at rIP: 0xffffffffac564b28 (native_write_msr+0x8/0x30)
>> [   55.611280] Call Trace:
>> [   55.611282]  <TASK>
>> [   55.611284]  ? intel_pmu_config_acr+0x87/0x160
>> [   55.611289]  intel_pmu_enable_acr+0x6d/0x80
>> [   55.611291]  intel_pmu_enable_event+0xce/0x460
>> [   55.611293]  x86_pmu_start+0x78/0xb0
>> [   55.611297]  x86_pmu_enable+0x218/0x3a0
>> [   55.611300]  ? x86_pmu_enable+0x121/0x3a0
>> [   55.611302]  perf_pmu_enable+0x40/0x50
>> [   55.611307]  ctx_resched+0x19d/0x220
>> [   55.611309]  __perf_install_in_context+0x284/0x2f0
>> [   55.611311]  ? __pfx_remote_function+0x10/0x10
>> [   55.611314]  remote_function+0x52/0x70
>> [   55.611317]  ? __pfx_remote_function+0x10/0x10
>> [   55.611319]  generic_exec_single+0x84/0x150
>> [   55.611323]  smp_call_function_single+0xc5/0x1a0
>> [   55.611326]  ? __pfx_remote_function+0x10/0x10
>> [   55.611329]  perf_install_in_context+0xd1/0x1e0
>> [   55.611331]  ? __pfx___perf_install_in_context+0x10/0x10
>> [   55.611333]  __do_sys_perf_event_open+0xa76/0x1040
>> [   55.611336]  __x64_sys_perf_event_open+0x26/0x30
>> [   55.611337]  x64_sys_call+0x1d8e/0x20c0
>> [   55.611339]  do_syscall_64+0x4f/0x120
>> [   55.611343]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>
>> On PTL, GP counter 0 and 1 doesn't support auto counter reload feature,
>> thus it would trigger a #GP when trying to write 1 on bit 0 of CFG_B MSR
>> which requires to enable auto counter reload on GP counter 0.
>>
>> The root cause of causing this issue is the check for auto counter
>> reload (ACR) counter mask from user space is incorrect in
>> intel_pmu_acr_late_setup() helper. It leads to an invalid ACR counter
>> mask from user space could be set into hw.config1 and then written into
>> CFG_B MSRs and trigger the MSR access warning.
>>
>> e.g., User may create a perf event with ACR counter mask (config2=0xcb),
>> and there is only 1 event created, so "cpuc->n_events" is 1.
>>
>> The correct check condition should be "i + idx >= cpuc->n_events"
>> instead of "i + idx > cpuc->n_events" (it looks a typo). Otherwise,
>> the counter mask would traverse twice and an invalid "cpuc->assign[1]"
>> bit (bit 0) is set into hw.config1 and cause MSR accessing error.
>>
>> Besides, also check if the ACR counter mask corresponding events are
>> ACR events. If not, filter out these counter mask. If a event is not a
>> ACR event, it could be scheduled to an HW counter which doesn't support
>> ACR. It's invalid to add their counter index in ACR counter mask.
>>
>> Furthermore, remove the WARN_ON_ONCE() since it's easily triggered as
>> user could set any invalid ACR counter mask and the warning message
>> could mislead users.
>>
>> Fixes: ec980e4facef ("perf/x86/intel: Support auto counter reload")
>> Signed-off-by: Dapeng Mi <dapeng1.mi@...ux.intel.com>
>> ---
>>  arch/x86/events/intel/core.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>> index 3a319cf6d364..8d046b1a237e 100644
>> --- a/arch/x86/events/intel/core.c
>> +++ b/arch/x86/events/intel/core.c
>> @@ -2994,7 +2994,8 @@ static void intel_pmu_acr_late_setup(struct cpu_hw_events *cpuc)
>>  			if (event->group_leader != leader->group_leader)
>>  				break;
>>  			for_each_set_bit(idx, (unsigned long *)&event->attr.config2, X86_PMC_IDX_MAX) {
>> -				if (WARN_ON_ONCE(i + idx > cpuc->n_events))
>> +				if (i + idx >= cpuc->n_events ||
>> +				    !is_acr_event_group(cpuc->event_list[i + idx]))
>>  					return;
> Is this a normal condition?
>
>  - If it's normal then the 'return' is destructive, isn't it? Shouldn't 
>    it be a 'break', if this condition is legitimate?
>
>  - If it's not normal, then the WARN_ON_ONCE() was justified, right?

It's not normal.Strictly speaking, it's an invalid user configuration. It
looks not reasonable to trigger a kernel space warning for an invalid user
space configuration. It would mislead users and let users think there is
something wrong in kernel. That's why to remove the WARN_ON_ONCE(). Thanks.


>
> Thanks,
>
> 	Ingo