Message-ID: <8739c2c6-a27c-4ab6-ad74-8b95e258737e@linux.intel.com>
Date: Wed, 18 Jun 2025 07:02:55 -0400
From: "Liang, Kan" <kan.liang@...ux.intel.com>
To: Vince Weaver <vincent.weaver@...ne.edu>
Cc: linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Namhyung Kim <namhyung@...nel.org>, Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>, Ian Rogers <irogers@...gle.com>,
Adrian Hunter <adrian.hunter@...el.com>
Subject: Re: [perf] unchecked MSR access error: WRMSR to 0x3f1

On 2025-06-17 11:49 p.m., Vince Weaver wrote:
> On Tue, 17 Jun 2025, Liang, Kan wrote:
>
>> The commit 2dc0572f2cef was triggered by the fake event VLBR_EVENT.
>> But this error should be triggered by the Topdown perf metrics event,
>> INTEL_TD_METRIC_RETIRING, which uses the idx 48 internally.
>>
>> We never support perf metrics events in sampling mode. The PEBS cannot
>> be enabled in counting mode. So it's weird the cpuc->pebs_enabled has
>> the idx 48 set.
>>
>> The recent change I did for the PEBS is commit e02e9b0374c3
>> "perf/x86/intel: Support PEBS counters snapshotting". But it should not
>> impact the above.
>>
>> Could you please help on the below questions?
>> - It only happens on the p-core, right?
>
> how would I tell? I don't think the error message says what CPU it
> happens on?
No, the error message doesn't say. I just wanted to check whether you have
any extra information, because the Topdown perf metrics are only supported
on the p-core. I want to understand whether the code messes up on the e-core.
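
If a CPU number can be obtained by other means (for example from a
backtrace or from the fuzzer's own logging), one way to tell whether it is
a p-core on a hybrid machine is to look it up in the CPU list exported by
the cpu_core PMU in sysfs. The following is only a minimal sketch under
that assumption. The helper names cpu_in_list() and cpu_is_pcore() are
made up for illustration and are not kernel or perf tool APIs.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Return true if `cpu` appears in a sysfs-style CPU list such as "0-7,16-23". */
static bool cpu_in_list(const char *list, int cpu)
{
    const char *p = list;

    while (*p) {
        char *end;
        long lo = strtol(p, &end, 10);
        long hi = lo;

        if (end == p)           /* no number at this position: stop parsing */
            break;
        if (*end == '-') {      /* a range of the form "lo-hi" */
            p = end + 1;
            hi = strtol(p, &end, 10);
            if (end == p)
                break;
        }
        if (cpu >= lo && cpu <= hi)
            return true;
        p = end;
        while (*p == ',' || *p == '\n' || *p == ' ')
            p++;
    }
    return false;
}

/*
 * Hypothetical helper: returns 1 if `cpu` is listed by the cpu_core PMU,
 * 0 if it is not, and -1 if the file cannot be read (for example on a
 * non-hybrid machine, where this PMU directory does not exist).
 */
static int cpu_is_pcore(int cpu)
{
    char buf[4096];
    FILE *f = fopen("/sys/bus/event_source/devices/cpu_core/cpus", "r");

    if (!f)
        return -1;
    if (!fgets(buf, sizeof(buf), f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return cpu_in_list(buf, cpu) ? 1 : 0;
}
```

Calling cpu_is_pcore(n) from a small test program gives a quick answer for
CPU n. It does not help when the CPU number itself is unknown, which is the
case for the error message discussed here.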
>
>> - Which kernel base do you use? Is it 6.16-rc2?
>
> I was running just before -rc1. I've updated to current git but didn't
> realize the throttle fix hadn't made it upstream yet so managed to lock up
> the machine and not sure when I'll be able to get over to reboot it.
>
The throttle fix is not in rc2 either. I guess it should be included in rc3.
>> - Can this be easily reproduced?
>
> probably. It's another thing that's a pain to check because it's a
> WARN_ONCE I think so I have to reboot in order to see. Even if it's not
> reproducible the fuzzer usually hits it within a few hours.
OK. I will try to reproduce it locally.
>
>> Is it possible to bisect the error commit? (Maybe start from the
>> commit e02e9b0374c3?)
>
> Maybe but I'd only like to do that as a last resort as it's a pain to
> build and reboot kernels on this machine (for secureboot and other
> reasons).
Sure.
Thanks,
Kan
> Also I suppose I'd have to manually apply the throttle patch
> while bisecting.
>
> Vince Weaver
> vincent.weaver@...ne.edu
>