Message-ID: <nf6gbtgn7lu4pvewuhrwca2efaxrkdhxazw3oktaixn5q5yg5r@7y74irve5udw>
Date: Thu, 20 Feb 2025 12:27:35 -0600
From: Lucas De Marchi <lucas.demarchi@...el.com>
To: Dave Hansen <dave.hansen@...el.com>
CC: <linux-perf-users@...r.kernel.org>, <x86@...nel.org>,
	<linux-kernel@...r.kernel.org>, <dave.hansen@...ux.intel.com>, Zhang Rui
	<rui.zhang@...el.com>, Kan Liang <kan.liang@...ux.intel.com>, Peter Zijlstra
	<peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, Ulisses Furquim
	<ulisses.furquim@...el.com>, <intel-xe@...ts.freedesktop.org>,
	<intel-gfx@...ts.freedesktop.org>
Subject: Re: [PATCH] perf/x86/rapl: Fix PP1 event for Intel Meteor/Lunar Lake

On Thu, Feb 20, 2025 at 08:28:01AM -0800, Dave Hansen wrote:
>On 2/20/25 07:36, Lucas De Marchi wrote:
>> On some boots the read of MSR_PP1_ENERGY_STATUS msr returns 0, causing
>> perf_msr_probe() to make the power/events/energy-gpu event non-visible.
>> When that happens, the msr always reads 0 until the graphics module (i915
>> for Meteor Lake, xe for Lunar Lake) is loaded. Then it starts returning
>> something different and re-loading the rapl module "fixes" it.
>
>What's the root cause here? Did the kernel do something funky? Or is
>this a hardware bug?

From what I can see, the kernel reads the value and decides that "if
it's 0, the platform doesn't really have that", which is not really true.
On these platforms the msr sometimes keeps returning 0 until the gpu is
later powered on, which only happens when xe / i915 probes.
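
To make the failure mode concrete, here is a minimal userspace sketch of
that decision. It is not the actual perf_msr_probe() code: the function
names, the msr_read_fn callback and the fake reader are all hypothetical,
and the second probe variant is only one possible alternative, not
necessarily what the patch does. 0x641 is the MSR_PP1_ENERGY_STATUS
address.

```c
#include <stdbool.h>
#include <stdint.h>

#define PP1_ENERGY_STATUS 0x641 /* MSR_PP1_ENERGY_STATUS */

/* Hypothetical stand-in for an msr read: 0 on success, non-zero on fault. */
typedef int (*msr_read_fn)(uint32_t msr, uint64_t *val);

/* Probe as described above: a faulting read OR a value of 0 hides the event. */
static bool probe_zero_means_absent(msr_read_fn rd, uint32_t msr)
{
	uint64_t val;

	if (rd(msr, &val) != 0)
		return false;
	return val != 0;
}

/* Assumed alternative: only a faulting read hides the event. */
static bool probe_readable_means_present(msr_read_fn rd, uint32_t msr)
{
	uint64_t val;

	return rd(msr, &val) == 0;
}

/* Simulated platform: the counter reads 0 until the gpu is powered on. */
static bool gpu_powered;

static int fake_read_pp1(uint32_t msr, uint64_t *val)
{
	(void)msr;
	*val = gpu_powered ? 1234 : 0; /* 1234 is an arbitrary non-zero reading */
	return 0;
}
```

With gpu_powered false, probe_zero_means_absent() reports the event as
missing even though the msr is readable, which matches what happens when
rapl loads before xe / i915.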

But what I don't really understand is why the behavior changes from one
boot to another. I'm assuming it depends on some funky firmware
behavior.

Lucas De Marchi
