Message-ID: <9076062a9c0daafcc23bff616864299abc0353e8.camel@rong.moe>
Date: Tue, 07 Oct 2025 01:07:45 +0800
From: Rong Zhang <i@...g.moe>
To: Borislav Petkov <bp@...en8.de>
Cc: "Mario Limonciello (AMD) (kernel.org)" <superm1@...nel.org>, Thomas
Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, Dave Hansen
<dave.hansen@...ux.intel.com>, x86@...nel.org, "H. Peter Anvin"
<hpa@...or.com>, Yazen Ghannam <yazen.ghannam@....com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86/CPU/AMD: Prevent reset reasons from being retained
among boots
Hi Borislav,
On Mon, 2025-10-06 at 15:31 +0200, Borislav Petkov wrote:
> On Tue, Sep 30, 2025 at 06:18:30PM +0800, Rong Zhang wrote:
> > A user may feel confused: Two bits are set, but only one reason is
> > reported. Hmm... Is there a hidden failure?
>
> Why would you assume that 1 set bit == 1 failure reason?
When I first saw the for-loop, I noticed it didn't break once a reason
was found.
And Documentation/arch/x86/amd-debugging.rst says:
    When a random reboot occurs, the high-level reason for the reboot is
    stored in a register that will persist onto the next boot.
Together, these made me assume that the only purpose of the register was
to store reboot reasons, which in turn made me assume 1 set bit == 1
reason.
Yeah, I realized I had made a mistake here once I read the PPR. Perhaps
we could improve the wording in the documentation? For example, it could
mention that bits not listed there have nothing to do with reboot
reasons.
> > Unless the user has read the PPR, it's hard to realize BIT(11) is
> > already set in the reset value. The debug message is here to help:
> >
> > Cleared system reset reasons [0x08000800 => 0x00000800]
> > ^ ^ ^ ^
> >
> > Now the user realizes that BIT(11) has nothing to do with reboot
> > reasons.
> >
> > This was literally the confusion I experienced. I had to take some time
> > looking for an appropriate public PPR and reading the PPR before
> > realizing this fact.
>
> Please explain in detail what confusion you were experiencing so that we can
> address it properly.
See above. At that time, I assumed BIT(11) was an undocumented reboot
reason. The confusion went away once I read the PPR, which says "Reset:
0000_0800h."
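To illustrate the arithmetic behind the debug message, here is a minimal
user-space sketch. It is not the kernel code: the helper names
(cleared_bits, retained_bits, reset_reason_example) are made up for
illustration, and the only values used are the ones quoted in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Reset value quoted from the PPR ("Reset: 0000_0800h"), i.e. BIT(11). */
#define RESET_VALUE UINT32_C(0x00000800)

/* Hypothetical helper: bits that were set before the write and clear after. */
static inline uint32_t cleared_bits(uint32_t before, uint32_t after)
{
	return before & ~after;
}

/* Hypothetical helper: bits that are set both before and after the write. */
static inline uint32_t retained_bits(uint32_t before, uint32_t after)
{
	return before & after;
}

/* Checks the values from the debug message; returns 0 if they hold. */
int reset_reason_example(void)
{
	const uint32_t before = UINT32_C(0x08000800);
	const uint32_t after = UINT32_C(0x00000800);

	/* Only BIT(27) was actually cleared. */
	assert(cleared_bits(before, after) == (UINT32_C(1) << 27));
	/* BIT(11) is part of the reset value and stays set. */
	assert(retained_bits(before, after) == RESET_VALUE);
	return 0;
}
```

So for [0x08000800 => 0x00000800], a single bit (27) is cleared, and the
bit that stays set (11) is exactly the reset value, not a second reason.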
That said, I am OK with removing the debug message that prints the
cleared value. If we decide that it is better to remove it, I will send
a [PATCH v2].
> Thx.
Thanks,
Rong