Message-ID: <CAK8P3a1mHe3TkZa443fzsPnGUP1XT3w-DN3U5KAL6NBhc2nEsw@mail.gmail.com>
Date: Thu, 2 Jun 2022 18:18:05 +0200
From: Arnd Bergmann <arnd@...nel.org>
To: srinivas pandruvada <srinivas.pandruvada@...ux.intel.com>
Cc: Len Brown <len.brown@...el.com>,
Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>,
"Rafael J. Wysocki" <rafael@...nel.org>,
Daniel Lezcano <daniel.lezcano@...aro.org>,
Amit Kucheria <amitk@...nel.org>,
Zhang Rui <rui.zhang@...el.com>, linux-pm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: x86/mce/therm_throt incorrect THERM_STATUS_CLEAR_CORE_MASK?
On Thu, Jun 2, 2022 at 5:52 PM srinivas pandruvada
<srinivas.pandruvada@...ux.intel.com> wrote:
>
> On Thu, 2022-06-02 at 11:19 +0200, Arnd Bergmann wrote:
> > I have a Xeon W-2265 (family 6, model 85, stepping 7) that started
> > constantly spewing messages from the therm_throt driver after one
> > core overheated:
> >
> I think this is a Cascade Lake system. Have you tried the latest
> microcode?
Thanks for your quick reply. I have now installed the latest microcode,
0x5003302 (manually, because the package provided by the distro still
ships version 0x5003102).
After that, I tried writing the value 0x2a80 from userspace, and
that did not cause a trap, so I assume the microcode update fixed it.
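
For anyone who wants to repeat that check, here is a minimal sketch of how
such a write could be done from userspace. It is not necessarily how the
test above was performed: the MSR address (0x19c, IA32_THERM_STATUS, going
by the subject line), the use of the msr driver's /dev/cpu/N/msr interface,
and the helper names set_bits() and write_msr() are all assumptions made
for illustration. The write needs root and a loaded msr module, so it is
only defined here, not called.

```python
import os
import struct

# Assumption: the write targeted IA32_THERM_STATUS, as the subject suggests.
MSR_IA32_THERM_STATUS = 0x19C


def set_bits(value):
    """Return the positions of the bits that are set in a 64-bit value."""
    return [bit for bit in range(64) if (value >> bit) & 1]


def write_msr(cpu, msr, value):
    """Write a 64-bit value to an MSR on one CPU via the msr driver.

    Hypothetical helper: requires root and the msr kernel module. The
    device expects an 8-byte write at a file offset equal to the MSR
    number. A write the CPU rejects typically surfaces as an OSError.
    """
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_WRONLY)
    try:
        os.pwrite(fd, struct.pack("<Q", value), msr)
    finally:
        os.close(fd)


# 0x2a80 sets bits 7, 9, 11 and 13.
print(set_bits(0x2A80))
```

Calling write_msr(0, MSR_IA32_THERM_STATUS, 0x2A80) as root would attempt
the write on CPU 0; decoding the value first makes it easier to see which
bits are being touched.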
It's hard to be sure, as the system has only run into the broken
state twice during its life, and now it's fine. I'll reply here if it
ever comes back with the new microcode.
Thanks a lot!
Arnd