Date:   Thu, 2 Jun 2022 22:42:09 +0200
From:   Arnd Bergmann <arnd@...nel.org>
To:     srinivas pandruvada <srinivas.pandruvada@...ux.intel.com>
Cc:     Len Brown <len.brown@...el.com>,
        Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Daniel Lezcano <daniel.lezcano@...aro.org>,
        Amit Kucheria <amitk@...nel.org>,
        Zhang Rui <rui.zhang@...el.com>, linux-pm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: x86/mce/therm_throt incorrect THERM_STATUS_CLEAR_CORE_MASK?

On Thu, Jun 2, 2022 at 10:10 PM srinivas pandruvada
<srinivas.pandruvada@...ux.intel.com> wrote:
> On Thu, 2022-06-02 at 20:53 +0200, Arnd Bergmann wrote:
> >
> > I wonder how common this problem is. Would it help to add a driver
> > workaround like this?
> This issue affects only certain SKUs. The others are already working as
> expected. These are important log bits for debugging, and we don't want to
> clear them in this path. Printing a warning for the CLX stepping is fine
> without clearing the unrelated bits 13 and 15.
> Read-modify-update should always work when we only update the bits of
> interest. Writing 1s to this register should be a NOP.

The patch I suggested doesn't change the behavior unless the initial
write causes an exception. As long as only buggy microcode rejects the
write, the second write just serves to clear the state that causes the
repeated stack dumps.
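
As a rough illustration of that control flow, here is a minimal user-space
sketch. It is not kernel code: fake_msr, buggy_microcode, mock_wrmsrl_safe()
and clear_log_with_fallback() are hypothetical stand-ins, the value of
THERM_STATUS_PROCHOT_LOG is assumed to be BIT(1), and the rule that a write
faults whenever bit 13 or 15 is set is an assumption made only for this model.

```c
#include <stdint.h>

#define BIT(n)                   (1ULL << (n))
#define THERM_STATUS_PROCHOT_LOG BIT(1)   /* assumed value, for illustration */

/* Hypothetical simulated register; stands in for IA32_THERM_STATUS. */
static uint64_t fake_msr;
/* Nonzero models microcode that rejects the write (assumption). */
static int buggy_microcode;

/*
 * Mock of a fault-tolerant MSR write: returns 0 on success and nonzero
 * if the write would have raised an exception. The fault condition
 * (bit 13 or 15 written as 1) is an assumption of this model.
 */
static int mock_wrmsrl_safe(uint64_t val)
{
	if (buggy_microcode && (val & (BIT(13) | BIT(15))))
		return -1;
	fake_msr = val;
	return 0;
}

/*
 * Try the normal write first. Only if it fails, retry with bits 13
 * and 15 cleared as well. Returns 1 if the fallback path was taken.
 */
static int clear_log_with_fallback(uint64_t mask)
{
	uint64_t val = fake_msr & mask;

	if (mock_wrmsrl_safe(val & ~THERM_STATUS_PROCHOT_LOG) == 0)
		return 0;	/* good microcode: behavior unchanged */

	/* Second write clears the state that keeps the first one failing. */
	mock_wrmsrl_safe(val & ~THERM_STATUS_PROCHOT_LOG &
			 ~BIT(13) & ~BIT(15));
	return 1;
}
```

With buggy_microcode set to 0 the function takes only the first write and
leaves bits 13 and 15 as they were. With it set to 1 the first write is
rejected and the second one also drops those two bits.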

       Arnd

> > @@ -214,7 +214,13 @@ static void clear_therm_status_log(int level)
> >
> >         rdmsrl(msr, msr_val);
> >         msr_val &= mask;
> > -       wrmsrl(msr, msr_val & ~THERM_STATUS_PROCHOT_LOG);
> > +       if (wrmsrl_safe(msr, msr_val & ~THERM_STATUS_PROCHOT_LOG)) {
> > +               /* work around Cascade Lake SKZ57 erratum */
> > +               printk_once(KERN_WARNING "Failed to update IA32_THERM_STATUS, "
> > +                                       "please upgrade microcode\n");
> > +               wrmsrl(msr, msr_val & ~THERM_STATUS_PROCHOT_LOG &
> > +                       ~BIT(13) & ~BIT(15));
> > +       }
> >  }
> >
