Message-ID: <Y2vQRMyOndQtG/yJ@a4bf019067fa.jf.intel.com>
Date:   Wed, 9 Nov 2022 08:07:32 -0800
From:   Ashok Raj <ashok.raj@...el.com>
To:     Borislav Petkov <bp@...en8.de>
CC:     Thomas Gleixner <tglx@...utronix.de>,
        LKML Mailing List <linux-kernel@...r.kernel.org>,
        X86-kernel <x86@...nel.org>, Tony Luck <tony.luck@...el.com>,
        Dave Hansen <dave.hansen@...el.com>,
        Arjan van de Ven <arjan.van.de.ven@...el.com>,
        Andy Lutomirski <luto@...nel.org>,
        "Jacon Jun Pan" <jacob.jun.pan@...el.com>,
        Tom Lendacky <thomas.lendacky@....com>,
        "Kai Huang" <kai.huang@...el.com>,
        Andrew Cooper <andrew.cooper3@...rix.com>,
        "Ashok Raj" <ashok.raj@...el.com>
Subject: Re: [v2 03/13] x86/microcode/intel: Fix a hang if early loading
 microcode fails

On Wed, Nov 09, 2022 at 12:25:02PM +0100, Borislav Petkov wrote:
> On Thu, Nov 03, 2022 at 05:58:51PM +0000, Ashok Raj wrote:
> > When early loading of microcode fails for any reason other than the wrong
> > family-model-stepping, Linux can get into an infinite loop retrying the
> > same failed load.
> > 
> > A single retry is needed to handle any mixed stepping case.
> > 
> > Assume we have a microcode that fails to load for some reason.
> > load_ucode_ap() seems to retry if the loading fails. But it searches for
> 
> Seems to retry because we were supporting mixed revisions. Which we do
> not now.

The retry itself wasn't the problem. Hitting the same failed microcode over
and over is the problem, and that is called out in the commit log.

As part of dropping mixed stepping, we can drop this retry.

Maybe the right way is to remember whether the load failed on the BSP. If
it did, there is no point in trying to apply it on the APs.

reload_early_microcode->reload_ucode_intel()
                               ->apply_microcode_intel() 

We aren't checking whether the early load failed on the BSP. We should save
that result and skip loading on all APs.
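
A minimal user-space sketch of that idea, to make the intent concrete. All
names here (bsp_early_load_failed, apply_patch, load_ucode_bsp_sketch,
load_ucode_ap_sketch) are hypothetical stand-ins, not the kernel's actual
symbols, and apply_patch() only simulates a load that fails:

```c
#include <stdbool.h>

/* Hypothetical result codes; not the kernel's enum ucode_state. */
enum sketch_state { SKETCH_OK, SKETCH_ERROR };

/* Remembered outcome of the early load on the boot CPU (BSP). */
static bool bsp_early_load_failed;

/*
 * Stand-in for the real apply routine: counts every attempt and fails
 * whenever the (simulated) patch is bad.
 */
static enum sketch_state apply_patch(bool patch_is_bad, int *attempts)
{
	(*attempts)++;
	return patch_is_bad ? SKETCH_ERROR : SKETCH_OK;
}

/* BSP path: try once and record a failure for later. */
static void load_ucode_bsp_sketch(bool patch_is_bad, int *attempts)
{
	if (apply_patch(patch_is_bad, attempts) != SKETCH_OK)
		bsp_early_load_failed = true;
}

/* AP path: skip the load entirely if the BSP already failed. */
static void load_ucode_ap_sketch(bool patch_is_bad, int *attempts)
{
	if (bsp_early_load_failed)
		return;

	apply_patch(patch_is_bad, attempts);
}
```

With a bad patch, the BSP makes the only attempt and every AP returns
early, so the same failing microcode is never retried.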

> 
> And if you say "seems" then this sounds like the problem hasn't been
> analyzed properly. If this can happen with the current code, then this
> needs to be fixed in stable. So, how do you trigger exactly?
> 
> I'd like to reproduce it myself.

Certainly. Start with the fms+pf of the platform you are testing.

- Take a microcode file from the distribution that is meant for a different
  fms than the platform you are testing.
- Fake the external header data: change it to the fms+pf that you want the
  microcode match to succeed for.
- Recompute all checksums and use that file instead of the original file.
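
The steps above can be sketched roughly as follows. This is not the tool
mentioned in this mail; it is an illustration that assumes a single update
image with no extended signature table, held fully in memory on a
little-endian host. The struct follows the 48-byte Intel microcode update
header, and the names ucode_hdr, sum_dwords and retarget_ucode are made up
for the example:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 48-byte Intel microcode update header. */
struct ucode_hdr {
	uint32_t hdrver;
	uint32_t rev;
	uint32_t date;
	uint32_t sig;		/* family-model-stepping signature */
	uint32_t cksum;
	uint32_t ldrver;
	uint32_t pf;		/* platform flags bitmask */
	uint32_t datasize;
	uint32_t totalsize;
	uint32_t reserved[3];
};

/* Sum the buffer as 32-bit words, wrapping modulo 2^32. */
static uint32_t sum_dwords(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0, w;
	size_t i;

	for (i = 0; i + 4 <= len; i += 4) {
		memcpy(&w, buf + i, sizeof(w));
		sum += w;
	}
	return sum;
}

/*
 * Overwrite sig and pf in the header, then set cksum so that all dwords
 * of the image sum to zero. Returns 0 on success, -1 if the buffer is
 * smaller than a header or not a multiple of 4 bytes.
 */
static int retarget_ucode(uint8_t *buf, size_t len, uint32_t sig, uint32_t pf)
{
	struct ucode_hdr h;

	if (len < sizeof(h) || len % 4)
		return -1;

	memcpy(&h, buf, sizeof(h));
	h.sig = sig;
	h.pf = pf;
	h.cksum = 0;
	memcpy(buf, &h, sizeof(h));

	/* With cksum zeroed, the negated sum makes the total wrap to 0. */
	h.cksum = 0u - sum_dwords(buf, len);
	memcpy(buf, &h, sizeof(h));
	return 0;
}
```

An image with an extended signature table would also need that table's
own checksums recomputed, which this sketch does not attempt.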

I accidentally ran into it since I had a copy of debug uCode that requires
additional steps before loading.

I have a tool that I can change to give you some production microcode that
will fail on your platform. Just provide me with the fms+pf values, and I
can provide one for your test.

Let me know if you need one for testing.

> 
> As to this patch: it should simply be removing the retrying instead of
> doing silly crap like
> 
> 	bool retried = false;
> 
> ...
> 
> In light of how a lot has changed since last time, yes, please redo the
> patchset ontop of tip:x86/microcode, keeping in mind now that we don't
> support mixed revisions anymore.
> 
> Just like dhansen said, you can split it in fixes and new features so
> that it is not too many patches at once - your call.


That makes sense, I'll send the bug fix patches separately.

Cheers,
Ashok
