linux-kernel - [patch 0/2] x86/microcode: Make 32-bit early loading robust and correct

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <20230822121936.476984181@linutronix.de>
Date:   Tue, 22 Aug 2023 14:20:07 +0200 (CEST)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     LKML <linux-kernel@...r.kernel.org>
Cc:     Borislav Petkov <bp@...en8.de>, x86@...nel.org,
        Ashok Raj <ashok.raj@...el.com>,
        Arjan van de Ven <arjan@...ux.intel.com>,
        Fenghua Yu <fenghua.yu@...el.com>, Peter Anvin <hpa@...or.com>
Subject: [patch 0/2] x86/microcode: Make 32-bit early loading robust and
 correct

While working on the microcode loader I stumbled over the 32bit early
loading mechanism which runs before paging enable in physical address mode.

Aside of being a source of code horror and debug trouble, it also turned
out that this is fundamentally broken vs. stackprotector and tracing:

 1) When stackprotector is enabled then the microcode loader code has the
    stackprotector mechanics enabled. The read from the per CPU variable
    __stack_chk_guard is always accessing the virtual address either
    directly on UP or via FS on SMP. In physical address mode this results
    in an access to memory above 3GB. So this works by chance as the
    hardware returns the same value when there is no RAM at this physical
    address. When there is RAM populated above 3G then the read is by
    chance the same as nothing changes that memory during the very early
    boot stage. That's not necessarily true during runtime CPU hotplug.

 2) When function tracing is enabled, then the relevant microcode loader
    functions and the functions invoked from there will call into the
    tracing code and evaluate global and per CPU variables in physical
    address mode. What could potentially go wrong?

While it would be possible to mark all the involved functions to be non
instrumentable and excluded from stack protector, the obvious question
arose why this is necessary at all.

The changelog of the commit which introduced this is completely useless to
answer this question and the cover letter consists mostly of handwaving
word salad without technical content:

  "The problem in current microcode loading method is that we load a
   microcode way, way too late; ideally we should load it before turning
   paging on.  This may only be practical on 32 bits since we can't get to
   64-bit mode without paging on, but we should still do it as early as at
   all possible."

So after asking around, it turned out that this was implemented to handle
the erratum of early ATOM CPUs which affects PSE.

So I sat down and did a proper technical analysis of that erratum and the
resulting requirements. It turns out that the early loading before paging
is enabled is just a pointless exercise due to lack of a thorough analysis.

More details about the reasons why this is a pointless exercise can be
found in patch 1/2.

Patch 2/2 is not microcode related, but the fallout of the analysis. It's
the only other C-function invoked on 32-bit before paging is enabled and it
is obviously affected by the stack protector issue as well and on top lacks
the removal of tracing.

The series applies on top of

    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/microcode

Thanks,

	tglx
---
 Makefile                 |    1 
 cpu/microcode/amd.c      |   31 ++-----------
 cpu/microcode/core.c     |   40 +++--------------
 cpu/microcode/intel.c    |  108 +++++------------------------------------------
 cpu/microcode/internal.h |    2 
 head32.c                 |    6 ++
 head_32.S                |   10 ----
 smpboot.c                |   12 +----
 8 files changed, 38 insertions(+), 172 deletions(-)