Message-ID: <20190910061815.GA40059@gmail.com>
Date: Tue, 10 Sep 2019 08:18:15 +0200
From: Ingo Molnar <mingo@...nel.org>
To: "Kirill A. Shutemov" <kirill@...temov.name>
Cc: Steve Wahl <steve.wahl@....com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
	"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
	Juergen Gross <jgross@...e.com>,
	Brijesh Singh <brijesh.singh@....com>,
	Jordan Borgner <mail@...dan-borgner.de>,
	Feng Tang <feng.tang@...el.com>, linux-kernel@...r.kernel.org,
	Baoquan He <bhe@...hat.com>, russ.anderson@....com,
	dimitri.sivanich@....com, mike.travis@....com
Subject: Re: [PATCH] x86/boot/64: Make level2_kernel_pgt pages invalid
	outside kernel area.

* Kirill A. Shutemov <kirill@...temov.name> wrote:

> On Fri, Sep 06, 2019 at 04:29:50PM -0500, Steve Wahl wrote:
> > Our hardware (UV aka Superdome Flex) has address ranges marked
> > reserved by the BIOS. These ranges can cause the system to halt if
> > accessed.
> >
> > During kernel initialization, the processor was speculating into
> > reserved memory causing system halts. The processor speculation is
> > enabled because the reserved memory is being mapped by the kernel.
> >
> > The page table level2_kernel_pgt is 1 GiB in size, and had all pages
> > initially marked as valid, and the kernel is placed anywhere in this
> > range depending on the virtual address selected by KASLR. Later on in
> > the boot process, the valid area gets trimmed back to the space
> > occupied by the kernel.
> >
> > But during the interval of time when the full 1 GiB space was marked
> > as valid, if the kernel physical address chosen by KASLR was close
> > enough to our reserved memory regions, the valid pages outside the
> > actual kernel space were allowing the processor to issue speculative
> > accesses to the reserved space, causing the system to halt.
> >
> > This was encountered somewhat rarely on a normal system boot, and
> > somewhat more often when starting the crash kernel if
> > "crashkernel=512M,high" was specified on the command line (because
> > this heavily restricts the physical address of the crash kernel,
> > usually to within 1 GiB of our reserved space).
> >
> > The answer is to invalidate the pages of this table outside the
> > address range occupied by the kernel before the page table is
> > activated. This patch has been validated to fix this problem on our
> > hardware.
>
> If the goal is to avoid *any* mapping of the reserved region to stop
> speculation, I don't think this patch will do the job. We still (likely)
> have the same memory mapped as part of the identity mapping. And it
> happens at least in two places: here and before on decompression stage.

Yeah, this really needs a fix at the KASLR level: it should only ever map
into regions that are fully RAM backed.

Is the problem that the 1 GiB mapping is a direct mapping, which can be
speculated into? I presume KASLR won't accidentally map the kernel into
the reserved region, right?

Thanks,
Ingo
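
For readers following the thread, the invalidation described in the quoted
patch description can be sketched in userspace C. This is only an
illustration under stated assumptions, not the patch itself: it assumes a
level-2 table of 512 entries that each map 2 MiB (1 GiB in total), models
the "valid" bit as a hypothetical PAGE_PRESENT flag in bit 0, and assumes
the kernel image does not cross the 1 GiB window. The helper names
trim_to_kernel() and count_present() are made up for this example, and any
relocation fixup a real boot path would need is left out.

```c
#include <assert.h>
#include <stdint.h>

#define PTRS_PER_PMD 512        /* entries in one level-2 table */
#define PMD_SHIFT    21         /* each entry maps 2 MiB */
#define PAGE_PRESENT 0x1ULL     /* assumed position of the valid bit */

/* Index of the 2 MiB slot that covers a virtual address. */
static unsigned int pmd_index(uint64_t addr)
{
	return (unsigned int)((addr >> PMD_SHIFT) & (PTRS_PER_PMD - 1));
}

/*
 * Clear the present bit on every entry that lies outside the slots
 * covering [text, end), so nothing beyond the kernel image stays mapped.
 */
static void trim_to_kernel(uint64_t *pmd, uint64_t text, uint64_t end)
{
	unsigned int first, last, i;

	assert(text < end);
	first = pmd_index(text);
	last = pmd_index(end - 1);
	assert(first <= last);  /* image must not wrap the 1 GiB window */

	for (i = 0; i < PTRS_PER_PMD; i++)
		if (i < first || i > last)
			pmd[i] &= ~PAGE_PRESENT;
}

/* Number of entries that are still marked present. */
static unsigned int count_present(const uint64_t *pmd)
{
	unsigned int i, n = 0;

	for (i = 0; i < PTRS_PER_PMD; i++)
		if (pmd[i] & PAGE_PRESENT)
			n++;
	return n;
}
```

With a fully populated table and a 30 MiB image placed 16 MiB into the
window, trim_to_kernel() leaves only the entries that overlap the image
marked present. As noted in the reply above, this addresses only this one
table; the same physical memory may still be reachable through other
mappings.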