Message-ID: <CAMj1kXG3jiLahONhPkKD0VSngDnMQoUCkDmoCsWEzOHDZmhTiA@mail.gmail.com>
Date: Mon, 6 May 2024 12:38:32 +0200
From: Ard Biesheuvel <ardb@...nel.org>
To: Mike Rapoport <rppt@...nel.org>
Cc: Kees Cook <keescook@...omium.org>, Steven Rostedt <rostedt@...dmis.org>,
linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
Masami Hiramatsu <mhiramat@...nel.org>, Mark Rutland <mark.rutland@....com>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, Andrew Morton <akpm@...ux-foundation.org>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>, Vlastimil Babka <vbabka@...e.cz>,
Lorenzo Stoakes <lstoakes@...il.com>, linux-mm@...ck.org,
Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H. Peter Anvin" <hpa@...or.com>, Peter Zijlstra <peterz@...radead.org>, Tony Luck <tony.luck@...el.com>,
"Guilherme G. Piccoli" <gpiccoli@...lia.com>, linux-hardening@...r.kernel.org,
Guenter Roeck <linux@...ck-us.net>, Ross Zwisler <zwisler@...gle.com>, wklin@...gle.com,
Vineeth Remanan Pillai <vineeth@...byteword.org>, Joel Fernandes <joel@...lfernandes.org>,
Suleiman Souhlal <suleiman@...gle.com>, Linus Torvalds <torvalds@...uxfoundation.org>,
Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>
Subject: Re: [POC][RFC][PATCH 1/2] mm/x86: Add wildcard * option as memmap=nn*align:name

On Wed, 1 May 2024 at 16:59, Mike Rapoport <rppt@...nel.org> wrote:
>
> On Mon, Apr 15, 2024 at 10:22:53AM -0700, Kees Cook wrote:
> > On Fri, Apr 12, 2024 at 06:19:40PM -0400, Steven Rostedt wrote:
> > > On Fri, 12 Apr 2024 23:59:07 +0300
> > > Mike Rapoport <rppt@...nel.org> wrote:
> > >
> > > > On Tue, Apr 09, 2024 at 04:41:24PM -0700, Kees Cook wrote:
> > > > > On Tue, Apr 09, 2024 at 07:11:56PM -0400, Steven Rostedt wrote:
> > > > > > On Tue, 9 Apr 2024 15:23:07 -0700
> > > > > > Kees Cook <keescook@...omium.org> wrote:
> > > > > >
> > > > > > > Do we need to involve e820 at all? I think it might be possible to just
> > > > > > > have pstore call request_mem_region() very early? Or does KASLR make
> > > > > > > that unstable?
> > > > > >
> > > > > > Yeah, would that give the same physical memory each boot, and can we
> > > > > > guarantee that KASLR will not map the kernel over the previous location?
> > > > >
> > > > > Hm, no, for physical memory it needs to get excluded very early, which
> > > > > means e820.
> > > >
> > > > Whatever memory is reserved in arch/x86/kernel/e820.c, that happens after
> > > > kaslr, so to begin with, a new memmap parameter should be also added to
> > > > parse_memmap in arch/x86/boot/compressed/kaslr.c to ensure the same
> > > > physical address will be available after KASLR.
> > >
> > > But doesn't KASLR only affect virtual memory not physical memory?
> >
> > KASLR for x86 (and other archs, like arm64) does both physical and
> > virtual base randomization.
> >
> > > This just makes sure the physical memory it finds will not be used by the
> > > system. Then ramoops does the mapping via vmap() I believe, to get a
> > > virtual address to access the physical address.
> >
> > I was assuming, since you were in the e820 code, that it was
> > manipulating that before KASLR chose a location. But if not, yeah, Mike
> > is right -- you need to make sure this is getting done before
> > decompress_kernel().
>
> Right now kaslr can handle up to 4 memmap regions and parse_memmap() in
> arch/x86/boot/compressed/kaslr.c should be updated for a new memmap type.
>
> But I think it's better to add a new kernel parameter as I suggested in
> another email and teach mem_avoid_memmap() in kaslr.c to deal with it, as
> well as with crashkernel=size@...set, btw.
>
The logic in arch/x86/boot/compressed/kaslr.c is now only used by
non-EFI boot.

In general, I am highly skeptical that hopes and prayers are enough to
prevent the firmware from stepping on such a region, unless this is
only a best-effort thing and failures are acceptable. For instance,
booting an EFI system with or without an external display attached, or
with a USB device inserted (without even using it during boot), will
impact the memory map, to the extent that the E820 table derived from
it may look different. (EFI tries to keep the runtime regions in the
same place, but the boot-time regions are allocated and freed on demand.)

So I would strongly urge addressing this properly and working with
firmware folks to define some kind of protocol for this.
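
To illustrate the concern, here is a minimal userspace sketch. It assumes
a simple first-fit search over usable ranges, which is not necessarily
what the patch does. The struct, the function name and both memory maps
are made up for illustration; none of this is kernel code. The point is
only that the same size/alignment request can resolve to a different
physical address when the firmware-provided map changes between boots.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a memory map entry (not the kernel's struct). */
struct mem_range {
	uint64_t start;
	uint64_t size;
	int usable;	/* 1 = usable RAM, 0 = taken by firmware */
};

/*
 * Return the first 'align'-aligned address inside a usable range that can
 * hold 'size' bytes, or 0 if nothing fits. 'align' must be a power of two.
 */
static uint64_t find_first_fit(const struct mem_range *map, size_t n,
			       uint64_t size, uint64_t align)
{
	for (size_t i = 0; i < n; i++) {
		uint64_t base, end;

		if (!map[i].usable)
			continue;

		base = (map[i].start + align - 1) & ~(align - 1);
		end = map[i].start + map[i].size;
		if (base < end && end - base >= size)
			return base;
	}
	return 0;
}

/* Made-up map for one boot: 1 MiB .. 1 GiB is usable. */
static const struct mem_range boot_a[] = {
	{ 0x00100000, 0x3ff00000, 1 },
};

/* Made-up map for another boot: firmware claimed 1 MiB .. 2 MiB this time. */
static const struct mem_range boot_b[] = {
	{ 0x00100000, 0x00100000, 0 },
	{ 0x00200000, 0x3fe00000, 1 },
};
```

With these invented maps, a 12 MiB request with 4 KiB alignment lands at
0x00100000 for boot_a and at 0x00200000 for boot_b, so whatever was
stored in the region during the previous boot would not be found again.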