Message-ID: <CALCETrWmCrxyYcG308NBDzp1OMX6kG7PyVbpx3NDsuJCPW7D8A@mail.gmail.com>
Date:	Thu, 21 Jul 2016 07:58:16 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	Ingo Molnar <mingo@...nel.org>, Matt Fleming <mfleming@...e.de>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	Mario Limonciello <mario_limonciello@...l.com>,
	Kees Cook <keescook@...omium.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Matthew Garrett <mjg59@...f.ucam.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	X86 ML <x86@...nel.org>, "H. Peter Anvin" <hpa@...or.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH] x86/boot: Reorganize and clean up the BIOS area
 reservation code

On Jul 21, 2016 1:14 AM, "Ingo Molnar" <mingo@...nel.org> wrote:
>
>
> * Andy Lutomirski <luto@...nel.org> wrote:
>
> > Under some conditions, my Dell XPS 13 9350 puts the EBDA at 0x2c000
> > but reports the lowmem cutoff as 0.  The old code reserves
> > everything above 0x2c000 and I can't boot [1].
>
> > [1] This only breaks boot in practice when some other firmware or
> >     GRUB oddity that I don't fully understand kicks in causing the
> >     memory below 0x2c000 to be unusable.
>
> Exactly why can't Linux boot if *more* memory is reserved? Is it perhaps the SMP
> trampoline that cannot be allocated?

Yes, exactly.

>
> Is the boot failure deterministic - if yes, could you try to dig a bit more into
> this?

It's mostly deterministic.  I hit it every time if I use Dell's latest
BIOS (1.4.4), enable SGX in the BIOS (no SGX kernel patches involved),
and boot using Fedora's grub2-efi on the hard disk.  I don't hit it
when booting from a USB stick or when I boot using the EFI stub via
the EFI shell.  Using the EFI shell causes 0x1000-0x27fff to be
conventional memory instead of boot data -- see below.

Here's my memory map:

[    0.000000] efi: mem00: [Runtime Data       |RUN|  |  |  |  |  |  |   |WB|WT|WC|UC] range=[0x0000000000000000-0x0000000000000fff] (0MB)
[    0.000000] efi: mem01: [Boot Data          |   |  |  |  |  |  |  |   |WB|WT|WC|UC] range=[0x0000000000001000-0x0000000000027fff] (0MB)
[    0.000000] efi: mem02: [Loader Data        |   |  |  |  |  |  |  |   |WB|WT|WC|UC] range=[0x0000000000028000-0x0000000000029fff] (0MB)
[    0.000000] efi: mem03: [Reserved           |   |  |  |  |  |  |  |   |WB|WT|WC|UC] range=[0x000000000002a000-0x000000000002bfff] (0MB)
[    0.000000] efi: mem04: [Runtime Data       |RUN|  |  |  |  |  |  |   |WB|WT|WC|UC] range=[0x000000000002c000-0x000000000002cfff] (0MB)
[    0.000000] efi: mem05: [Loader Data        |   |  |  |  |  |  |  |   |WB|WT|WC|UC] range=[0x000000000002d000-0x000000000002dfff] (0MB)
[    0.000000] efi: mem06: [Conventional Memory|   |  |  |  |  |  |  |   |WB|WT|WC|UC] range=[0x000000000002e000-0x0000000000057fff] (0MB)
[    0.000000] efi: mem07: [Reserved           |   |  |  |  |  |  |  |   |WB|WT|WC|UC] range=[0x0000000000058000-0x0000000000058fff] (0MB)
[    0.000000] efi: mem08: [Conventional Memory|   |  |  |  |  |  |  |   |WB|WT|WC|UC] range=[0x0000000000059000-0x000000000009ffff] (0MB)
[

The EFI quirk that reserves boot data kills 0x1000-0x27fff.  The EBDA
reservation code kills everything from 0x2c000 up, leaving no usable
memory below 1MB at all.
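
To make the arithmetic concrete, here is a minimal Python sketch that replays the two reservations against the map above. It is a toy model, not kernel code: `free_pages`, `CANDIDATE_TYPES`, and the decision to treat Loader Data as already in use are all assumptions made for illustration.

```python
PAGE = 0x1000

# (type, start, end_inclusive), copied from the EFI memory map above.
EFI_MAP = [
    ("Runtime Data", 0x00000, 0x00fff),
    ("Boot Data", 0x01000, 0x27fff),
    ("Loader Data", 0x28000, 0x29fff),
    ("Reserved", 0x2a000, 0x2bfff),
    ("Runtime Data", 0x2c000, 0x2cfff),
    ("Loader Data", 0x2d000, 0x2dfff),
    ("Conventional Memory", 0x2e000, 0x57fff),
    ("Reserved", 0x58000, 0x58fff),
    ("Conventional Memory", 0x59000, 0x9ffff),
]

# Assumption: only these types are candidates for a low-memory allocation.
# Loader Data is treated as in use by the boot loader.
CANDIDATE_TYPES = {"Boot Data", "Conventional Memory"}


def free_pages(efi_map, reserve_boot_data, ebda_start):
    """Return page addresses below 1MB that survive both reservations.

    reserve_boot_data: model the quirk that keeps boot data reserved.
    ebda_start: reserve everything at or above this address (None = no
                EBDA reservation).
    """
    pages = []
    for mem_type, start, end in efi_map:
        if mem_type not in CANDIDATE_TYPES:
            continue
        if reserve_boot_data and mem_type == "Boot Data":
            continue
        for addr in range(start, end + 1, PAGE):
            if ebda_start is not None and addr >= ebda_start:
                continue
            pages.append(addr)
    return pages


# Failing configuration: boot data quirk plus EBDA at 0x2c000.
print(len(free_pages(EFI_MAP, True, 0x2c000)))
# Same map if boot data were usable (roughly the EFI shell case).
print(len(free_pages(EFI_MAP, False, 0x2c000)))
```

Under these assumptions the failing configuration leaves zero candidate pages, while making 0x1000-0x27fff usable leaves 39 pages below the EBDA.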

>
> My guess is it's the SMP trampoline, and I think we should robustify that in a
> different way: let's put it aside very early as a reservation (possibly in this
> very function), to guarantee that we have a below 1MB buffer for the SMP
> trampoline. This would be a lot more robust ...
>

If we really want to robustify that, I would suggest that we change
the way that the trampoline works.  In particular, I don't see any
reason why we need to call setup_real_mode until we're actually ready
to initialize APs, and we should be done with the boot services data
quirk by then (am I right, Matt?).  So if we can get the allocation
code right, we shouldn't have any problem putting the trampoline in
the boot services range.

It would be very easy to implement this if we could handle overlapping
memblocks precisely or set a lower limit on the memblock allocator.
Then we could block off everything below 1MB or 2MB very early and
then unblock it or temporarily change the lower limit and ask for a
single page for the trampoline after that.
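
The "lower limit" variant of this idea can be sketched with a toy allocator in Python. This is an illustration only: `ToyAllocator`, `alloc_page`, and `alloc_trampoline` are hypothetical names, not the memblock API, and the free range above 1MB is made up for the example.

```python
PAGE = 0x1000
ONE_MB = 0x100000


class ToyAllocator:
    """Toy bottom-up page allocator with an adjustable lower limit."""

    def __init__(self, free_ranges, lower_limit=0):
        self.free = sorted(free_ranges)  # list of (start, end_exclusive)
        self.lower_limit = lower_limit
        self.allocated = set()

    def alloc_page(self, upper_limit=None):
        """Return the lowest free page at or above lower_limit, or None."""
        for start, end in self.free:
            if upper_limit is not None:
                end = min(end, upper_limit)
            addr = max(start, self.lower_limit)
            addr = (addr + PAGE - 1) & ~(PAGE - 1)  # round up to a page
            while addr + PAGE <= end:
                if addr not in self.allocated:
                    self.allocated.add(addr)
                    return addr
                addr += PAGE
        return None


def alloc_trampoline(allocator):
    """Temporarily drop the lower limit and take one page below 1MB."""
    saved = allocator.lower_limit
    allocator.lower_limit = 0
    try:
        return allocator.alloc_page(upper_limit=ONE_MB)
    finally:
        allocator.lower_limit = saved


# 0x1000-0x27fff is the boot data range from the map; the range above
# 1MB is a hypothetical stand-in for ordinary RAM.
allocator = ToyAllocator(
    [(0x1000, 0x28000), (0x100000, 0x200000)], lower_limit=ONE_MB
)
early = allocator.alloc_page()            # ordinary early allocation
trampoline = alloc_trampoline(allocator)  # one page from below 1MB
print(hex(early), hex(trampoline))
```

In this model, ordinary early allocations never land below 1MB because of the limit, so the low range stays untouched until the trampoline page is requested.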
