linux-kernel - Re: [tip: x86/urgent] x86/setup: Always reserve the first 1M of RAM

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <50938d7d-0a09-4a5b-8593-39bfe55c2dc0@app.fastmail.com>
Date:   Thu, 02 Mar 2023 08:59:59 -0800
From:   "Andy Lutomirski" <luto@...nel.org>
To:     "Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>
Cc:     "Mike Rapoport" <rppt@...ux.ibm.com>,
        "Borislav Petkov" <bp@...e.de>, "Hugh Dickins" <hughd@...gle.com>,
        "the arch/x86 maintainers" <x86@...nel.org>
Subject: Re: [tip: x86/urgent] x86/setup: Always reserve the first 1M of RAM



On Wed, Mar 1, 2023, at 7:51 PM, Andy Lutomirski wrote:
> On Thu, Jun 3, 2021, at 11:01 AM, tip-bot2 for Mike Rapoport wrote:
>> The following commit has been merged into the x86/urgent branch of tip:
>>
>> Commit-ID:     f1d4d47c5851b348b7713007e152bc68b94d728b
>> Gitweb:        
>> https://git.kernel.org/tip/f1d4d47c5851b348b7713007e152bc68b94d728b
>> Author:        Mike Rapoport <rppt@...ux.ibm.com>
>> AuthorDate:    Tue, 01 Jun 2021 10:53:52 +03:00
>> Committer:     Borislav Petkov <bp@...e.de>
>> CommitterDate: Thu, 03 Jun 2021 19:57:55 +02:00
>>
>> x86/setup: Always reserve the first 1M of RAM
>>
>> There are BIOSes that are known to corrupt the memory under 1M, or more
>> precisely under 640K because the memory above 640K is anyway reserved
>> for the EGA/VGA frame buffer and BIOS.
>>
>> To prevent usage of the memory that will be potentially clobbered by the
>> kernel, the beginning of the memory is always reserved. The exact size
>> of the reserved area is determined by CONFIG_X86_RESERVE_LOW build time
>> and the "reservelow=" command line option. The reserved range may be
>> from 4K to 640K with the default of 64K. There are also configurations
>> that reserve the entire 1M range, like machines with SandyBridge graphic
>> devices or systems that enable crash kernel.
>>
>> In addition to the potentially clobbered memory, EBDA of unknown size may
>> be as low as 128K and the memory above that EBDA start is also reserved
>> early.
>>
>> It would have been possible to reserve the entire range under 1M unless for
>> the real mode trampoline that must reside in that area.
>>
>> To accommodate placement of the real mode trampoline and keep the memory
>> safe from being clobbered by BIOS, reserve the first 64K of RAM before
>> memory allocations are possible and then, after the real mode trampoline
>> is allocated, reserve the entire range from 0 to 1M.
>>
>> Update trim_snb_memory() and reserve_real_mode() to avoid redundant
>> reservations of the same memory range.
>>
>> Also make sure the memory under 1M is not getting freed by
>> efi_free_boot_services().
>
> This is quite broken.  The comments in the patch seem to understand 
> that Linux tries twice to allocate the real mode trampoline, but the 
> code has some issues.
>
>
> First, it actively breaks the logic here:
>
>
> +               /*
> +                * Don't free memory under 1M for two reasons:
> +                * - BIOS might clobber it
> +                * - Crash kernel needs it to be reserved
> +                */
> +               if (start + size < SZ_1M)
> +                       continue;
> +               if (start < SZ_1M) {
> +                       size -= (SZ_1M - start);
> +                       start = SZ_1M;
> +               }
> +
>
>
> The whole point is that, if we fail to allocate a trampoline, we free 
> boot services and try again.  But if we can't free boot services below 
> 1M, then we can't allocate a trampoline in boot services memory.  And 
> then it does:
>
>
> +       /*
> +        * Unconditionally reserve the entire fisrt 1M, see comment in
> +        * setup_arch().
> +        */
> +       memblock_reserve(0, SZ_1M);
>

My apologies, I misread this thing.  The patch is *not* obviously buggy, but something is buggy.  I'll keep investigating...

--Andy