Date:   Wed, 3 May 2023 19:44:51 +0200
From:   Ard Biesheuvel <ardb@...nel.org>
To:     Tom Lendacky <thomas.lendacky@....com>
Cc:     linux-efi@...r.kernel.org, linux-kernel@...r.kernel.org,
        Evgeniy Baskov <baskov@...ras.ru>,
        Borislav Petkov <bp@...en8.de>,
        Andy Lutomirski <luto@...nel.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Alexey Khoroshilov <khoroshilov@...ras.ru>,
        Peter Jones <pjones@...hat.com>,
        Gerd Hoffmann <kraxel@...hat.com>,
        Dave Young <dyoung@...hat.com>,
        Mario Limonciello <mario.limonciello@....com>,
        Kees Cook <keescook@...omium.org>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH 0/6] efi/x86: Avoid legacy decompressor during EFI boot

On Tue, 2 May 2023 at 18:08, Tom Lendacky <thomas.lendacky@....com> wrote:
>
> On 5/2/23 08:39, Ard Biesheuvel wrote:
> > On Tue, 2 May 2023 at 15:37, Tom Lendacky <thomas.lendacky@....com> wrote:
> >>
> >> On 4/24/23 11:57, Ard Biesheuvel wrote:
> >>> This series is conceptually a combination of Evgeny's series [0] and
> >>> mine [1], both of which attempt to make the early decompressor code more
> >>> amenable to executing in the EFI environment with stricter handling of
> >>> memory permissions.
> >>>
> >>> My series [1] implemented zboot for x86, by getting rid of the entire
> >>> x86 decompressor, and replacing it with existing EFI code that does the
> >>> same but in a generic way. The downside of this is that only EFI boot is
> >>> supported, making it unviable for distros, which need to support BIOS
> >>> boot and hybrid EFI boot modes that omit the EFI stub.
> >>>
> >>> Evgeny's series [0] adapted the entire decompressor code flow to allow
> >>> it to execute in the EFI context as well as the bare metal context, and
> >>> this involves changes to the 1:1 mapping code and the page fault
> >>> handlers etc, none of which are really needed when doing EFI boot in the
> >>> first place.
> >>>
> >>> So this series attempts to occupy the middle ground here: it makes
> >>> minimal changes to the existing decompressor so some of it can be called
> >>> from the EFI stub. Then, it reimplements the EFI boot flow to decompress
> >>> the kernel and boot it directly, without relying on the trampoline code,
> >>> page table code or page fault handling code. This allows us to get rid
> >>> of quite a bit of unsavory EFI stub code, and replace it with two clear
> >>> invocations of the EFI firmware APIs to clear NX restrictions from
> >>> allocations that have been populated with executable code.
> >>>
> >>> The only code that is being reused is the decompression library itself,
> >>> along with the minimal ELF parsing that is required to copy the ELF
> >>> segments in place, and the relocation processing that fixes up absolute
> >>> symbol references to refer to the correct virtual addresses.
> >>>
> >>> Note that some of Evgeny's changes to clean up the PE/COFF header
> >>> generation will still be needed, but I've omitted those here for
> >>> brevity.
> >>
> >> I tried booting an SEV and an SEV-ES guest using this and both failed to boot:
> >>
> >> EFI stub: WARNING: Decompression failed: Out of memory while allocating
> >> z_stream
> >>
> >> I'll have to take a closer look as to why, but it might be a couple of
> >> days before I can get to it.
> >>
> >
> > Thanks Tom.
> >
> > The internal malloc() seems to be failing, which is often caused by
> > BSS clearing problems. Could you elaborate a little bit on the boot
> > environment you are using here?
>
> I'm using Qemu v7.2.1 as my VMM, Linux 6.3 with your series applied for my
> host/hypervisor and guest kernel and the current OVMF tree built using
> OvmfPkgX64.dsc.
>
> I was originally using the current merge window Linux, but moved to the
> release version just to . With the release version SEV and SEV-ES still fail to
> boot, but SEV actually #GPs now. And some of the register contents look
> like encrypted data:
>
> ConvertPages: range 1000000 - 4FA1FFF covers multiple entries
> !!!! X64 Exception Type - 0D(#GP - General Protection)  CPU Apic ID - 00000000 !!!!
> ExceptionData - 0000000000000000
> RIP  - 00000000597E71C1, CS  - 0000000000000038, RFLAGS - 0000000000210206
> RAX  - 1FBA02A45943B920, RCX - 0000000000AF7009, RDX - A9DAE761B64A1F1B
> RBX  - 1FBA02A45943B8C0, RSP - 000000007FD97320, RBP - 0000000002000000
> RSI  - 0000000001000000, RDI - 1FBA02A45943DE68
> R8   - 0000000003EF3C94, R9  - 0000000000000000, R10 - 000000007D7C6018
> R11  - 0000000000000000, R12 - 0000000001000000, R13 - 00000000597EDD98
> R14  - 0000000001000000, R15 - 000000007E0A5198
> DS   - 0000000000000030, ES  - 0000000000000030, FS  - 0000000000000030
> GS   - 0000000000000030, SS  - 0000000000000030
> CR0  - 0000000080010033, CR2 - 0000000000000000, CR3 - 000000007FA01000
> CR4  - 0000000000000668, CR8 - 0000000000000000
> DR0  - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
> DR3  - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
> GDTR - 000000007F7DC000 0000000000000047, LDTR - 0000000000000000
> IDTR - 000000007F34C018 0000000000000FFF,   TR - 0000000000000000
> FXSAVE_STATE - 000000007FD96F80
> !!!! Find image based on IP(0x597E71C1) /root/kernels/ovmf-build-X64/Build/OvmfX64/DEBUG_GCC5/X64/MdeModulePkg/Universal/Variable/RuntimeDxe/VariableRuntimeDxe/DEBUG/VariableRuntimeDxe.dll (ImageBase=0000000000D4792C, EntryPoint=0000000000D50CC3) !!!!
>
> So, yes, probably an area of memory that was zeroes when mapped
> unencrypted, but wasn't cleared after changing the mapping to
> encrypted.
>

Thanks.
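
As a rough illustration of the BSS remark quoted above (a simplified
sketch, not the kernel's actual malloc(); struct bump_heap and
bump_alloc() are hypothetical names): a bump allocator that keeps its
cursor in .bss treats zero as "not yet initialised". If that memory
holds stale or wrongly decrypted bytes instead of zeroes, the cursor
starts far outside the heap window and the very first allocation
fails, which would surface as an out-of-memory error like the one
reported for z_stream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of allocator state that would normally live in .bss. */
struct bump_heap {
	uintptr_t start;	/* first usable byte of the heap window */
	uintptr_t end;		/* one past the last usable byte */
	uintptr_t next;		/* allocation cursor; 0 means "not yet initialised" */
};

/* Returns a 4-byte aligned chunk from the window, or NULL if it does not fit. */
static void *bump_alloc(struct bump_heap *h, size_t size)
{
	uintptr_t p;

	assert(h->start <= h->end);

	/* Lazy initialisation: only correct if .bss really was zeroed. */
	if (!h->next)
		h->next = h->start;

	/* Round the cursor up to a 4-byte boundary. */
	h->next = (h->next + 3) & ~(uintptr_t)3;
	p = h->next;

	/* A garbage cursor lands outside the window, so the request fails. */
	if (p < h->start || p > h->end || size > h->end - p)
		return NULL;

	h->next = p + size;
	return (void *)p;
}
```

With next == 0 the first request succeeds; with any non-zero garbage
value outside the window it returns NULL straight away.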

It seems I was a bit naive and underestimated the amount of SEV
related processing that goes on in the decompressor after the EFI stub
has handed over. I will have to take some time and go through this,
and decide whether there is a way we can share this code with the EFI
stub without introducing yet another permutation that requires testing
and maintenance.

Any suggestions on how to test this stuff are appreciated - does QEMU
emulate any of this? Does consumer-level AMD hardware implement the
pieces I'd need to run a SEV host with SNP support etc?
