linux-kernel - Re: [RFC PATCH 1/2] Revert "x86/kexec/64: Prevent kexec from 5-level paging to a 4-level only kernel"


Message-ID: <20240305115533.GBZecINWGlb73W0nQS@fat_crate.local>
Date: Tue, 5 Mar 2024 12:55:33 +0100
From: Borislav Petkov <bp@...en8.de>
To: Baoquan He <bhe@...hat.com>
Cc: X86 ML <x86@...nel.org>, LKML <linux-kernel@...r.kernel.org>,
	dyoung@...hat.com
Subject: Re: [RFC PATCH 1/2] Revert "x86/kexec/64: Prevent kexec from 5-level
 paging to a 4-level only kernel"

On Tue, Mar 05, 2024 at 11:43:01AM +0800, Baoquan He wrote:
> Guess you mean upstream kernel doesn't care about 'customers'. Downstream
> kernel does care about customers.

You know very well what I mean. You're at Red Hat - I was at SUSE for
a decade. You know exactly what the distinction is.

> Hmm, there's different view between upstream and downstream. For distros
> kernel, we need a lot of testing to make sure one kernel is trustworthy
> as kdump kernel. Here, 'a lot of testing' means a long list of user cases
> for kexec/kdump. Please see below file from centos kexec-tools package:
> 
> https://git.centos.org/rpms/kexec-tools/blob/bb7919506eba39a2b7277c8d36fe1774f9c33428/f/SOURCES/supported-kdump-targets.txt
> 
> And the kdump kernel doesn't have to be the same kernel as the 1st kernel.

This "example" basically proves my point. None of those dump targets
talk about architecture support - this is all drivers.

> I can give several examples:
> 
> 1) Nvidia GPU or AMD GPU doesn't work well when kexec/kdump jumping to
> 2nd kernel in some releases. When we meet that case, we want to use the
> newer kernel as 1st kernel. we also want to deploy kdump kernel to
> capture the vmcore for analyzing once corruption encountered. Then the
> old kernel which have been tested and prove to be working well can be
> configured as 2nd kernel.

Same as above - nothing to do with architecture support. Both kernels
can and will have 5level support because you won't do two kernel images:
one with and one without 5level.

> E.g kdump kernel is too old, or like this 5-level case, jumping from
> 5-level to 4-level will fail.

5level support is present upstream since when?

$ git describe 6fb895692a034
v4.11-rc1-97-g6fb895692a03

There's no sensible kdump use case where you jump between 4.12 *and*
6.10, depending on when we revert this.

> No, it's not true. Kexec-tools doesn't check,

No, it is true. kexec-tools does *NOT* use those flags. Vs

"The flags will be used by the kernel kexec subsystem and the userspace
 kexec tools."

from f2d08c5d3bcf3f7ef788af122b57a919efa1e9d0.

> If we take off the checking, and people want to jump from the new kernel
> to an old kernel where 5-level kernel code haven't been added or
> CONFIG_X86_5LEVEL is unset on purpose, it won't fail and prompt message at
> all until 2nd kernel booting silently failed. E.g, the coming RHEL10 anchor
> a upstream kernel w/o the flag checking, people want to kexec/kdump jump
> from rhel10 to an old rhel7 kernel. It could be an extreme case, while
> revealing the scenario.

That is the only valid reason you've given until now. Yes, that makes
sense - the removal of those flags should go together with the removal
of CONFIG_X86_5LEVEL and making this feature unconditional.

Because, practically, that config item is enabled on every relevant
x86 kernel config out there. It would be silly if it weren't.

/me puts on TODO.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette