lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 1 Dec 2021 10:29:56 +0000
From:   Usama Arif <usama.arif@...edance.com>
To:     "Eric W. Biederman" <ebiederm@...ssion.com>
Cc:     Vivek Goyal <vgoyal@...hat.com>, Borislav Petkov <bp@...en8.de>,
        x86@...nel.org, linux-kernel@...r.kernel.org, tglx@...utronix.de,
        fam.zheng@...edance.com, Baoquan He <bhe@...hat.com>,
        Dave Young <dyoung@...hat.com>
Subject: Re: [External] Re: [PATCH] x86/purgatory: provide config to disable
 purgatory

Hi,

(Resending the reply as my email client had updated and inserted html 
code and caused a bounceback from the mailing list, sorry about that.)

Thanks for your reply, I have responded with further comments/questions 
inline below, and also have provided some context for the patch at the 
start:

The patch is not introducing a new CONFIG option, as can be seen in the 
v2 patch diff. It is converting an existing CONFIG option that only 
enabled purgatory for specific architectures in which it has been 
implemented (x86, powerpc and s390) to an option that can that allow 
purgatory to be disabled (with default enabled) and only provides code 
to disable purgatory for x86 only. From what i see, purgatory is 
currently not yet implemented in other architectures like arm64 
kexec_file case and riscv, so this would on a high level,
provide only the option to make kexec on x86 similar to other the other 
architectures.

As a background to the discussion, the usecase we are aiming is to 
update the host kernel with kexec in servers which is managing multiple 
VMs, to cut down the downtime of servers as much as possible so that its 
not noticeable to VMs. The patch is aimed at x86 purgatory code 
specifically. We are targetting other optimizations as well in other 
areas in boot path to cut down the 600ms time much further, so that cut 
down of 200ms downtime is significant in the usecase described. I did 
have a few comments/questions about your reply, please see my responses 
below:


On 29/11/2021 16:53, Eric W. Biederman wrote:
> Usama Arif <usama.arif@...edance.com> writes:
> 
>> Hi,
>>
>> Thanks for your replies. I have submitted a v2 of the patch with a
>> much more detailed commit message including reason for the patch and timing values.
>>
>> The time taken from reboot to running init process was measured
>> with both purgatory enabled and disabled over 20 runs and the
>> averages are:
>> Purgatory disabled:
>> - TSC = 3908766161 cycles
>> - ktime = 606.8 ms
>> Purgatory enabled:
>> - TSC = 5005811885 cycles (28.1% worse)
>> - ktime = 843.1 ms (38.9% worse)
>>
>>
>> Our reason for this patch is that it helps reduce the downtime of servers when
>> the host kernel managing multiple VMs needs to be updated via kexec,
>> but it makes reboot with kexec much faster so should be a general improvement in
>> boot time if purgatory is not needed and could have other usecases as well.
>> I believe only x86, powerpc and s390 have purgatory supported, other platforms
>> like arm64 dont have it implemented yet, so with the reboot time improvement seen,
>> it would be a good idea to have the option to disable purgatory completely but set default to y.
>> We also have the CONFIG_KEXEC_BZIMAGE_VERIFY_SIG which can be enabled to verify the next
>> kernel image to be booted and purgatory can be completely skipped if
>> not required.
> 
> CONFIG_KEXEC_BZIMAGE_VERIFY_SIG is something totally and completely
> different.  It's job is to verify that the kernel to be booted comes
> from a trusted source.   The sha256 verification in purgatory's job
> is to verify that memory the kernel cares about was not corrupted
> during the kexec process.
> 

Thanks, acknowledged.

> I believe when you say purgatory you are really talking about that
> sha256 checksum.  It really is not possible to disable all of
> the code that runs between kernels, as the old and the new kernel may
> run at the same addresses.  Anything that runs between the two kernels
> is what is referred to as purgatory.  Even if it is just a small
> assembly stub.
> 
 >

With this patch, i am trying to give an option (with default purgatory 
enabled)to disable purgatory completely on x86 only, i.e. no code 
running between 2 kernels on x86. From my understanding there is no 
purgatory in arm64 kexec_file case, in riscv and in some other archs as 
well, so I am not sure why its not possible to disable purgatory (all 
code running between 2 kernels) in x86? Unless i misunderstood something 
about the working of other platforms? In x86 case, from what i see 
relocate_kernel is still part of the older kernel and not purgatory, and 
with my patch, purgatory.ro is not built and kexec does execute 
successfully with purgatory disabled.

About your point for the old and the new kernel may run at the same 
addresses,i dont think that can happen as in bzImage64_load function its 
loaded using the kbuf struct. This doesnt happen in other architectures 
(for e.g. arm64) that dont implement purgatory and any of the tests I 
conducted with the patch applied due to the use of kbuf struct.

 From what i see in the code, purgatory in x86 specifically, has 2 main 
functions, sha256 verification and loading the %rsi register for 
bootparam_load_addr. In this patch purgatory was disabled, which 
resulted in sha256 verification disabled and bootparam_load_addr moved 
to relocate_kernel. Our analysis revealed that most of the time is spent 
in the sha256 verification, so I would be ok to reformat the patch
to make the sha verification optional and keep purgatory enabled if 
thats preferred? Although in my opinion the current patch of only 
providing an option to disable purgatory seems much better.


> That sha256 verification is always needed for kexec on panic, there are
> by the nature of a kernel panic too many unknowns to have any confidence
> the new kernel will not be corrupted in the process of kexec before it
> gets started.
> 
> For an ordinary kexec it might be possible to say that you have a
> reliable kernel shutdown process and you know for a fact that something
> won't come along and corrupt the kernel.  I find that a questionable
> assertion.  I haven't seen anyone yet whose focus when getting an
> ordinary kexec to work as anything other than making certain all of the
> drivers are shutdown properly.
> 
> I have seen countless times when a network packet comes in a the wrong
> time and the target kernel's memory is corrupted before it gets far
> enough to initialize the network driver. >

I agree that when doing testing and research, there are things that can
go wrong during kexec process, but i assume that the expectation is that
in production environment, when for e.g. updating a kernel with kexec in 
servers,the expectation is that kexec will execute successfully as long 
as there is nothing wrong in the old kernel and the new kernel.

Maybe I didn't understand this correctly, but if a network packet comes 
in and corrupts the kernel memory, wont it also cause a problem even 
when purgatory is enabled? I agree that issue like these would be caught 
earlier with purgatory, but then if something like this breaks the kexec 
process in a production environment where the old and new kernel are 
well tested, trusted and expected to work, wouldn't there be a much more 
fundamental issue with the reliability of the current kexec process.

> For a 0.2s speed up you are talking about disabling all of the safety
> checks in a very dangerous situation.  How much can you can in
> performance by optimizing the sha256 implementation instead of using
> what is essentially a reference implementation in basic C that I copied
> from somewhere long ago.
> 
> Optimize the sha256 implementation and the memory copy loop and then
> show how the tiny bit of time that is left is on a mission critical path
> and must be removed.  Then we can reasonably talk about a config option
> for disabling the sha256 implementation in the kexec in not-panic case.
> 
Thanks for this, I assume that over here you are suggesting for e.g. in 
x86 to replace the existing sha verification implementation in purgatory 
with the one in arch/x86/crypto/sha256_ssse3_glue.c? Also could you 
point to the memory copy loop to optimize? I can have a look at these 2 
options.

Even with these optimizations, i think the v2 patch should still be 
considered. The patch in the end is providing only an option to disable 
purgatory on x86 only, with the default value of keeping it enabled and 
not changing kexec behaviour. The CONFIG already existed, it is just 
renamed and now provides user the option to select to disable purgatory 
for x86, rather than it being architecture dependent.

Thanks again for the review and comments,
Usama


> That sha256 implementation in part so that we can all sleep at night
> because we don't have to deal with very very strange heizenbugs.
> 
> Eric
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ