lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <33d9b711-288c-e5f4-0cc8-365c01861da5@bytedance.com>
Date:   Mon, 6 Dec 2021 16:51:09 +0000
From:   Usama Arif <usama.arif@...edance.com>
To:     "Eric W. Biederman" <ebiederm@...ssion.com>,
        Vivek Goyal <vgoyal@...hat.com>
Cc:     Borislav Petkov <bp@...en8.de>, x86@...nel.org,
        linux-kernel@...r.kernel.org, tglx@...utronix.de,
        fam.zheng@...edance.com, Baoquan He <bhe@...hat.com>,
        Dave Young <dyoung@...hat.com>
Subject: Re: [External] Re: [PATCH] x86/purgatory: provide config to disable
 purgatory

Hi,

I have sent a v3 of the patch with a much clearer commit message, please
let me know if there are any comments for the v3 patch or my responses 
below.

Thanks,
Usama

On 01/12/2021 10:29, Usama Arif wrote:
> Hi,
> 
> (Resending the reply as my email client had updated and inserted html 
> code and caused a bounceback from the mailing list, sorry about that.)
> 
> Thanks for your reply, I have responded with further comments/questions 
> inline below, and also have provided some context for the patch at the 
> start:
> 
> The patch is not introducing a new CONFIG option, as can be seen in the 
> v2 patch diff. It is converting an existing CONFIG option that only 
> enabled purgatory for specific architectures in which it has been 
> implemented (x86, powerpc and s390) to an option that can that allow 
> purgatory to be disabled (with default enabled) and only provides code 
> to disable purgatory for x86 only. From what i see, purgatory is 
> currently not yet implemented in other architectures like arm64 
> kexec_file case and riscv, so this would on a high level,
> provide only the option to make kexec on x86 similar to other the other 
> architectures.
> 
> As a background to the discussion, the usecase we are aiming is to 
> update the host kernel with kexec in servers which is managing multiple 
> VMs, to cut down the downtime of servers as much as possible so that its 
> not noticeable to VMs. The patch is aimed at x86 purgatory code 
> specifically. We are targetting other optimizations as well in other 
> areas in boot path to cut down the 600ms time much further, so that cut 
> down of 200ms downtime is significant in the usecase described. I did 
> have a few comments/questions about your reply, please see my responses 
> below:
> 
> 
> On 29/11/2021 16:53, Eric W. Biederman wrote:
>> Usama Arif <usama.arif@...edance.com> writes:
>>
>>> Hi,
>>>
>>> Thanks for your replies. I have submitted a v2 of the patch with a
>>> much more detailed commit message including reason for the patch and 
>>> timing values.
>>>
>>> The time taken from reboot to running init process was measured
>>> with both purgatory enabled and disabled over 20 runs and the
>>> averages are:
>>> Purgatory disabled:
>>> - TSC = 3908766161 cycles
>>> - ktime = 606.8 ms
>>> Purgatory enabled:
>>> - TSC = 5005811885 cycles (28.1% worse)
>>> - ktime = 843.1 ms (38.9% worse)
>>>
>>>
>>> Our reason for this patch is that it helps reduce the downtime of 
>>> servers when
>>> the host kernel managing multiple VMs needs to be updated via kexec,
>>> but it makes reboot with kexec much faster so should be a general 
>>> improvement in
>>> boot time if purgatory is not needed and could have other usecases as 
>>> well.
>>> I believe only x86, powerpc and s390 have purgatory supported, other 
>>> platforms
>>> like arm64 dont have it implemented yet, so with the reboot time 
>>> improvement seen,
>>> it would be a good idea to have the option to disable purgatory 
>>> completely but set default to y.
>>> We also have the CONFIG_KEXEC_BZIMAGE_VERIFY_SIG which can be enabled 
>>> to verify the next
>>> kernel image to be booted and purgatory can be completely skipped if
>>> not required.
>>
>> CONFIG_KEXEC_BZIMAGE_VERIFY_SIG is something totally and completely
>> different.  It's job is to verify that the kernel to be booted comes
>> from a trusted source.   The sha256 verification in purgatory's job
>> is to verify that memory the kernel cares about was not corrupted
>> during the kexec process.
>>
> 
> Thanks, acknowledged.
> 
>> I believe when you say purgatory you are really talking about that
>> sha256 checksum.  It really is not possible to disable all of
>> the code that runs between kernels, as the old and the new kernel may
>> run at the same addresses.  Anything that runs between the two kernels
>> is what is referred to as purgatory.  Even if it is just a small
>> assembly stub.
>>
>  >
> 
> With this patch, i am trying to give an option (with default purgatory 
> enabled)to disable purgatory completely on x86 only, i.e. no code 
> running between 2 kernels on x86. From my understanding there is no 
> purgatory in arm64 kexec_file case, in riscv and in some other archs as 
> well, so I am not sure why its not possible to disable purgatory (all 
> code running between 2 kernels) in x86? Unless i misunderstood something 
> about the working of other platforms? In x86 case, from what i see 
> relocate_kernel is still part of the older kernel and not purgatory, and 
> with my patch, purgatory.ro is not built and kexec does execute 
> successfully with purgatory disabled.
> 
> About your point for the old and the new kernel may run at the same 
> addresses,i dont think that can happen as in bzImage64_load function its 
> loaded using the kbuf struct. This doesnt happen in other architectures 
> (for e.g. arm64) that dont implement purgatory and any of the tests I 
> conducted with the patch applied due to the use of kbuf struct.
> 
>  From what i see in the code, purgatory in x86 specifically, has 2 main 
> functions, sha256 verification and loading the %rsi register for 
> bootparam_load_addr. In this patch purgatory was disabled, which 
> resulted in sha256 verification disabled and bootparam_load_addr moved 
> to relocate_kernel. Our analysis revealed that most of the time is spent 
> in the sha256 verification, so I would be ok to reformat the patch
> to make the sha verification optional and keep purgatory enabled if 
> thats preferred? Although in my opinion the current patch of only 
> providing an option to disable purgatory seems much better.
> 
> 
>> That sha256 verification is always needed for kexec on panic, there are
>> by the nature of a kernel panic too many unknowns to have any confidence
>> the new kernel will not be corrupted in the process of kexec before it
>> gets started.
>>
>> For an ordinary kexec it might be possible to say that you have a
>> reliable kernel shutdown process and you know for a fact that something
>> won't come along and corrupt the kernel.  I find that a questionable
>> assertion.  I haven't seen anyone yet whose focus when getting an
>> ordinary kexec to work as anything other than making certain all of the
>> drivers are shutdown properly.
>>
>> I have seen countless times when a network packet comes in a the wrong
>> time and the target kernel's memory is corrupted before it gets far
>> enough to initialize the network driver. >
> 
> I agree that when doing testing and research, there are things that can
> go wrong during kexec process, but i assume that the expectation is that
> in production environment, when for e.g. updating a kernel with kexec in 
> servers,the expectation is that kexec will execute successfully as long 
> as there is nothing wrong in the old kernel and the new kernel.
> 
> Maybe I didn't understand this correctly, but if a network packet comes 
> in and corrupts the kernel memory, wont it also cause a problem even 
> when purgatory is enabled? I agree that issue like these would be caught 
> earlier with purgatory, but then if something like this breaks the kexec 
> process in a production environment where the old and new kernel are 
> well tested, trusted and expected to work, wouldn't there be a much more 
> fundamental issue with the reliability of the current kexec process.
> 
>> For a 0.2s speed up you are talking about disabling all of the safety
>> checks in a very dangerous situation.  How much can you can in
>> performance by optimizing the sha256 implementation instead of using
>> what is essentially a reference implementation in basic C that I copied
>> from somewhere long ago.
>>
>> Optimize the sha256 implementation and the memory copy loop and then
>> show how the tiny bit of time that is left is on a mission critical path
>> and must be removed.  Then we can reasonably talk about a config option
>> for disabling the sha256 implementation in the kexec in not-panic case.
>>
> Thanks for this, I assume that over here you are suggesting for e.g. in 
> x86 to replace the existing sha verification implementation in purgatory 
> with the one in arch/x86/crypto/sha256_ssse3_glue.c? Also could you 
> point to the memory copy loop to optimize? I can have a look at these 2 
> options.
> 
> Even with these optimizations, i think the v2 patch should still be 
> considered. The patch in the end is providing only an option to disable 
> purgatory on x86 only, with the default value of keeping it enabled and 
> not changing kexec behaviour. The CONFIG already existed, it is just 
> renamed and now provides user the option to select to disable purgatory 
> for x86, rather than it being architecture dependent.
> 
> Thanks again for the review and comments,
> Usama
> 
> 
>> That sha256 implementation in part so that we can all sleep at night
>> because we don't have to deal with very very strange heizenbugs.
>>
>> Eric
>>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ