Message-ID: <ZxmRkUNmx863Po2U@yzhao56-desk.sh.intel.com>
Date: Thu, 24 Oct 2024 08:15:13 +0800
From: Yan Zhao <yan.y.zhao@...el.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
CC: "Kirill A. Shutemov" <kirill@...temov.name>, <kexec@...ts.infradead.org>,
	<linux-kernel@...r.kernel.org>, <linux-coco@...ts.linux.dev>,
	<x86@...nel.org>, <rick.p.edgecombe@...el.com>,
	<kirill.shutemov@...ux.intel.com>
Subject: Re: [PATCH] kexec_core: Accept unaccepted kexec destination addresses

On Wed, Oct 23, 2024 at 10:44:11AM -0500, Eric W. Biederman wrote:
> "Kirill A. Shutemov" <kirill@...temov.name> writes:
> 
> > Waiting minutes to get VM booted to shell is not feasible for most
> > deployments. Lazy is sane default to me.
> 
> Huh?
> 
> Unless my guesses about what is happening are wrong lazy is hiding
> a serious implementation deficiency.  From all hardware I have seen
> taking minutes is absolutely ridiculous.
> 
> Does writing to all of memory at full speed take minutes?  How can such
> a system be functional?
> 
> If you don't actually have to write to the pages and it is just some
> accounting function it is even more ridiculous.
> 
> 
> I had previously thought that accept_memory was the firmware call.
> Now that I see that it is just a wrapper for some hardware specific
> calls I am even more perplexed.
> 
> 
> Quite honestly what this looks like to me is that someone failed to
> enable write-combining or write-back caching when writing to memory
> when initializing the protected memory.  With the result that everything
> is moving dog slow, and people are introducing complexity left and right
> to avoid that bad implementation.
> 
> 
> Can someone please explain to me why this accept_memory stuff has to be
> slow, why it has to take minutes to do its job.
This kexec patch fixes a kexec failure in a guest (TD).

For a Linux guest, accept_memory() happens before the guest accesses a page.
If the guest is a TD, it will
(1) trigger the host to allocate a host physical page to back the accessed
    guest page, which might be slow, with waiting and sleeping involved,
    depending on the memory pressure on the host, and
(2) initialize the protected page.

In practice, most guest memory is never accessed by the guest during its
lifetime. Accepting it anyway with accept_memory() may cause the host to commit
a page that will never be used, and the host cannot even swap that physical
page out.

That's why we need lazy accept, which does not call accept_memory() on a page
until the page is allocated by the kernel (in alloc_page(s)).
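
To illustrate the difference, here is a minimal userspace sketch of eager vs.
lazy accept. It is not the kernel implementation: NPAGES, sim_accept_page(),
sim_boot_eager() and sim_alloc_page() are hypothetical names made up for this
example, and sim_accept_page() only counts calls where a real TD guest would
have the host back the page and then initialize it.

```c
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 16                 /* hypothetical size of guest memory, in pages */

static bool accepted[NPAGES];     /* has this page been accepted yet? */
static bool in_use[NPAGES];       /* has this page been handed out by the allocator? */
static size_t accept_calls;       /* number of pages actually accepted */

/*
 * Stand-in for accept_memory() on a single page. In a real TD guest this is
 * the expensive step (host allocation plus page initialization); here it
 * only records that the page was accepted.
 */
static void sim_accept_page(size_t pfn)
{
	if (!accepted[pfn]) {
		accepted[pfn] = true;
		accept_calls++;
	}
}

/* Eager policy: accept every page up front, whether or not it is ever used. */
static void sim_boot_eager(void)
{
	for (size_t pfn = 0; pfn < NPAGES; pfn++)
		sim_accept_page(pfn);
}

/*
 * Lazy policy: accept a page only at the moment the allocator hands it out.
 * Returns the page frame number, or -1 if no free page is left.
 */
static long sim_alloc_page(void)
{
	for (size_t pfn = 0; pfn < NPAGES; pfn++) {
		if (!in_use[pfn]) {
			sim_accept_page(pfn);
			in_use[pfn] = true;
			return (long)pfn;
		}
	}
	return -1;
}
```

With the lazy policy, three calls to sim_alloc_page() accept exactly three
pages, while sim_boot_eager() accepts all NPAGES pages regardless of use. In
this model the cost of lazy accept scales with the memory the guest actually
allocates rather than with the memory it was given.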

> I would much rather spend my time figuring out how to make accept_memory
> run at a reasonable speed than to litter the kernel with more of this
> nonsense.
> 
> Eric
