Date:   Thu, 28 Sep 2023 19:29:53 +0200
From:   David Hildenbrand <david@...hat.com>
To:     Baoquan He <bhe@...hat.com>,
        Stanislav Kinsburskii <skinsburskii@...ux.microsoft.com>
Cc:     tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
        dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
        ebiederm@...ssion.com, akpm@...ux-foundation.org,
        stanislav.kinsburskii@...il.com, corbet@....net,
        linux-kernel@...r.kernel.org, kexec@...ts.infradead.org,
        linux-mm@...ck.org, kys@...rosoft.com, jgowans@...zon.com,
        wei.liu@...nel.org, arnd@...db.de, gregkh@...uxfoundation.org,
        graf@...zon.de, pbonzini@...hat.com
Subject: Re: [RFC PATCH v2 0/7] Introduce persistent memory pool

On 28.09.23 12:25, Baoquan He wrote:
> On 09/27/23 at 09:13am, Stanislav Kinsburskii wrote:
>> On Wed, Sep 27, 2023 at 01:44:38PM +0800, Baoquan He wrote:
>>> Hi Stanislav,
>>>
>>> On 09/25/23 at 02:27pm, Stanislav Kinsburskii wrote:
>>>> This patch introduces a memory allocator specifically tailored for
>>>> persistent memory within the kernel. The allocator maintains
>>>> kernel-specific states like DMA passthrough device states, IOMMU state, and
>>>> more across kexec.
>>>
>>> Can you give more details about how this persistent memory pool will be
>>> utilized in an actual scenario? I mean, what problem have you met so that
>>> you have to introduce persistent memory pool to solve it?
>>>
>>
>> The major reason we have at the moment is that the Linux root partition
>> running on top of the Microsoft hypervisor needs to deposit pages to the
>> hypervisor at runtime, when the hypervisor runs out of memory.
>> "Depositing" here means that Linux passes a set of its PFNs to the
>> hypervisor via hypercall, and the hypervisor then uses these pages for its
>> own needs.
>>
>> Once deposited, these pages can't be accessed by Linux anymore and thus
>> must be preserved in "used" state across kexec, as hypervisor state is
>> unaware of kexec. At the same time, these pages can be withdrawn when
>> unused. Thus, an allocator persistent across kexec looks reasonable for
>> this particular matter.
> 
> Thanks for these details.
>   
> The deposit and withdraw remind me of the balloon driver, David's virtio-mem,
> and DLPAR on ppc, which can grow or shrink physical memory on a guest OS
> at runtime. Can't the Microsoft hypervisor do a similar thing to reclaim or
> give back the memory from or to the 'Linux root partition' running on top of
> the hypervisor?

virtio-mem was designed with kexec support in mind. You only expose the 
initial memory to the second kernel, and that memory can never have such 
holes. That does not apply to memory ballooning implementations, like 
Hyper-V dynamic memory.

In the virtio-mem paper I have the following:

"In our experiments, Hyper-V VMs crashed reliably when
trying to use kexec under Linux for fast OS reboots with
an inflated balloon. Other memory ballooning mechanisms
either have to temporarily deflate the whole balloon or allow
access to inflated memory, which is undesired in cloud
environments."

I remember XEN does something elaborate, whereby they allow access to 
all inflated memory during reboot, but limit the total number of pages 
they will hand out. IIRC, you then have to work around and cope with 
things like "Windows initializes all memory with 0s when booting". So 
there are ways hypervisors have handled that in the past.

-- 
Cheers,

David / dhildenb
