linux-kernel - Re: [PATCH v7 00/14] KVM: mm: fd-based approach for supporting KVM guest private memory

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:   Thu, 8 Sep 2022 21:44:44 -0700
From:   Andy Lutomirski <luto@...nel.org>
To:     "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        Hugh Dickins <hughd@...gle.com>
Cc:     Chao Peng <chao.p.peng@...ux.intel.com>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        linux-fsdevel@...r.kernel.org, linux-api@...r.kernel.org,
        linux-doc@...r.kernel.org, qemu-devel@...gnu.org,
        linux-kselftest@...r.kernel.org,
        Paolo Bonzini <pbonzini@...hat.com>,
        Jonathan Corbet <corbet@....net>,
        Sean Christopherson <seanjc@...gle.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        x86@...nel.org, "H . Peter Anvin" <hpa@...or.com>,
        Jeff Layton <jlayton@...nel.org>,
        "J . Bruce Fields" <bfields@...ldses.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Shuah Khan <shuah@...nel.org>, Mike Rapoport <rppt@...nel.org>,
        Steven Price <steven.price@....com>,
        "Maciej S . Szmigiero" <mail@...iej.szmigiero.name>,
        Vlastimil Babka <vbabka@...e.cz>,
        Vishal Annapurve <vannapurve@...gle.com>,
        Yu Zhang <yu.c.zhang@...ux.intel.com>, jun.nakajima@...el.com,
        dave.hansen@...el.com, ak@...ux.intel.com, david@...hat.com,
        aarcange@...hat.com, ddutile@...hat.com, dhildenb@...hat.com,
        Quentin Perret <qperret@...gle.com>,
        Michael Roth <michael.roth@....com>, mhocko@...e.com,
        Muchun Song <songmuchun@...edance.com>,
        "Gupta, Pankaj" <pankaj.gupta@....com>
Subject: Re: [PATCH v7 00/14] KVM: mm: fd-based approach for supporting KVM
 guest private memory

On 8/18/22 06:24, Kirill A . Shutemov wrote:
> On Wed, Aug 17, 2022 at 10:40:12PM -0700, Hugh Dickins wrote:
>> On Wed, 6 Jul 2022, Chao Peng wrote:
>>> This is the v7 of this series which tries to implement the fd-based KVM
>>> guest private memory.
>>
>> Here at last are my reluctant thoughts on this patchset.
>>
>> fd-based approach for supporting KVM guest private memory: fine.
>>
>> Use or abuse of memfd and shmem.c: mistaken.
>>
>> memfd_create() was an excellent way to put together the initial prototype.
>>
>> But since then, TDX in particular has forced an effort into preventing
>> (by flags, seals, notifiers) almost everything that makes it shmem/tmpfs.
>>
>> Are any of the shmem.c mods useful to existing users of shmem.c? No.
>> Is MFD_INACCESSIBLE useful or comprehensible to memfd_create() users? No.
>>
>> What use do you have for a filesystem here?  Almost none.
>> IIUC, what you want is an fd through which QEMU can allocate kernel
>> memory, selectively free that memory, and communicate fd+offset+length
>> to KVM.  And perhaps an interface to initialize a little of that memory
>> from a template (presumably copied from a real file on disk somewhere).
>>
>> You don't need shmem.c or a filesystem for that!
>>
>> If your memory could be swapped, that would be enough of a good reason
>> to make use of shmem.c: but it cannot be swapped; and although there
>> are some references in the mailthreads to it perhaps being swappable
>> in future, I get the impression that will not happen soon if ever.
>>
>> If your memory could be migrated, that would be some reason to use
>> filesystem page cache (because page migration happens to understand
>> that type of memory): but it cannot be migrated.
> 
> Migration support is in pipeline. It is part of TDX 1.5 [1]. And swapping
> theoretically possible, but I'm not aware of any plans as of now.
> 
> [1] https://www.intel.com/content/www/us/en/developer/articles/technical/intel-trust-domain-extensions.html
> 

This thing?

https://cdrdv2.intel.com/v1/dl/getContent/733578

That looks like migration between computers, not between NUMA nodes.  Or 
am I missing something?