Message-ID: <aS8HaDX5Pg9h_nkl@x1.local>
Date: Tue, 2 Dec 2025 10:36:08 -0500
From: Peter Xu <peterx@...hat.com>
To: Nikita Kalyazin <kalyazin@...zon.com>
Cc: "David Hildenbrand (Red Hat)" <david@...nel.org>,
Mike Rapoport <rppt@...nel.org>, linux-mm@...ck.org,
Andrea Arcangeli <aarcange@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Axel Rasmussen <axelrasmussen@...gle.com>,
Baolin Wang <baolin.wang@...ux.alibaba.com>,
Hugh Dickins <hughd@...gle.com>,
James Houghton <jthoughton@...gle.com>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
Michal Hocko <mhocko@...e.com>, Paolo Bonzini <pbonzini@...hat.com>,
Sean Christopherson <seanjc@...gle.com>,
Shuah Khan <shuah@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>,
Vlastimil Babka <vbabka@...e.cz>, linux-kernel@...r.kernel.org,
kvm@...r.kernel.org, linux-kselftest@...r.kernel.org
Subject: Re: [PATCH v3 4/5] guest_memfd: add support for userfaultfd minor mode
On Tue, Dec 02, 2025 at 11:50:31AM +0000, Nikita Kalyazin wrote:
> > It looks fine indeed, but it looks slightly weird then, as you'll have two
> > ways to populate the page cache. Logically here atomicity is indeed not
> > needed when you trap both MISSING + MINOR.
>
> I reran the test based on the UFFDIO_COPY prototype I had using your series
> [2], and UFFDIO_COPY is slower than write() to populate 512 MiB: 237 vs 202
> ms (+17%). Even though UFFDIO_COPY alone is functionally sufficient, I
> would prefer to have an option to use write() where possible and only
> falling back to UFFDIO_COPY for userspace faults to have better performance.

Yes, write() should be fine.

Especially for gmem, I guess write() support is needed when VMAs cannot be
mapped at all in a strict CoCo context, so it needs to be available one way
or another.

IIUC it's because UFFDIO_COPY (or memcpy(), which I recall you used to test
instead) involves pgtable operations. So I wonder: if the VMA mapping the
gmem is still accessed at some point later (either private->shared
convertible ones for device DMAs in CoCo, or the fully shared non-CoCo use
case), then the pgtable overhead will just happen later for a write()-style
fault resolution.

From that POV, the above number makes sense.

Thanks for the extra testing results.
>
> [2]
> https://lore.kernel.org/all/7666ee96-6f09-4dc1-8cb2-002a2d2a29cf@amazon.com
--
Peter Xu