Message-ID: <5c62bdbb-7a4e-4178-8c03-e84491d8d150@redhat.com>
Date: Mon, 13 Jan 2025 13:20:23 +0100
From: David Hildenbrand <david@...hat.com>
To: kalyazin@...zon.com, willy@...radead.org, pbonzini@...hat.com,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Cc: michael.day@....com, jthoughton@...gle.com, michael.roth@....com,
ackerleytng@...gle.com, graf@...zon.de, jgowans@...zon.com,
roypat@...zon.co.uk, derekmn@...zon.com, nsaenz@...zon.es,
xmarcalx@...zon.com
Subject: Re: [RFC PATCH 0/2] mm: filemap: add filemap_grab_folios
On 10.01.25 19:54, Nikita Kalyazin wrote:
> On 10/01/2025 17:01, David Hildenbrand wrote:
>> On 10.01.25 16:46, Nikita Kalyazin wrote:
>>> Based on David's suggestion for speeding up guest_memfd memory
>>> population [1], made at the guest_memfd upstream call on 5 Dec 2024 [2],
>>> this adds `filemap_grab_folios`, which grabs multiple folios at a time.
>>>
>>
>> Hi,
>
> Hi :)
>
>>
>>> Motivation
>>>
>>> When profiling guest_memfd population and comparing the results with
>>> population of anonymous memory via UFFDIO_COPY, I observed that the
>>> former was up to 20% slower, mainly due to adding newly allocated pages
>>> to the pagecache. As far as I can see, the two main contributors to it
>>> are pagecache locking and tree traversals needed for every folio. The
>>> RFC attempts to partially mitigate those by adding multiple folios at a
>>> time to the pagecache.
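
To make the batching idea concrete for anyone following along, here is a rough
caller-side sketch. Note that the prototype and the populate_range() helper
below are only my guess for illustration (including the arbitrary batch size
of 16), not the actual interface from the RFC:

/*
 * Hypothetical prototype -- the real RFC may differ.  The point is to
 * find-or-create up to @nr folios starting at @index in one call, so
 * the pagecache xarray is locked and walked once per batch instead of
 * once per folio.
 */
int filemap_grab_folios(struct address_space *mapping, pgoff_t index,
			struct folio **folios, unsigned int nr);

/* Illustrative population loop (e.g. for guest_memfd): */
static int populate_range(struct address_space *mapping, pgoff_t start,
			  unsigned long nr_pages)
{
	struct folio *folios[16];
	pgoff_t index = start;
	int i, got;

	while (nr_pages) {
		unsigned int want = min_t(unsigned long, nr_pages,
					  ARRAY_SIZE(folios));

		got = filemap_grab_folios(mapping, index, folios, want);
		if (got <= 0)
			return got ? got : -ENOMEM;

		for (i = 0; i < got; i++) {
			/* copy/initialise contents, then release */
			folio_mark_uptodate(folios[i]);
			folio_unlock(folios[i]);
			folio_put(folios[i]);
		}
		index += got;
		nr_pages -= got;
	}
	return 0;
}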
>>>
>>> Testing
>>>
>>> With the change applied, I was able to observe a 10.3% (708 to 635 ms)
>>> speedup in a selftest that populated a 3GiB guest_memfd and a 9.5% (990 to
>>> 904 ms) speedup when restoring a 3GiB guest_memfd VM snapshot using a
>>> custom Firecracker version, both on Intel Ice Lake.
>>
>> Does that mean that it's still 10% slower (based on the 20% above), or
>> was the 20% from a different micro-benchmark?
>
> Yes, it is still slower:
> - isolated/selftest: 2.3%
> - Firecracker setup: 8.9%
>
> Not sure why the values are so different though. I'll try to find an
> explanation.
The 2.3% looks very promising.
>
>>>
>>> Limitations
>>>
>>> While `filemap_grab_folio` handles THP/large folios internally and
>>> deals with reclaim artifacts in the pagecache (shadows), the RFC does
>>> not support those, for simplicity reasons, since it demonstrates the
>>> optimisation applied to guest_memfd, which currently only uses small
>>> folios and does not support reclaim.
>>
>> It might be worth pointing out that, while support for larger folios is
>> in the works, there will be scenarios where small folios are unavoidable
>> in the future (mixture of shared and private memory).
>>
>> How hard would it be to just naturally support large folios as well?
>
> I don't think it's impossible. It's just one more dimension that needs
> to be handled. The `__filemap_add_folio` logic is already rather
> complex, and correctly processing multiple folios while also splitting
> them when necessary looks substantially convoluted to me. So my idea
> was to discuss/validate the multi-folio approach first before rolling
> up the sleeves.
We should likely try to make this as generic as possible, meaning it should
support roughly what filemap_grab_folio() already supports (e.g., also large
folios).

I now see there is filemap_get_folios_contig() [which is already used in the
memfd code], and I wonder whether it could be reused/extended fairly easily.
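
For context, a minimal sketch of how filemap_get_folios_contig() is driven
today: it is lookup-only (it only returns folios that are already present in
the pagecache), so the extension would be allocating and adding the missing
folios in the same walk. The mapping/first_index/last_index names below are
just placeholders:

	struct folio_batch fbatch;
	pgoff_t start = first_index;	/* placeholder indices */
	unsigned int i, nr;

	folio_batch_init(&fbatch);
	/* Gather contiguous, already-present folios in one batch. */
	nr = filemap_get_folios_contig(mapping, &start, last_index, &fbatch);
	for (i = 0; i < nr; i++) {
		struct folio *folio = fbatch.folios[i];

		/* use the folio: pin it, copy into it, ... */
	}
	folio_batch_release(&fbatch);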
--
Cheers,
David / dhildenb