Date: Thu, 20 Jun 2024 08:41:51 +0200
From: David Hildenbrand <david@...hat.com>
To: Hugh Dickins <hughd@...gle.com>, Andrew Morton <akpm@...ux-foundation.org>
Cc: Baolin Wang <baolin.wang@...ux.alibaba.com>, willy@...radead.org,
 wangkefeng.wang@...wei.com, chrisl@...nel.org, ying.huang@...el.com,
 21cnbao@...il.com, ryan.roberts@....com, shy828301@...il.com,
 ziy@...dia.com, ioworker0@...il.com, da.gomez@...sung.com,
 p.raghav@...sung.com, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
 Andrew Bresticker <abrestic@...osinc.com>
Subject: Re: [PATCH v2 0/9] support large folio swap-out and swap-in for shmem

On 20.06.24 05:59, Hugh Dickins wrote:
> On Wed, 19 Jun 2024, Andrew Morton wrote:
>> On Wed, 19 Jun 2024 01:16:42 -0700 (PDT) Hugh Dickins <hughd@...gle.com> wrote:
>>> On Wed, 19 Jun 2024, Baolin Wang wrote:
>>>> On 2024/6/19 04:05, Andrew Morton wrote:
>>>>> On Tue, 18 Jun 2024 14:54:12 +0800 Baolin Wang
>>>>> <baolin.wang@...ux.alibaba.com> wrote:
>>>>>
>>>>>> Shmem will support large folio allocation [1] [2] to get better
>>>>>> performance; however, memory reclaim still splits the precious large
>>>>>> folios when trying to swap out shmem, which may lead to memory
>>>>>> fragmentation and cannot take advantage of large folios for shmem.
>>>>>>
>>>>>> Moreover, the swap code already supports swapping out large folios
>>>>>> without splitting, and the large folio swap-in [3] series is queued
>>>>>> into the mm-unstable branch. Hence this patch set also supports large
>>>>>> folio swap-out and swap-in for shmem.
>>>>>
>>>>> I'll add this to mm-unstable for some exposure, but I wonder how much
>>>>> testing it will have received by the time the next merge window opens?
>>>>
>>>> Thanks Andrew. I am fine with this series going into 6.12 if you are concerned
>>>> about insufficient testing (and let's also wait for Hugh's comments). Since we
>>>> (Daniel and I) have some follow-up patches that rely on this swap series, we
>>>> hope it can be tested as extensively as possible in the mm branch to ensure
>>>> its stability.
>>>
>>> Thanks for giving it the exposure, Andrew, but please drop it from
>>> mm-unstable until the next cycle.
>>
>> Thanks, dropped.
> 
> Thanks. I'll add a little more info in another mail, on the further
> 2024-06-18 problems I reported, but the tl;dr is that they are still a
> mystery: I cannot yet say "drop this" or "drop that" or "here's a fix".
> 
>>
>>> p.s. I think Andrew Bresticker's do_set_pmd() fix has soaked
>>> long enough, and deserves promotion to hotfix and Linus soon.
>>
>> Oh, OK, done.
>>
>> And it's cc:stable.  I didn't get any sense of urgency for this one -
>> what is your thinking here?
> 
> I thought you were right to add the cc:stable. The current v6.8..v6.10
> state does not result in any crashes or warnings, but it totally (well,
> 511 times out of 512, in some workloads anyway) defeats the purpose of
> shmem+file huge pages - the kernel is going to all the trouble of trying
> to allocate those huge pages, but then refuses to map them by PMD unless
> the fault happens to occur within the first 4096 bytes (talking x86_64).
> 
> I imagine that this causes a significant performance degradation in
> some workloads which ask for and are used to getting huge pages there;
> and they might also exceed their memory quotas, since a page table has
> to be allocated where a PMD-mapping needs none (anon THPs reserve a page
> table anyway, to rely on later if splitting, but shmem+file THPs do not).
> And it's surprising that no tests were reported as failing.

Exactly my thinking. Either a lack of tests, or it doesn't really happen
that often in cases where khugepaged doesn't fix it up.

After all, it's been two kernel releases ....
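
A rough way to observe the symptom Hugh describes from userspace is a sketch
like the one below (not from any of the patches; it assumes x86_64 with 2MiB
PMDs and a tmpfs, here /dev/shm, mounted or configured with huge=always, and
the file path and fault offset are only illustrative). It faults a shmem
mapping at a non-zero offset and then greps /proc/self/smaps for
ShmemPmdMapped to see whether the huge page was actually mapped by PMD:

/*
 * Hedged illustration only: fault a tmpfs-backed mapping at a non-zero
 * offset, then report ShmemPmdMapped from /proc/self/smaps.  On the
 * affected v6.8..v6.10 kernels the fault below would not be PMD-mapped;
 * with the do_set_pmd() fix it can be (assuming a 2MiB folio was
 * allocated and the mapping is PMD-aligned, which the kernel normally
 * arranges for shmem mappings of this size).
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define SZ_2M (2UL << 20)

int main(void)
{
	/* Assumes /dev/shm is tmpfs with huge pages enabled (huge=always). */
	int fd = open("/dev/shm/shmem-thp-test", O_RDWR | O_CREAT | O_TRUNC, 0600);
	char line[256];
	char *p;
	FILE *f;

	if (fd < 0 || ftruncate(fd, SZ_2M) < 0) {
		perror("setup");
		return 1;
	}

	p = mmap(NULL, SZ_2M, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* Fault inside the folio, but not within its first 4096 bytes. */
	p[8192] = 1;

	/* Crude check: print every ShmemPmdMapped line in smaps. */
	f = fopen("/proc/self/smaps", "r");
	while (f && fgets(line, sizeof(line), f))
		if (strstr(line, "ShmemPmdMapped"))
			fputs(line, stdout);
	if (f)
		fclose(f);

	munmap(p, SZ_2M);
	close(fd);
	unlink("/dev/shm/shmem-thp-test");
	return 0;
}

If Hugh's description above holds, the VMA's ShmemPmdMapped would stay at
0 kB on the affected kernels unless the first fault lands in the folio's
first 4096 bytes.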

-- 
Cheers,

David / dhildenb

