lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <41c1bd1f-b1d7-4faf-a422-1eff7425b35c@redhat.com>
Date: Tue, 7 May 2024 12:39:20 +0200
From: David Hildenbrand <david@...hat.com>
To: Barry Song <21cnbao@...il.com>
Cc: Ryan Roberts <ryan.roberts@....com>, akpm@...ux-foundation.org,
 linux-mm@...ck.org, baolin.wang@...ux.alibaba.com, chrisl@...nel.org,
 hanchuanhua@...o.com, hannes@...xchg.org, hughd@...gle.com,
 kasong@...cent.com, linux-kernel@...r.kernel.org, surenb@...gle.com,
 v-songbaohua@...o.com, willy@...radead.org, xiang@...nel.org,
 ying.huang@...el.com, yosryahmed@...gle.com, yuzhao@...gle.com,
 ziy@...dia.com
Subject: Re: [PATCH v3 6/6] mm: swap: entirely map large folios found in
 swapcache

On 07.05.24 11:24, Barry Song wrote:
> On Tue, May 7, 2024 at 8:59 PM David Hildenbrand <david@...hat.com> wrote:
>>
>>>> Let's assume a single subpage of a large folio is no longer mapped.
>>>> Then, we'd have:
>>>>
>>>> nr_pages == folio_nr_pages(folio) - 1.
>>>>
>>>> You could simply map+reuse most of the folio without COWing.
>>>
>>> yes. This is good but the pte which is no longer mapped could be
>>> anyone within the nr_pages PTEs. so it could be quite tricky for
>>> set_ptes.
>>
>> The swap batching logic should take care of that, otherwise it would be
>> buggy.
> 
> When you mention "it would be buggy," are you also referring to the current
> fallback approach? or only refer to the future patch which might be able
> to map/reuse "nr_pages - 1" pages?

swap_pte_batch() should not skip any holes. So consequently, set_ptes() 
should do the right thing. (regarding your comment "could be quite ricky 
for set_ptes")

So I think that should be working as expected.

> 
> The current patch falls back to setting nr_pages = 1 without mapping or
> reusing nr_pages - 1. I feel your concern doesn't refer to this fallback?
> 
>>
>>>
>>>>
>>>> Once we support COW reuse of PTE-mapped THP we'd do the same. Here, it's
>>>> just easy to detect that the folio is exclusive (folio_ref_count(folio)
>>>> == 1 before mapping anything).
>>>>
>>>> If you really want to mimic what do_wp_page() currently does, you should
>>>> have:
>>>>
>>>> exclusive || (folio_ref_count(folio) == 1 && !folio_test_large(folio))
>>>
>>> I actually dislike the part that do_wp_page() handles the reuse of a large
>>> folio which is entirely mapped. For example, A forks B, B exit, we write
>>> A's large folio, we get nr_pages CoW of small folios. Ideally, we can
>>> reuse the whole folios for writing.
>>
>> Yes, see the link I shared to what I am working on. There isn't really a
>> question if what we do right now needs to be improved and all these
>> scenarios are pretty obvious clear.
> 
> Great! I plan to dedicate more time to reviewing your work.

Nice! And there will be a lot of follow-up optimization work I won't 
tackle immediately regarding COW (COW-reuse around, maybe sometimes we 
want to COW bigger chunks).

I still have making PageAnonExclusive a per-folio flag on my TODO list, 
that will help the COW-reuse around case a lot.

> 
>>
>>>
>>>>
>>>> Personally, I think we should keep it simple here and use:
>>>>
>>>> exclusive || folio_ref_count(folio) == 1
>>>
>>> I feel this is still better than
>>> exclusive || (folio_ref_count(folio) == 1 && !folio_test_large(folio))
>>> as we reuse the whole large folio. the do_wp_page() behaviour
>>> doesn't have this.
>>
>> Yes, but there is the comment "Same logic as in do_wp_page();". We
>> already ran into issues having different COW reuse logic all over the
>> place. For this case here, I don't care if we leave it as
>>
>> "exclusive || folio_ref_count(folio) == 1"
> 
> I'm perfectly fine with using the code for this patchset and maybe
> looking for other
> opportunities for potential optimization as an incremental patchset,
> for example,
> reusing the remaining PTEs as suggested by you -  "simply map+reuse most of
> the folio without COWing."
> 
>>
>> But let's not try inventing new stuff here.
> 
> It seems you ignored and snipped my "16 + 14" pages and "15" pages
> example though. but once we support "simply map+reuse most of the
> folio without COWing", the "16+14" problem can be resolved, instead,
> we consume 16 pages.


Oh, sorry for skipping that, for me it was rather clear: the partially 
mapped folios will be on the deferred split list and the excess memory 
can (and will be) reclaimed when there is need. So this temporary memory 
consumption is usually not a problem in practice. But yes, something to 
optimize (just like COW reuse in general).

-- 
Cheers,

David / dhildenb


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ