Message-ID: <6ac8ff4d-7099-4547-9e76-528d14f171d8@redhat.com>
Date: Thu, 13 Jun 2024 12:37:21 +0200
From: David Hildenbrand <david@...hat.com>
To: Barry Song <21cnbao@...il.com>
Cc: akpm@...ux-foundation.org, linux-mm@...ck.org, chrisl@...nel.org,
linux-kernel@...r.kernel.org, mhocko@...e.com, ryan.roberts@....com,
baolin.wang@...ux.alibaba.com, yosryahmed@...gle.com, shy828301@...il.com,
surenb@...gle.com, v-songbaohua@...o.com, willy@...radead.org,
ying.huang@...el.com, yuzhao@...gle.com
Subject: Re: [PATCH RFC 2/3] mm: do_swap_page: use folio_add_new_anon_rmap()
if folio_test_anon(folio)==false

On 13.06.24 11:58, Barry Song wrote:
> On Thu, Jun 13, 2024 at 9:23 PM David Hildenbrand <david@...hat.com> wrote:
>>
>> On 13.06.24 02:07, Barry Song wrote:
>>> From: Barry Song <v-songbaohua@...o.com>
>>>
>>> For the !folio_test_anon(folio) case, we can now invoke folio_add_new_anon_rmap()
>>> with the rmap flags set to either EXCLUSIVE or non-EXCLUSIVE. This action will
>>> suppress the VM_WARN_ON_FOLIO check within __folio_add_anon_rmap() while initiating
>>> the process of bringing up mTHP swapin.
>>>
>>> static __always_inline void __folio_add_anon_rmap(struct folio *folio,
>>>         struct page *page, int nr_pages, struct vm_area_struct *vma,
>>>         unsigned long address, rmap_t flags, enum rmap_level level)
>>> {
>>>     ...
>>>     if (unlikely(!folio_test_anon(folio))) {
>>>         VM_WARN_ON_FOLIO(folio_test_large(folio) &&
>>>                          level != RMAP_LEVEL_PMD, folio);
>>>     }
>>>     ...
>>> }
>>>
>>> It also enhances the code’s readability.
>>>
>>> Suggested-by: David Hildenbrand <david@...hat.com>
>>> Signed-off-by: Barry Song <v-songbaohua@...o.com>
>>> ---
>>> mm/memory.c | 2 ++
>>> 1 file changed, 2 insertions(+)
>>>
>>> diff --git a/mm/memory.c b/mm/memory.c
>>> index 2f94921091fb..9c962f62f928 100644
>>> --- a/mm/memory.c
>>> +++ b/mm/memory.c
>>> @@ -4339,6 +4339,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>>>      if (unlikely(folio != swapcache && swapcache)) {
>>>          folio_add_new_anon_rmap(folio, vma, address, RMAP_EXCLUSIVE);
>>>          folio_add_lru_vma(folio, vma);
>>> +    } else if (!folio_test_anon(folio)) {
>>> +        folio_add_new_anon_rmap(folio, vma, address, rmap_flags);
>>
>> So, with large folio swapin, we would never end up here if any swp PTE
>> is !exclusive, because we would make sure to allocate a large folio only
>> for suitable regions, correct?
>>
>> It can certainly happen during swapout + refault with large folios. But
>> there, we will always have folio_test_anon() still set and wouldn't run
>> into this code path.
>>
>> (it will all be better with per-folio PAE bit, but that will take some
>> more time until I actually get to implement it properly, handling all
>> the nasty corner cases)
>>
>> Better add a comment regarding why we are sure that the complete thing
>> is exclusive (e.g., currently only small folios).
>
> No, patch 1/3 enables both cases to call folio_add_new_anon_rmap(). Currently,
> small folios could be non-exclusive. I suppose your question is whether we
> should ensure that all pages within a mTHP have the same exclusivity status,
> rather than always being exclusive?

We must never end up calling folio_add_new_anon_rmap(folio, vma,
address, RMAP_EXCLUSIVE) if part of the folio is non-exclusive.

I think we *may* call folio_add_new_anon_rmap(folio, vma, address, 0) if
part of the folio is exclusive [it got swapped out, so there cannot be
GUP references]. But, to be future-proof (single PAE bit per folio), we'd
better avoid that on swapin if possible.
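
To make that concrete (just a rough sketch, not actual kernel code;
swapin_folio_exclusive() is a made-up helper standing in for "every swp
PTE we are mapping here is exclusive"):

    /*
     * Sketch only: RMAP_EXCLUSIVE must hold for the *whole* folio.
     * A partially-exclusive folio could still be added as shared
     * (flags == 0) because it was swapped out and therefore has no
     * GUP references, but we'd rather not rely on that on swapin.
     */
    if (swapin_folio_exclusive(folio))
        folio_add_new_anon_rmap(folio, vma, address, RMAP_EXCLUSIVE);
    else
        folio_add_new_anon_rmap(folio, vma, address, 0);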

For small folios, it's clear that they cannot be partially exclusive
(single page).

We'd better add a comment explaining why we adhere to the above,
something like:

"We currently ensure that new folios cannot be partially exclusive: they
are either fully exclusive or fully shared."
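
For instance, on top of the hunk quoted above, something along these
lines (paraphrasing; final wording up to you):

    } else if (!folio_test_anon(folio)) {
        /*
         * We currently ensure that new folios cannot be partially
         * exclusive: they are either fully exclusive or fully shared.
         */
        folio_add_new_anon_rmap(folio, vma, address, rmap_flags);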
--
Cheers,
David / dhildenb