linux-kernel - Re: [RFC PATCH] mm: support large folio numa balancing

Message-ID: <a3a54a24-76bc-419f-9251-d6ae1355b3b6@linux.alibaba.com>
Date:   Mon, 20 Nov 2023 11:28:11 +0800
From:   Baolin Wang <baolin.wang@...ux.alibaba.com>
To:     David Hildenbrand <david@...hat.com>, akpm@...ux-foundation.org
Cc:     ying.huang@...el.com, wangkefeng.wang@...wei.com,
        willy@...radead.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] mm: support large folio numa balancing



On 11/15/2023 6:47 PM, David Hildenbrand wrote:
> On 15.11.23 11:46, David Hildenbrand wrote:
>> On 13.11.23 11:45, Baolin Wang wrote:
>>> Currently, the file pages already support large folio, and supporting for
>>> anonymous pages is also under discussion[1]. Moreover, the numa balancing
>>> code are converted to use a folio by previous thread[2], and the migrate_pages
>>> function also already supports the large folio migration.
>>>
>>> So now I did not see any reason to continue restricting NUMA balancing for
>>> large folio.
>>>
>>> [1] https://lkml.org/lkml/2023/9/29/342
>>> [2] https://lore.kernel.org/all/20230921074417.24004-4-wangkefeng.wang@huawei.com/T/#md9d10fe34587229a72801f0d731f7457ab3f4a6e
>>> Signed-off-by: Baolin Wang <baolin.wang@...ux.alibaba.com>
>>> ---
>>
>> I'll note that another piece is missing, and I'd be curious how you
>> tested your patch set or what I am missing. (no anonymous pages?)

I tested it with file-backed large folios (order = 4) created by the XFS filesystem.

>> change_pte_range() contains:
>>
>> if (prot_numa) {
>>     ...
>>     /* Also skip shared copy-on-write pages */
>>     if (is_cow_mapping(vma->vm_flags) &&
>>         folio_ref_count(folio) != 1)
>>         continue;
>>
>> So we'll never end up mapping an anon PTE-mapped THP prot-none (well, unless a
>> single PTE remains) and consequently never trigger NUMA hinting faults.
>>
>> Now, that change has some history [1], but the original problem has been
>> sorted out in the meantime. But we should consider Linus' original feedback.
>>
>> For pte-mapped THP, we might want to do something like the following
>> (completely untested):

Thanks for pointing this out. I have not tried PTE-mapped THP yet, and will
look at it in detail.

>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>> index 81991102f785..c4e6b9032e40 100644
>> --- a/mm/mprotect.c
>> +++ b/mm/mprotect.c
>> @@ -129,7 +129,8 @@ static long change_pte_range(struct mmu_gather *tlb,
>>                                   /* Also skip shared copy-on-write pages */
>>                                   if (is_cow_mapping(vma->vm_flags) &&
>> -                                   folio_ref_count(folio) != 1)
>> +                                   (folio_maybe_dma_pinned(folio) ||
>> +                                    folio_estimated_sharers(folio) != 1))
> 
> Actually, > 1 might be better if the first subpage is not mapped; it's a
> mess.
>
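
To make the difference between the three conditions discussed above easier
to see, here is a small user-space model. It is only a sketch, not kernel
code: struct folio_model and the skip_by_*() helpers are hypothetical names,
and the model assumes that each PTE mapping holds one folio reference (no
other references are counted) and that the sharer estimate is simply the
mapcount of the first subpage.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the folio state the checks look at. */
struct folio_model {
	int refcount;            /* modelled folio_ref_count() */
	int first_page_mapcount; /* modelled folio_estimated_sharers() (assumption) */
	bool maybe_dma_pinned;   /* modelled folio_maybe_dma_pinned() */
};

/* Existing condition: skip unless exactly one reference is held. */
static inline bool skip_by_refcount(const struct folio_model *f)
{
	return f->refcount != 1;
}

/* Condition from the quoted diff: pinned, or sharer estimate != 1. */
static inline bool skip_by_sharers_ne1(const struct folio_model *f)
{
	return f->maybe_dma_pinned || f->first_page_mapcount != 1;
}

/* Follow-up variant: pinned, or sharer estimate > 1. */
static inline bool skip_by_sharers_gt1(const struct folio_model *f)
{
	return f->maybe_dma_pinned || f->first_page_mapcount > 1;
}

/* Walk through a few modelled cases for an order-4 (16 page) folio. */
static inline void model_self_check(void)
{
	/* All 16 subpages PTE-mapped by a single process. */
	struct folio_model fully_mapped = { 16, 1, false };
	/* Same folio after the first subpage was unmapped (15 PTEs left). */
	struct folio_model first_unmapped = { 15, 0, false };
	/* Fully mapped by two processes. */
	struct folio_model shared = { 32, 2, false };

	/* The refcount test skips every PTE-mapped case. */
	assert(skip_by_refcount(&fully_mapped));
	assert(skip_by_refcount(&first_unmapped));
	assert(skip_by_refcount(&shared));

	/* "!= 1" keeps the exclusive folio but skips it once the first
	 * subpage is unmapped. */
	assert(!skip_by_sharers_ne1(&fully_mapped));
	assert(skip_by_sharers_ne1(&first_unmapped));
	assert(skip_by_sharers_ne1(&shared));

	/* "> 1" keeps both exclusive cases and still skips the shared one. */
	assert(!skip_by_sharers_gt1(&fully_mapped));
	assert(!skip_by_sharers_gt1(&first_unmapped));
	assert(skip_by_sharers_gt1(&shared));
}
```

Calling model_self_check() from a main() exercises all three cases. In this
model a pinned folio is skipped by both sharer-based variants regardless of
the mapcount; whether the real helpers behave this way in every corner case
still needs to be checked against the kernel tree.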