linux-kernel - Re: [RFC PATCH] mm: support large folio numa balancing


Message-ID: <6f953202-b29c-4274-943f-f1a93b1b6ea5@huawei.com>
Date:   Tue, 14 Nov 2023 21:12:51 +0800
From:   Kefeng Wang <wangkefeng.wang@...wei.com>
To:     David Hildenbrand <david@...hat.com>,
        John Hubbard <jhubbard@...dia.com>,
        Baolin Wang <baolin.wang@...ux.alibaba.com>,
        <akpm@...ux-foundation.org>
CC:     <ying.huang@...el.com>, <willy@...radead.org>,
        <linux-mm@...ck.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH] mm: support large folio numa balancing



On 2023/11/14 19:35, David Hildenbrand wrote:
> On 13.11.23 23:15, John Hubbard wrote:
>> On 11/13/23 5:01 AM, Baolin Wang wrote:
>>>
>>>
>>> On 11/13/2023 8:10 PM, Kefeng Wang wrote:
>>>>
>>>>
>>>> On 2023/11/13 18:53, David Hildenbrand wrote:
>>>>> On 13.11.23 11:45, Baolin Wang wrote:
>>>>>> Currently, file pages already support large folios, and
>>>>>> support for
>>>>>> anonymous pages is also under discussion[1]. Moreover, the numa
>>>>>> balancing
>>>>>> code was converted to use a folio by a previous thread[2], and the
>>>>>> migrate_pages
>>>>>> function also already supports large folio migration.
>>>>>>
>>>>>> So now I did not see any reason to continue restricting NUMA
>>>>>> balancing for
>>>>>> large folio.
>>>>>
>>>>> I recall John wanted to look into that. CCing him.
>>>>>
>>>>> I'll note that the "head page mapcount" heuristic to detect sharers 
>>>>> will
>>>>> now strike on the PTE path and make us believe that a large folio is
>>>>> exclusive, although it isn't.
>>>>>
>>>>> As spelled out in the commit you are referencing:
>>>>>
>>>>> commit 6695cf68b15c215d33b8add64c33e01e3cbe236c
>>>>> Author: Kefeng Wang <wangkefeng.wang@...wei.com>
>>>>> Date:   Thu Sep 21 15:44:14 2023 +0800
>>>>>
>>>>>       mm: memory: use a folio in do_numa_page()
>>>>>       Numa balancing only try to migrate non-compound page in
>>>>> do_numa_page(),
>>>>>       use a folio in it to save several compound_head calls, note 
>>>>> we use
>>>>>       folio_estimated_sharers(), it is enough to check the folio
>>>>> sharers since
>>>>>       only normal page is handled, if large folio numa balancing is
>>>>> supported, a
>>>>>       precise folio sharers check would be used, no functional change
>>>>> intended.
>>>>>
>>>>>
>>>>> I'll send WIP patches for one approach that can improve the situation
>>>>> soonish.
>>
>> To be honest, I'm still catching up on the approximate vs. exact
>> sharers case. It wasn't clear to me why a precise sharers count
>> is needed in order to do this. Perhaps the cost of making a wrong
>> decision is considered just too high?
> 
> Good question, I didn't really look into the impact for the NUMA hinting 
> case where we might end up not setting TNF_SHARED although it is shared. 
> For other folio_estimated_sharers() users it's more obvious.

task_numa_group() checks TNF_SHARED: if processes share the same
page/folio, they are packed into a single numa group, and the numa
group fault statistics are then used in should_numa_migrate_memory() to
decide whether to migrate or not. If TNF_SHARED is not set, this may
lead to more page/folio migration.
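
To make the concern concrete, here is a rough userspace sketch (not kernel
code) of how a head-page-only estimate can miss sharing on a large folio and
so leave the shared flag unset. Everything prefixed with sketch_/SKETCH_ is
hypothetical and only models the idea; the flag value is a stand-in, the
VMA checks done on the real fault path are left out, and the per-page scan
is just one possible stricter check, not the precise sharers check the
commit message refers to.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's TNF_SHARED fault flag. */
#define SKETCH_TNF_SHARED 0x04

/* Toy model of a folio: one mapcount per constituent page. */
struct sketch_folio {
	size_t nr_pages;
	const int *mapcount;	/* mapcount[0] belongs to the head page */
};

/* Estimate: look at the head page only, as the heuristic does. */
static int sketch_estimated_sharers(const struct sketch_folio *folio)
{
	return folio->mapcount[0];
}

/* Stricter (still approximate) check: is any page mapped more than once? */
static bool sketch_any_page_shared(const struct sketch_folio *folio)
{
	for (size_t i = 0; i < folio->nr_pages; i++)
		if (folio->mapcount[i] > 1)
			return true;
	return false;
}

/* Fault flags as derived from the head-page estimate. */
static int sketch_flags_estimated(const struct sketch_folio *folio)
{
	return sketch_estimated_sharers(folio) > 1 ? SKETCH_TNF_SHARED : 0;
}

/* Fault flags as derived from scanning every page of the folio. */
static int sketch_flags_scanned(const struct sketch_folio *folio)
{
	return sketch_any_page_shared(folio) ? SKETCH_TNF_SHARED : 0;
}
```

With per-page mapcounts {1, 2, 2, 2}, sketch_flags_estimated() returns 0
while sketch_flags_scanned() returns SKETCH_TNF_SHARED, which is the
"believed exclusive although it isn't" case on the PTE path.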

> 
> As a side note, it could have happened already in corner cases (e.g., 
> concurrent page migration of a small folio).
> 
> Whether precision as documented in that commit is really required remains to 
> be seen -- just wanted to spell it out.
>