linux-kernel - Re: [PATCH mm-unstable 5/5] mm/mempolicy: Convert migrate_page_add() to migrate_folio

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <2bcf36c4-54a1-37e4-7a0a-3d3e34ef4e56@intel.com>
Date:   Sat, 21 Jan 2023 11:41:56 +0800
From:   "Yin, Fengwei" <fengwei.yin@...el.com>
To:     Matthew Wilcox <willy@...radead.org>
CC:     "Vishal Moola (Oracle)" <vishal.moola@...il.com>,
        <linux-mm@...ck.org>, <akpm@...ux-foundation.org>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH mm-unstable 5/5] mm/mempolicy: Convert migrate_page_add()
 to migrate_folio_add()



On 1/21/2023 3:47 AM, Matthew Wilcox wrote:
> On Thu, Jan 19, 2023 at 09:24:16AM +0800, Yin, Fengwei wrote:
>> On 1/19/2023 7:22 AM, Vishal Moola (Oracle) wrote:
>>> @@ -1022,27 +1022,23 @@ static long do_get_mempolicy(int *policy, nodemask_t *nmask,
>>>  }
>>>  
>>>  #ifdef CONFIG_MIGRATION
>>> -/*
>>> - * page migration, thp tail pages can be passed.
>>> - */
>>> -static int migrate_page_add(struct page *page, struct list_head *pagelist,
>>> +static int migrate_folio_add(struct folio *folio, struct list_head *foliolist,
>>>  				unsigned long flags)
>>>  {
>>> -	struct page *head = compound_head(page);
>>>  	/*
>>> -	 * Avoid migrating a page that is shared with others.
>>> +	 * Avoid migrating a folio that is shared with others.
>>>  	 */
>>> -	if ((flags & MPOL_MF_MOVE_ALL) || page_mapcount(head) == 1) {
>>> -		if (!isolate_lru_page(head)) {
>>> -			list_add_tail(&head->lru, pagelist);
>>> -			mod_node_page_state(page_pgdat(head),
>>> -				NR_ISOLATED_ANON + page_is_file_lru(head),
>>> -				thp_nr_pages(head));
>>> +	if ((flags & MPOL_MF_MOVE_ALL) || folio_mapcount(folio) == 1) {
>> One question to the page_mapcount -> folio_mapcount here.
>>
>> For a large folio with 0 entire mapcount, if the first sub-page and any
>> other sub-page are mapped, page_mapcount(head) == 1 is true while
>> folio_mapcount(folio) == 1 is not.
> 
> We had a good discussion about this in today's THP Cabal meeting [1].  I
> didn't quite check everything that I said was true, so let me summarise
> & correct it now ...
> 
>  - This is a heuristic.  We're trying to see whether this folio is
>    mapped by multiple processes (because if it is, it's probably not
>    worth migrating).  If the heuristic is wrong, it probably doesn't
>    matter _too_ much?
Agree.

>  - A proper heuristic for this would be
> 		folio_total_mapcount(folio) == folio_nr_pages(folio)
I am not sure. File folio can be partially mapped. Maybe following check?
  for each sub-pages:
     (folio_entire_mapcount(folio) + sub-pages->_mapcount) <= 1

But it's also expensive to check all sub-pages. Maybe a bit in folio
if filio mapped to only one process is really important?

>    but this would be expensive to calculate as it requires examining
>    512 cachelines for a 2MB page.
>  - For a large folio which is smaller than PMD size, we're guaranteed
>    that folio_mapcount() is 0 today.
My understanding is: for large folio, if any sub-page is mapped,
folio_mapcount() can not be 0.

>  - In the meeting I said that page_mapcount() of the head of a THP
>    page was zero; that's not true; I had forgotten that we added in
>    entire_mapcount to the individual page mapcount.
> 
> so I now think this should be:
> 
> 	page_mapcount(folio_page(folio, 0))
For file large folio, it's possible folio_page(folio, 0) mapped only
once, other sub-pages mapped multiple times.

But I think this maybe the best choice here.

> 
> with an explanation that checking every page is too heavy-weight.
> Maybe it should be its own function:
> 
> static inline int folio_estimated_mapcount(folio)
> {
> 	return page_mapcount(folio_page(folio, 0));
> }
> 
> with a nice comment explaining what's going on.
> 
> [1] https://www.youtube.com/watch?v=A3PoGQQQD3Q is the recording of
> today's meeting.
This is nice. Thanks a lot for sharing.


Regards
Yin, Fengwei