Message-ID: <90477952-fde2-41d7-8ff4-2102c45e341d@redhat.com>
Date: Thu, 1 Aug 2024 08:49:27 +0200
From: David Hildenbrand <david@...hat.com>
To: "Yin, Fengwei" <fengwei.yin@...el.com>,
kernel test robot <oliver.sang@...el.com>, Peter Xu <peterx@...hat.com>
Cc: oe-lkp@...ts.linux.dev, lkp@...el.com, linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Huacai Chen <chenhuacai@...nel.org>, Jason Gunthorpe <jgg@...dia.com>,
Matthew Wilcox <willy@...radead.org>, Nathan Chancellor <nathan@...nel.org>,
Ryan Roberts <ryan.roberts@....com>, WANG Xuerui <kernel@...0n.name>,
linux-mm@...ck.org, ying.huang@...el.com, feng.tang@...el.com
Subject: Re: [linus:master] [mm] c0bff412e6: stress-ng.clone.ops_per_sec -2.9% regression
On 01.08.24 08:39, Yin, Fengwei wrote:
> Hi David,
>
> On 7/30/2024 4:11 PM, David Hildenbrand wrote:
>> On 30.07.24 07:00, kernel test robot wrote:
>>>
>>>
>>> Hello,
>>>
>>> kernel test robot noticed a -2.9% regression of
>>> stress-ng.clone.ops_per_sec on:
>>
>> Is that test even using hugetlb? Anyhow, this pretty much sounds like
>> noise and can be ignored.
>>
> It's not about hugetlb. It looks related to this change:
Ah, that makes sense!
>
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 888353c209c03..7577fe7debafc 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -1095,7 +1095,12 @@ PAGEFLAG(Isolated, isolated, PF_ANY);
>  static __always_inline int PageAnonExclusive(const struct page *page)
>  {
>          VM_BUG_ON_PGFLAGS(!PageAnon(page), page);
> -        VM_BUG_ON_PGFLAGS(PageHuge(page) && !PageHead(page), page);
> +        /*
> +         * HugeTLB stores this information on the head page; THP keeps it per
> +         * page
> +         */
> +        if (PageHuge(page))
> +                page = compound_head(page);
>          return test_bit(PG_anon_exclusive, &PF_ANY(page, 1)->flags);
>
>
> The PageAnonExclusive() function is changed. And the profiling data
> showed it:
>
> 0.00 +3.9 3.90
> perf-profile.calltrace.cycles-pp.folio_try_dup_anon_rmap_ptes.copy_present_ptes.copy_pte_range.copy_p4d_range.copy_page_range
>
> According to
> https://download.01.org/0day-ci/archive/20240730/202407301049.5051dc19-oliver.sang@intel.com/config-6.9.0-rc4-00197-gc0bff412e67b:
> # CONFIG_DEBUG_VM is not set
> So maybe this code change could make the difference?

Yes indeed. fork() can be extremely sensitive to each added instruction.

I even pointed out to Peter why I didn't add the PageHuge check in there
originally [1]:

"Well, and I didn't want to have runtime-hugetlb checks in
PageAnonExclusive code called on certainly-not-hugetlb code paths."

We now have to do a page_folio(page) and then test for hugetlb:

	return folio_test_hugetlb(page_folio(page));

Nowadays, folio_test_hugetlb() is faster than it was at the time of
c0bff412e6, so maybe at least part of the overhead is gone.
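
To make the cost difference concrete, here is a rough userspace sketch of the
two variants. It is not kernel code: every mock_* / MOCK_* name is hypothetical
and only models the shape of the real helpers. The assumption is that, with the
debug option off (MOCK_DEBUG_VM standing in for CONFIG_DEBUG_VM), the old
assertion compiles away entirely, while the new version always pays for the
head-page lookup and the hugetlb test, even on pages that can never be hugetlb.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace mock, NOT kernel code: all mock_/MOCK_ names are hypothetical. */
enum {
	MOCK_PG_ANON_EXCLUSIVE = 1 << 0,
	MOCK_PG_HUGETLB        = 1 << 1,	/* only set on the head page */
};

struct mock_page {
	unsigned long flags;
	struct mock_page *head;		/* NULL for a head or order-0 page */
};

static inline struct mock_page *mock_compound_head(struct mock_page *page)
{
	return page->head ? page->head : page;
}

/* Models a hugetlb test that first has to find the head page. */
static inline bool mock_page_huge(struct mock_page *page)
{
	return (mock_compound_head(page)->flags & MOCK_PG_HUGETLB) != 0;
}

/* Stand-in for a debug-only assertion: no code is emitted when disabled. */
#ifdef MOCK_DEBUG_VM
#define MOCK_VM_BUG_ON(cond)	assert(!(cond))
#else
#define MOCK_VM_BUG_ON(cond)	((void)0)
#endif

/* Old shape: the hugetlb check exists only in debug builds. */
static inline bool mock_anon_exclusive_old(struct mock_page *page)
{
	MOCK_VM_BUG_ON(mock_page_huge(page) && page->head != NULL);
	return (page->flags & MOCK_PG_ANON_EXCLUSIVE) != 0;
}

/* New shape: every caller pays for the runtime hugetlb check. */
static inline bool mock_anon_exclusive_new(struct mock_page *page)
{
	if (mock_page_huge(page))
		page = mock_compound_head(page);
	return (page->flags & MOCK_PG_ANON_EXCLUSIVE) != 0;
}
```

In this model the two functions agree for head and order-0 pages; they only
differ for a hugetlb tail page, where the new variant redirects to the head.
The extra lookup and branch sit on every call, which is what can matter on a
hot path like the fork() page table copy.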

[1] https://lore.kernel.org/r/all/8b0b24bb-3c38-4f27-a2c9-f7d7adc4a115@redhat.com/

--
Cheers,
David / dhildenb