Message-ID: <90477952-fde2-41d7-8ff4-2102c45e341d@redhat.com>
Date: Thu, 1 Aug 2024 08:49:27 +0200
From: David Hildenbrand <david@...hat.com>
To: "Yin, Fengwei" <fengwei.yin@...el.com>,
 kernel test robot <oliver.sang@...el.com>, Peter Xu <peterx@...hat.com>
Cc: oe-lkp@...ts.linux.dev, lkp@...el.com, linux-kernel@...r.kernel.org,
 Andrew Morton <akpm@...ux-foundation.org>,
 Huacai Chen <chenhuacai@...nel.org>, Jason Gunthorpe <jgg@...dia.com>,
 Matthew Wilcox <willy@...radead.org>, Nathan Chancellor <nathan@...nel.org>,
 Ryan Roberts <ryan.roberts@....com>, WANG Xuerui <kernel@...0n.name>,
 linux-mm@...ck.org, ying.huang@...el.com, feng.tang@...el.com
Subject: Re: [linus:master] [mm] c0bff412e6: stress-ng.clone.ops_per_sec -2.9%
 regression

On 01.08.24 08:39, Yin, Fengwei wrote:
> Hi David,
> 
> On 7/30/2024 4:11 PM, David Hildenbrand wrote:
>> On 30.07.24 07:00, kernel test robot wrote:
>>>
>>>
>>> Hello,
>>>
>>> kernel test robot noticed a -2.9% regression of
>>> stress-ng.clone.ops_per_sec on:
>>
>> Is that test even using hugetlb? Anyhow, this pretty much sounds like
>> noise and can be ignored.
>>
> It's not about hugetlb. It looks related to this change:

Ah, that makes sense!

> 
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 888353c209c03..7577fe7debafc 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -1095,7 +1095,12 @@ PAGEFLAG(Isolated, isolated, PF_ANY);
>    static __always_inline int PageAnonExclusive(const struct page *page)
>    {
>           VM_BUG_ON_PGFLAGS(!PageAnon(page), page);
> -       VM_BUG_ON_PGFLAGS(PageHuge(page) && !PageHead(page), page);
> +       /*
> +        * HugeTLB stores this information on the head page; THP keeps it per
> +        * page
> +        */
> +       if (PageHuge(page))
> +               page = compound_head(page);
>           return test_bit(PG_anon_exclusive, &PF_ANY(page, 1)->flags);
> 
> 
> The PageAnonExclusive() function was changed, and the profiling data
> shows it:
> 
>         0.00            +3.9        3.90 perf-profile.calltrace.cycles-pp.folio_try_dup_anon_rmap_ptes.copy_present_ptes.copy_pte_range.copy_p4d_range.copy_page_range
> 
> According to
> https://download.01.org/0day-ci/archive/20240730/202407301049.5051dc19-oliver.sang@intel.com/config-6.9.0-rc4-00197-gc0bff412e67b:
> 	# CONFIG_DEBUG_VM is not set
> So maybe this code change could make a difference?

Yes indeed. fork() can be extremely sensitive to each added instruction.

I even pointed out to Peter why I didn't add the PageHuge check in there 
originally [1].

"Well, and I didn't want to have runtime-hugetlb checks in
PageAnonExclusive code called on certainly-not-hugetlb code paths."


We now have to do a page_folio(page) and then test for hugetlb.

	return folio_test_hugetlb(page_folio(page));

Nowadays, folio_test_hugetlb() is faster than it was at the time of
c0bff412e6, so maybe at least part of the overhead is gone.
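
To illustrate where the extra work comes from, here is a minimal userspace
sketch, not kernel code. Everything in it (struct model_page, the MODEL_*
flag bits, the model_*() helpers) is a hypothetical simplification made up
for this example; the real struct page, compound_head() and PageHuge() are
considerably more involved. It only shows that the new variant does a head
lookup and a hugetlb test before the flag test that the old variant did
directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits, chosen for this model only. */
enum {
	MODEL_PG_HUGETLB = 0,
	MODEL_PG_ANON_EXCLUSIVE = 1,
};

/* Hypothetical stand-in for struct page. */
struct model_page {
	unsigned long flags;
	struct model_page *head;	/* NULL unless this is a tail page */
};

/* Return the head page for a tail page, or the page itself otherwise. */
static struct model_page *model_compound_head(struct model_page *page)
{
	return page->head ? page->head : page;
}

/* Models the runtime hugetlb check: head lookup plus a flag test. */
static bool model_page_huge(struct model_page *page)
{
	return (model_compound_head(page)->flags &
		(1UL << MODEL_PG_HUGETLB)) != 0;
}

/* Before the change: a single flag test on the page that was passed in. */
static bool anon_exclusive_before(struct model_page *page)
{
	return (page->flags & (1UL << MODEL_PG_ANON_EXCLUSIVE)) != 0;
}

/*
 * After the change: every caller pays for the hugetlb check first, even
 * on paths that can never see a hugetlb page.
 */
static bool anon_exclusive_after(struct model_page *page)
{
	if (model_page_huge(page))
		page = model_compound_head(page);
	return (page->flags & (1UL << MODEL_PG_ANON_EXCLUSIVE)) != 0;
}
```

In this model, both variants agree for a non-hugetlb page, while only
anon_exclusive_after() reports the head page's flag when handed a hugetlb
tail page. With CONFIG_DEBUG_VM unset, the old VM_BUG_ON_PGFLAGS() line
compiled to nothing, so the added check is pure extra work on the fork()
path.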


[1] 
https://lore.kernel.org/r/all/8b0b24bb-3c38-4f27-a2c9-f7d7adc4a115@redhat.com/


-- 
Cheers,

David / dhildenb

