Message-ID: <50a55a42-6d79-4e3c-992c-26a96dc12d81@redhat.com>
Date: Wed, 16 Apr 2025 11:16:15 +0200
From: David Hildenbrand <david@...hat.com>
To: kernel test robot <oliver.sang@...el.com>
Cc: oe-lkp@...ts.linux.dev, lkp@...el.com, linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Andy Lutomirski <luto@...nel.org>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, Ingo Molnar <mingo@...hat.com>,
Jann Horn <jannh@...gle.com>, Johannes Weiner <hannes@...xchg.org>,
Jonathan Corbet <corbet@....net>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Lance Yang <ioworker0@...il.com>, Liam Howlett <liam.howlett@...cle.com>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
Matthew Wilcox <willy@...radead.org>, Michal Koutný <mkoutny@...e.com>,
Muchun Song <muchun.song@...ux.dev>, Tejun Heo <tj@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>, Vlastimil Babka <vbabka@...e.cz>,
Zefan Li <lizefan.x@...edance.com>, linux-mm@...ck.org
Subject: Re: [linus:master] [mm/rmap] 6af8cb80d3: vm-scalability.throughput
7.8% regression
On 16.04.25 10:07, David Hildenbrand wrote:
> On 16.04.25 09:01, kernel test robot wrote:
>>
>>
>> Hello,
>>
>> kernel test robot noticed a 7.8% regression of vm-scalability.throughput on:
>>
>>
>> commit: 6af8cb80d3a9a6bbd521d8a7c949b4eafb7dba5d ("mm/rmap: basic MM owner tracking for large folios (!hugetlb)")
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>>
>>
>> testcase: vm-scalability
>> config: x86_64-rhel-9.4
>> compiler: gcc-12
>> test machine: 256 threads 2 sockets GENUINE INTEL(R) XEON(R) (Sierra Forest) with 128G memory
>> parameters:
>>
>> runtime: 300s
>> size: 8T
>> test: anon-cow-seq
>> cpufreq_governor: performance
>>
>
> This should be the scenario with THP enabled. At first, I thought the
> problem would be contention on the per-folio spinlock, but what makes me
> scratch my head is the following:
>
> 13401 -16.5% 11190 proc-vmstat.thp_fault_alloc
> ... 3430623 -16.5% 2864565 proc-vmstat.thp_split_pmd
>
>
> If we allocate fewer THPs, benchmark performance will obviously be worse.
>
> We allocated 2211 fewer THPs and had 566058 fewer THP PMD->PTE remappings.
>
> 566058 / 2211 ≈ 256, which matches the number of threads, i.e., the number
> of child processes vm-scalability fork'ed.
>
> So it was in fact the benchmark that was effectively using 16.5% fewer THPs.
>
> I don't see how this patch would affect the allocation of THPs in any
> way (and I don't think it does).
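Spelling out the numbers quoted above (a quick back-of-the-envelope check in
Python; the variable names are just mine, the values are taken from the
report):

  thp_fault_alloc_delta = 13401 - 11190      # 2211 fewer THPs allocated
  thp_split_pmd_delta = 3430623 - 2864565    # 566058 fewer PMD->PTE remappings
  print(thp_split_pmd_delta / thp_fault_alloc_delta)  # ~256.02, roughly one
                                                       # PMD split per forked child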
Thinking about this some more: assuming both runs perform the same number of
test executions, we would expect the number of allocated THPs not to change
(unless we really have fragmentation that results in fewer THPs getting
allocated).
Assuming we run into the 300s timeout and abort the test early, we could end
up with a difference in executions and, therefore, in THP allocations.
I recall that we usually try to have the same number of benchmark executions
and not run into the timeout (otherwise some of these stats, like THP
allocations, are completely unreliable).
Maybe
7.968e+09 -16.5% 6.652e+09 vm-scalability.workload
indicates that we ended up with fewer executions? At least the
"repro-script" seems to indicate that we always execute a fixed number
of executions, but maybe the repro-script is aborted by the benchmark
framework.
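For what it's worth, the relative drop in vm-scalability.workload matches the
relative drop in THP allocations (quick check in the same spirit as above,
numbers from the report):

  print(6.652e9 / 7.968e9)   # ~0.835, i.e. -16.5%
  print(11190 / 13401)       # ~0.835, i.e. -16.5%

which would be consistent with a proportional reduction in executions rather
than any change on the THP allocation side.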
--
Cheers,
David / dhildenb