Message-ID: <d595b16d-93ad-8e4e-21e0-bf0e44845507@linux.intel.com>
Date:   Thu, 23 Jul 2020 15:14:02 +0800
From:   Xing Zhengjun <zhengjun.xing@...ux.intel.com>
To:     Giovanni Gherdovich <ggherdovich@...e.cz>,
        kernel test robot <oliver.sang@...el.com>
Cc:     Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Doug Smythies <dsmythies@...us.net>,
        "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Stephen Rothwell <sfr@...b.auug.org.au>, lkp@...ts.01.org
Subject: Re: [LKP] [x86, sched] 1567c3e346: vm-scalability.median -15.8%
 regression



On 7/9/2020 8:43 PM, Giovanni Gherdovich wrote:
> On Tue, 2020-07-07 at 10:58 +0800, Xing Zhengjun wrote:
>>
>> On 6/12/2020 4:11 PM, Xing Zhengjun wrote:
>>> Hi Giovanni,
>>>
>>>      I tested the regression; it still exists in v5.7.  Do you have time
>>> to take a look at this? Thanks.
>>>
>>
>> Ping...
>>
> 
> Hello,
> 
> I haven't sat down to reproduce this yet but I've read the benchmark code and
> configuration, and this regression seems likely to be more of a benchmarking
> artifact than an actual performance bug.
> 
> Likely a benchmarking artifact:
> 
> First off, the test used the "performance" governor from the "intel_pstate"
> cpufreq driver, but the report points at the patch introducing the "frequency
> invariance on x86" feature as the culprit. This is suspicious because "frequency
> invariance on x86" influences frequency selection when the "schedutil" governor
> is in use (not your case). It may also affect scheduler load balancing, but
> here you have $NUM_CPUS processes, so there isn't a lot of room for creativity
> there: each CPU gets a process.
> 
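For background, a rough sketch of how frequency invariance ties into schedutil
(these are simplified approximations of the mainline formulas, not exact kernel
code): frequency invariance scales the PELT utilization signal by the ratio of
the current to the maximum frequency,

     util_inv ~= util_raw * curr_freq / max_freq

and schedutil then requests approximately

     next_freq ~= 1.25 * max_freq * util_inv / max_capacity

Under the "performance" governor the CPU is essentially kept at the maximum
P-state and the second formula is never consulted, which is why the feature is
not expected to change frequency selection in this test.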
> Some notes on this benchmark for my future reference:
> 
> The test in question is "anon-cow-seq" from "vm-scalability", which is based
> on the "usemem" program originally written by Andrew Morton and exercises the
> memory management subsystem. The invocation is:
> 
>      usemem --nproc $NUM_CPUS   \
> 	   --prealloc          \
> 	   --prefault          \
> 	   $SIZE
> 
> What this does is create an anonymous mmap()-ing of $SIZE bytes in the main
> process, fork $NUM_CPUS distinct child processes and have all of them scan the
> mapping sequentially from byte 0 to byte N, writing 0, 1, 2, ..., N on the
> region as they scan it, all together at the same time. So we have the "anon"
> part (the mapping isn't file-backed), the "cow" part (the parent process
> allocates the region, then each child copy-on-writes to it) and the "seq"
> part (memory accesses happen sequentially from low to high address). The test
> measures how quickly this happens; I believe the regression is in the
> median time it takes a process to finish (or the median throughput, but $SIZE
> is fixed so it's equivalent).
> 
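For reference, a minimal C sketch of the access pattern described above: the
parent prefaults an anonymous MAP_PRIVATE region, then each forked child writes
it sequentially, forcing copy-on-write faults. This is illustrative only, not
the actual usemem source; the size and process count are placeholders.

     #define _GNU_SOURCE
     #include <stdio.h>
     #include <string.h>
     #include <sys/mman.h>
     #include <sys/wait.h>
     #include <unistd.h>

     int main(void)
     {
             size_t size = 1UL << 30;   /* stand-in for $SIZE (1 GiB here) */
             int nproc = 4;             /* stand-in for $NUM_CPUS */

             /* "--prealloc": the parent creates the anonymous mapping up front. */
             unsigned char *buf = mmap(NULL, size, PROT_READ | PROT_WRITE,
                                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
             if (buf == MAP_FAILED) {
                     perror("mmap");
                     return 1;
             }

             /* "--prefault": touch every page so the parent owns real pages. */
             memset(buf, 0, size);

             for (int i = 0; i < nproc; i++) {
                     if (fork() == 0) {
                             /* Each child scans sequentially; every write to the
                              * private mapping triggers a copy-on-write fault. */
                             for (size_t off = 0; off < size; off++)
                                     buf[off] = (unsigned char)off;
                             _exit(0);
                     }
             }
             while (wait(NULL) > 0)
                     ;   /* wait for all children to finish */
             return 0;
     }
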
> The $SIZE parameter is selected so that there is enough space for everybody:
> each child plus the parent needs a copy of the mapped region, so that makes
> $NUM_CPUS+1 instances. The formula for $SIZE adds a factor of 2 for good measure:
> 
>      SIZE = $MEM_SIZE / ($NUM_CPUS + 1) / 2
> 
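As a worked example (assuming $MEM_SIZE is the machine's total RAM), on the
144-CPU / 512G testbox mentioned further down this gives roughly:

     SIZE = 512 GiB / (144 + 1) / 2 ~= 1.77 GiB per process

so the 145 instances together touch about 256 GiB, half of the machine's memory.
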
> So we have a benchmark dominated by page allocation and copying, run with the
> "performance" cpufreq governor, and your bisections points to a commit such as
> 1567c3e3467cddeb019a7b53ec632f834b6a9239 ("x86, sched: Add support for
> frequency invariance") which:
> 
> * changes how frequency is selected by a governor you're not using
> * doesn't touch the memory management subsystem or related functions
> 
> I'm not entirely dismissing your finding, just explaining why this analysis
> hasn't been in my top priorities lately (plus, I've just returned from a
> 3-week vacation :). I'm curious too about what causes the test to go red, but
> I'm not overly worried given the above context.
> 
> 
> Thanks,
> Giovanni Gherdovich
> 

This regression only happened on the testbox "lkp-hsw-4ex1". The machine's 
hardware info:
model: Haswell-EX
nr_node: 4
nr_cpu: 144
memory: 512G
brand: Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz

We had reproduced it many times in the past, but we recently upgraded both 
the software and the hardware on that machine, and since then we can no 
longer reproduce the regression. We also tried reverting the upgrade, but 
it still cannot be reproduced. We will continue to run the test case and 
will let you know once the regression reproduces again.


-- 
Zhengjun Xing
