[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <0a3f09f3-6712-a66f-8a9a-fb31b9d1a564@intel.com>
Date: Tue, 19 Jan 2021 11:01:32 +0800
From: "Xing, Zhengjun" <zhengjun.xing@...el.com>
To: Ming Lei <ming.lei@...hat.com>,
kernel test robot <oliver.sang@...el.com>
Cc: Jens Axboe <axboe@...nel.dk>,
Veronika Kabatova <vkabatov@...hat.com>,
Christoph Hellwig <hch@....de>, Tejun Heo <tj@...nel.org>,
Sagi Grimberg <sagi@...mberg.me>,
Bart Van Assche <bvanassche@....org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
lkp@...el.com
Subject: Re: [LKP] Re: [percpu_ref] 2b0d3d3e4f: reaim.jobs_per_min -18.4%
regression
On 1/11/2021 5:58 PM, Ming Lei wrote:
> On Sun, Jan 10, 2021 at 10:32:47PM +0800, kernel test robot wrote:
>> Greeting,
>>
>> FYI, we noticed a -18.4% regression of reaim.jobs_per_min due to commit:
>>
>>
>> commit: 2b0d3d3e4fcfb19d10f9a82910b8f0f05c56ee3e ("percpu_ref: reduce memory footprint of percpu_ref in fast path")
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>>
>>
>> in testcase: reaim
>> on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
>> with following parameters:
>>
>> runtime: 300s
>> nr_task: 100%
>> test: short
>> cpufreq_governor: performance
>> ucode: 0x5002f01
>>
>> test-description: REAIM is an updated and improved version of AIM 7 benchmark.
>> test-url: https://sourceforge.net/projects/re-aim-7/
>>
>> In addition to that, the commit also has significant impact on the following tests:
>>
>> +------------------+---------------------------------------------------------------------------+
>> | testcase: change | vm-scalability: vm-scalability.throughput -2.8% regression |
>> | test machine | 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory |
>> | test parameters | cpufreq_governor=performance |
>> | | runtime=300s |
>> | | test=lru-file-mmap-read-rand |
>> | | ucode=0x5003003 |
>> +------------------+---------------------------------------------------------------------------+
>> | testcase: change | will-it-scale: will-it-scale.per_process_ops 14.5% improvement |
>> | test machine | 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 512G memory |
>> | test parameters | cpufreq_governor=performance |
>> | | mode=process |
>> | | nr_task=50% |
>> | | test=page_fault2 |
>> | | ucode=0x16 |
>> +------------------+---------------------------------------------------------------------------+
>> | testcase: change | will-it-scale: will-it-scale.per_process_ops -13.0% regression |
>> | test machine | 104 threads Skylake with 192G memory |
>> | test parameters | cpufreq_governor=performance |
>> | | mode=process |
>> | | nr_task=50% |
>> | | test=malloc1 |
>> | | ucode=0x2006906 |
>> +------------------+---------------------------------------------------------------------------+
>> | testcase: change | vm-scalability: vm-scalability.throughput -2.3% regression |
>> | test machine | 96 threads Intel(R) Xeon(R) CPU @ 2.30GHz with 128G memory |
>> | test parameters | cpufreq_governor=performance |
>> | | runtime=300s |
>> | | test=lru-file-mmap-read-rand |
>> | | ucode=0x5002f01 |
>> +------------------+---------------------------------------------------------------------------+
>> | testcase: change | fio-basic: fio.read_iops -4.8% regression |
>> | test machine | 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory |
>> | test parameters | bs=4k |
>> | | cpufreq_governor=performance |
>> | | disk=2pmem |
>> | | fs=xfs |
>> | | ioengine=libaio |
>> | | nr_task=50% |
>> | | runtime=200s |
>> | | rw=randread |
>> | | test_size=200G |
>> | | time_based=tb |
>> | | ucode=0x5002f01 |
>> +------------------+---------------------------------------------------------------------------+
>> | testcase: change | stress-ng: stress-ng.stackmmap.ops_per_sec -45.4% regression |
>> | test machine | 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory |
>> | test parameters | class=memory |
>> | | cpufreq_governor=performance |
>> | | disk=1HDD |
>> | | nr_threads=100% |
>> | | testtime=10s |
>> | | ucode=0x5002f01 |
>> +------------------+---------------------------------------------------------------------------+
> Just run a quick test of the last two on 2b0d3d3e4fcf ("percpu_ref: reduce memory footprint of
> percpu_ref in fast path) and cf785af19319 ("block: warn if !__GFP_DIRECT_RECLAIM in bio_crypt_set_ctx()").
>
> Not see difference in the two kernel(fio on null_blk with 224 hw queues,
> and 'stress-ng --stackmmap-ops') on one 224 cores, dual sockets system.
>
> BTW this patch itself doesn't touch fast path code, so it is supposed to
> not affect performance.
>
> Can you double check if the test itself is good?
I re-test the "fio-basic: fio.read_iops -4.8% regression" for more than
5 times, the average regression is -2.3%.
For "stress_ng", normally, it tests a lot of cases one by one. Command
'stress-ng --stackmmap-ops' only test "stackmmap" case.
I also tried only test "stackmmap" case, the regression is -7.8%.
But for here, it mainly reports "reaim.jobs_per_min -18.4% regression",
I re-test "reaim" case, the result is almost the same.
>
> Note: cf785af19319 is 2b0d3d3e4fcf^
>
>
>
> Thanks,
> Ming
> _______________________________________________
> LKP mailing list -- lkp@...ts.01.org
> To unsubscribe send an email to lkp-leave@...ts.01.org
--
Zhengjun Xing
Powered by blists - more mailing lists