[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <81fb0e7d-1879-9267-83da-4671fec50920@linux.intel.com>
Date: Mon, 8 Jul 2019 16:32:31 +0800
From: Xing Zhengjun <zhengjun.xing@...ux.intel.com>
To: Trond Myklebust <trondmy@...merspace.com>,
"rong.a.chen@...el.com" <rong.a.chen@...el.com>
Cc: "lkp@...org" <lkp@...org>,
"torvalds@...ux-foundation.org" <torvalds@...ux-foundation.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [LKP] [SUNRPC] 0472e47660: fsmark.app_overhead 16.0% regression
Hi Trond,
I retest, it still can be reproduced. I test with the following
parameters, only change "nr_threads", the test results are as the
following. From the test results, more threads in the test, more
regression will happen. Could you help to check? Thanks.
In testcase: fsmark
on test machine: 40 threads Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
with 384G memory
with following parameters:
iterations: 20x
nr_threads: 1t
disk: 1BRD_48G
fs: xfs
fs2: nfsv4
filesize: 4M
test_size: 80G
sync_method: fsyncBeforeClose
cpufreq_governor: performance
test-description: The fsmark is a file system benchmark to test
synchronous write workloads, for example, mail servers workload.
test-url: https://sourceforge.net/projects/fsmark/
commit:
e791f8e938 ("SUNRPC: Convert xs_send_kvec() to use iov_iter_kvec()")
0472e47660 ("SUNRPC: Convert socket page send code to use iov_iter()")
e791f8e9380d945e 0472e476604998c127f3c80d291
---------------- ---------------------------
%stddev %change %stddev
\ | \
59.74 -0.7% 59.32 fsmark.files_per_sec
(nr_threads= 1)
114.06 -8.1% 104.83 fsmark.files_per_sec
(nr_threads= 2)
184.53 -13.1% 160.29 fsmark.files_per_sec
(nr_threads= 4)
257.05 -15.5% 217.22 fsmark.files_per_sec
(nr_threads= 8)
306.08 -15.5% 258.68 fsmark.files_per_sec
(nr_threads=16)
498.34 -22.7% 385.33 fsmark.files_per_sec
(nr_threads=32)
527.29 -22.6% 407.96 fsmark.files_per_sec
(nr_threads=64)
On 5/31/2019 11:27 AM, Xing Zhengjun wrote:
>
>
> On 5/31/2019 3:10 AM, Trond Myklebust wrote:
>> On Thu, 2019-05-30 at 15:20 +0800, Xing Zhengjun wrote:
>>>
>>> On 5/30/2019 10:00 AM, Trond Myklebust wrote:
>>>> Hi Xing,
>>>>
>>>> On Thu, 2019-05-30 at 09:35 +0800, Xing Zhengjun wrote:
>>>>> Hi Trond,
>>>>>
>>>>> On 5/20/2019 1:54 PM, kernel test robot wrote:
>>>>>> Greeting,
>>>>>>
>>>>>> FYI, we noticed a 16.0% improvement of fsmark.app_overhead due
>>>>>> to
>>>>>> commit:
>>>>>>
>>>>>>
>>>>>> commit: 0472e476604998c127f3c80d291113e77c5676ac ("SUNRPC:
>>>>>> Convert
>>>>>> socket page send code to use iov_iter()")
>>>>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git
>>>>>> master
>>>>>>
>>>>>> in testcase: fsmark
>>>>>> on test machine: 40 threads Intel(R) Xeon(R) CPU E5-2690 v2 @
>>>>>> 3.00GHz with 384G memory
>>>>>> with following parameters:
>>>>>>
>>>>>> iterations: 1x
>>>>>> nr_threads: 64t
>>>>>> disk: 1BRD_48G
>>>>>> fs: xfs
>>>>>> fs2: nfsv4
>>>>>> filesize: 4M
>>>>>> test_size: 40G
>>>>>> sync_method: fsyncBeforeClose
>>>>>> cpufreq_governor: performance
>>>>>>
>>>>>> test-description: The fsmark is a file system benchmark to test
>>>>>> synchronous write workloads, for example, mail servers
>>>>>> workload.
>>>>>> test-url: https://sourceforge.net/projects/fsmark/
>>>>>>
>>>>>>
>>>>>>
>>>>>> Details are as below:
>>>>>> -------------------------------------------------------------
>>>>>> ----
>>>>>> --------------------------------->
>>>>>>
>>>>>>
>>>>>> To reproduce:
>>>>>>
>>>>>> git clone https://github.com/intel/lkp-tests.git
>>>>>> cd lkp-tests
>>>>>> bin/lkp install job.yaml # job file is attached in
>>>>>> this
>>>>>> email
>>>>>> bin/lkp run job.yaml
>>>>>>
>>>>>> ===============================================================
>>>>>> ====
>>>>>> ======================
>>>>>> compiler/cpufreq_governor/disk/filesize/fs2/fs/iterations/kconf
>>>>>> ig/n
>>>>>> r_threads/rootfs/sync_method/tbox_group/test_size/testcase:
>>>>>> gcc-7/performance/1BRD_48G/4M/nfsv4/xfs/1x/x86_64-rhel-
>>>>>> 7.6/64t/debian-x86_64-2018-04-03.cgz/fsyncBeforeClose/lkp-ivb-
>>>>>> ep01/40G/fsmark
>>>>>>
>>>>>> commit:
>>>>>> e791f8e938 ("SUNRPC: Convert xs_send_kvec() to use
>>>>>> iov_iter_kvec()")
>>>>>> 0472e47660 ("SUNRPC: Convert socket page send code to use
>>>>>> iov_iter()")
>>>>>>
>>>>>> e791f8e9380d945e 0472e476604998c127f3c80d291
>>>>>> ---------------- ---------------------------
>>>>>> fail:runs %reproduction fail:runs
>>>>>> | | |
>>>>>> :4 50% 2:4 dmesg.WARNING:a
>>>>>> t#for
>>>>>> _ip_interrupt_entry/0x
>>>>>> %stddev %change %stddev
>>>>>> \ | \
>>>>>> 15118573
>>>>>> ± 2% +16.0% 17538083 fsmark.app_overhead
>>>>>> 510.93 -
>>>>>> 22.7% 395.12 fsmark.files_per_sec
>>>>>> 24.90 +22.8% 30.57 fsmark.time.ela
>>>>>> psed_
>>>>>> time
>>>>>> 24.90 +22.8% 30.57 fsmark.time.ela
>>>>>> psed_
>>>>>> time.max
>>>>>> 288.00 ± 2% -
>>>>>> 27.8% 208.00 fsmark.time.percent_of_cpu_this_job_got
>>>>>> 70.03 ± 2% -
>>>>>> 11.3% 62.14 fsmark.time.system_time
>>>>>>
>>>>>
>>>>> Do you have time to take a look at this regression?
>>>>
>>>> From your stats, it looks to me as if the problem is increased
>>>> NUMA
>>>> overhead. Pretty much everything else appears to be the same or
>>>> actually performing better than previously. Am I interpreting that
>>>> correctly?
>>> The real regression is the throughput(fsmark.files_per_sec) is
>>> decreased
>>> by 22.7%.
>>
>> Understood, but I'm trying to make sense of why. I'm not able to
>> reproduce this, so I have to rely on your performance stats to
>> understand where the 22.7% regression is coming from. As far as I can
>> see, the only numbers in the stats you published that are showing a
>> performance regression (other than the fsmark number itself), are the
>> NUMA numbers. Is that a correct interpretation?
>>
> We re-test the case yesterday, the test result almost is the same.
> we will do more test and also check the test case itself, if you need
> more information, please let me know, thanks.
>
>>>> If my interpretation above is correct, then I'm not seeing where
>>>> this
>>>> patch would be introducing new NUMA regressions. It is just
>>>> converting
>>>> from using one method of doing socket I/O to another. Could it
>>>> perhaps
>>>> be a memory artefact due to your running the NFS client and server
>>>> on
>>>> the same machine?
>>>>
>>>> Apologies for pushing back a little, but I just don't have the
>>>> hardware available to test NUMA configurations, so I'm relying on
>>>> external testing for the above kind of scenario.
>>>>
>>> Thanks for looking at this. If you need more information, please let
>>> me
>>> know.
>>>> Thanks
>>>> Trond
>>>>
>
--
Zhengjun Xing
Powered by blists - more mailing lists