lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180226083807.GE8942@yexl-desktop>
Date:   Mon, 26 Feb 2018 16:38:07 +0800
From:   Ye Xiaolong <xiaolong.ye@...el.com>
To:     Jeff Layton <jlayton@...hat.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: Re: [lkp-robot] [iversion]  c0cef30e4f:  aim7.jobs-per-min -18.0%
 regression

On 02/25, Jeff Layton wrote:
>On Sun, 2018-02-25 at 23:05 +0800, kernel test robot wrote:
>> Greeting,
>> 
>> FYI, we noticed a -18.0% regression of aim7.jobs-per-min due to commit:
>> 
>> 
>> commit: c0cef30e4ff0dc025f4a1660b8f0ba43ed58426e ("iversion: make inode_cmp_iversion{+raw} return bool instead of s64")
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>> 
>> in testcase: aim7
>> on test machine: 40 threads Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz with 384G memory
>> with following parameters:
>> 
>> 	disk: 4BRD_12G
>> 	md: RAID0
>> 	fs: xfs
>> 	test: disk_src
>> 	load: 3000
>> 	cpufreq_governor: performance
>> 
>> test-description: AIM7 is a traditional UNIX system level benchmark suite which is used to test and measure the performance of multiuser system.
>> test-url: https://sourceforge.net/projects/aimbench/files/aim-suite7/
>> 
>> 
>
>I'm a bit suspicious of this result.
>
>This patch only changes inode_cmp_iversion{+raw} (since renamed to
>inode_eq_iversion{+raw}), and that neither should ever be called from
>xfs. The patch is fairly trivial too, and I wouldn't expect a big
>performance hit.

I tried to queue 4 more times test for both commit c0cef30e4f and its parent,
the result seems quite stable.

c0cef30e4ff0dc025f4a1660b8f0ba43ed58426e:
 "aim7.jobs-per-min": [
    32964.01,
    32938.68,
    33068.18,
    32886.32,
    32843.72,
    32798.83,
    32898.34,
    32952.55
  ],

3da90b159b146672f830bcd2489dd3a1f4e9e089:
  "aim7.jobs-per-min": [
    40239.65,
    40163.33,
    40353.32,
    39976.9,
    40185.75,
    40411.3,
    40213.58,
    39900.69
  ],

Any other test data you may need?

>
>Is IMA involved here at all? I didn't see any evidence of it, but the
>kernel config did have it enabled.
>

Sorry, not quite familiar with IMA, could you tell more about how to check it?

Thanks,
Xiaolong
>
>> 
>> Details are as below:
>> -------------------------------------------------------------------------------------------------->
>> 
>> 
>> To reproduce:
>> 
>>         git clone https://github.com/intel/lkp-tests.git
>>         cd lkp-tests
>>         bin/lkp install job.yaml  # job file is attached in this email
>>         bin/lkp run     job.yaml
>> 
>> =========================================================================================
>> compiler/cpufreq_governor/disk/fs/kconfig/load/md/rootfs/tbox_group/test/testcase:
>>   gcc-7/performance/4BRD_12G/xfs/x86_64-rhel-7.2/3000/RAID0/debian-x86_64-2016-08-31.cgz/lkp-ivb-ep01/disk_src/aim7
>> 
>> commit: 
>>   3da90b159b (" f2fs-for-4.16-rc1")
>>   c0cef30e4f ("iversion: make inode_cmp_iversion{+raw} return bool instead of s64")
>> 
>> 3da90b159b146672 c0cef30e4ff0dc025f4a1660b8 
>> ---------------- -------------------------- 
>>          %stddev     %change         %stddev
>>              \          |                \  
>>      40183           -18.0%      32964        aim7.jobs-per-min
>>     448.60           +21.9%     546.68        aim7.time.elapsed_time
>>     448.60           +21.9%     546.68        aim7.time.elapsed_time.max
>>       5615 ±  5%     +33.4%       7489 ±  4%  aim7.time.involuntary_context_switches
>>       3086           +14.0%       3518        aim7.time.system_time
>>   19439782            -5.6%   18359474        aim7.time.voluntary_context_switches
>>     199333           +14.3%     227794 ±  2%  interrupts.CAL:Function_call_interrupts
>>       0.59            -0.1        0.50        mpstat.cpu.usr%
>>    2839401           +16.0%    3293688        softirqs.SCHED
>>    7600068           +15.1%    8747820        softirqs.TIMER
>>     118.00 ± 43%     +98.7%     234.50 ± 15%  vmstat.io.bo
>>      87840           -22.4%      68154        vmstat.system.cs
>>     552798 ±  6%     +15.8%     640107 ±  4%  numa-numastat.node0.local_node
>>     557345 ±  6%     +15.7%     644666 ±  4%  numa-numastat.node0.numa_hit
>>     528341 ±  7%     +21.7%     642933 ±  4%  numa-numastat.node1.local_node
>>     531604 ±  7%     +21.6%     646209 ±  4%  numa-numastat.node1.numa_hit
>>  2.147e+09           -12.4%   1.88e+09        cpuidle.C1.time
>>   13702041           -14.7%   11683737        cpuidle.C1.usage
>>  2.082e+08 ±  4%     +28.1%  2.667e+08 ±  5%  cpuidle.C1E.time
>>  4.719e+08 ±  2%     +23.1%  5.807e+08 ±  4%  cpuidle.C3.time
>>  1.141e+10           +31.0%  1.496e+10        cpuidle.C6.time
>>   15672622           +27.8%   20031028        cpuidle.C6.usage
>>   13520572 ±  3%     +29.5%   17514398 ±  9%  cpuidle.POLL.time
>>     278.25 ±  5%     -46.0%     150.25 ± 73%  numa-vmstat.node0.nr_dirtied
>>       3200 ± 14%     -20.6%       2542 ± 19%  numa-vmstat.node0.nr_mapped
>>     277.75 ±  5%     -46.2%     149.50 ± 73%  numa-vmstat.node0.nr_written
>>      28.50 ± 52%    +448.2%     156.25 ± 70%  numa-vmstat.node1.nr_dirtied
>>       2577 ± 19%     +26.3%       3255 ± 15%  numa-vmstat.node1.nr_mapped
>>     634338 ±  4%      +7.8%     683959 ±  4%  numa-vmstat.node1.numa_hit
>>     457411 ±  6%     +10.8%     506800 ±  5%  numa-vmstat.node1.numa_local
>>       3734 ±  8%     -11.5%       3306 ±  6%  proc-vmstat.numa_hint_faults_local
>>    1114538           +18.3%    1318978        proc-vmstat.numa_hit
>>    1106722           +18.5%    1311136        proc-vmstat.numa_local
>>      22100            +7.5%      23753 ±  4%  proc-vmstat.numa_pages_migrated
>>    1174556           +18.0%    1386359        proc-vmstat.pgalloc_normal
>>    1241445           +18.1%    1466086        proc-vmstat.pgfault
>>    1138310           +19.3%    1358132        proc-vmstat.pgfree
>>      22100            +7.5%      23753 ±  4%  proc-vmstat.pgmigrate_success
>>      53332 ± 43%    +143.0%     129617 ± 14%  proc-vmstat.pgpgout
>>       1.42 ±  2%      +1.7        3.07        perf-stat.branch-miss-rate%
>>  1.064e+10          +123.3%  2.375e+10        perf-stat.branch-misses
>>      10.79            +0.6       11.43        perf-stat.cache-miss-rate%
>>  5.583e+09            +5.9%  5.915e+09        perf-stat.cache-misses
>>   39652092            -5.0%   37662545        perf-stat.context-switches
>>       1.29           +11.7%       1.44        perf-stat.cpi
>>  4.637e+12           +12.8%   5.23e+12        perf-stat.cpu-cycles
>>  8.653e+11            +9.8%  9.498e+11 ±  2%  perf-stat.dTLB-loads
>>  3.654e+11           +12.4%  4.109e+11        perf-stat.dTLB-stores
>>       0.78           -10.5%       0.70        perf-stat.ipc
>>    1214932           +17.9%    1432266        perf-stat.minor-faults
>>  1.334e+09            -1.8%   1.31e+09        perf-stat.node-store-misses
>>  1.651e+09            -1.8%   1.62e+09        perf-stat.node-stores
>>    1214954           +17.9%    1432313        perf-stat.page-faults
>>     256.75          -100.0%       0.00        turbostat.Avg_MHz
>>      21.39           -21.4        0.00        turbostat.Busy%
>>       1200          -100.0%       0.00        turbostat.Bzy_MHz
>>   13695007          -100.0%       0.00        turbostat.C1
>>      11.92           -11.9        0.00        turbostat.C1%
>>    2116683 ±  2%    -100.0%       0.00        turbostat.C1E
>>       1.16 ±  4%      -1.2        0.00        turbostat.C1E%
>>    3112269          -100.0%       0.00        turbostat.C3
>>       2.62 ±  2%      -2.6        0.00        turbostat.C3%
>>   15671277          -100.0%       0.00        turbostat.C6
>>      63.38           -63.4        0.00        turbostat.C6%
>>      49.46          -100.0%       0.00        turbostat.CPU%c1
>>       1.42 ±  2%    -100.0%       0.00        turbostat.CPU%c3
>>      27.73          -100.0%       0.00        turbostat.CPU%c6
>>      31.41          -100.0%       0.00        turbostat.CorWatt
>>      63.25          -100.0%       0.00        turbostat.CoreTmp
>>   18919351          -100.0%       0.00        turbostat.IRQ
>>       1.21 ± 18%    -100.0%       0.00        turbostat.Pkg%pc2
>>       0.67 ± 31%    -100.0%       0.00        turbostat.Pkg%pc6
>>      63.25          -100.0%       0.00        turbostat.PkgTmp
>>      57.63          -100.0%       0.00        turbostat.PkgWatt
>>      30.73          -100.0%       0.00        turbostat.RAMWatt
>>      36030          -100.0%       0.00        turbostat.SMI
>>       3000          -100.0%       0.00        turbostat.TSC_MHz
>> 
>>                                                                                 
>>                                   aim7.jobs-per-min                             
>>                                                                                 
>>   41000 +-+-----------------------------------------------------------------+   
>>         |  ..+....+....  ..+....+....+....   ..+....+....+...+....          |   
>>   40000 +-+            +.                 +..                     +         |   
>>   39000 +-+                                                                 |   
>>         |                                                                   |   
>>   38000 +-+                                                                 |   
>>   37000 +-+                                                                 |   
>>         |                                                                   |   
>>   36000 +-+                                                                 |   
>>   35000 +-+                                                                 |   
>>         |                                                                   |   
>>   34000 +-+                                                                 |   
>>   33000 +-+                                         O                  O    |   
>>         O    O    O    O   O    O    O    O    O         O   O    O         O   
>>   32000 +-+-----------------------------------------------------------------+   
>>                                                                                 
>>                                                                                                                                                                                                                                         
>>                                                                                 
>> [*] bisect-good sample
>> [O] bisect-bad  sample
>> 
>> 
>> 
>> Disclaimer:
>> Results have been estimated based on internal Intel analysis and are provided
>> for informational purposes only. Any difference in system hardware or software
>> design or configuration may affect actual performance.
>> 
>> 
>> Thanks,
>> Xiaolong
>-- 
>Jeff Layton <jlayton@...hat.com>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ