lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20110729013830.GC5404@dastard>
Date:	Fri, 29 Jul 2011 11:38:30 +1000
From:	Dave Chinner <david@...morbit.com>
To:	linux-kernel@...r.kernel.org
Cc:	luto@....EDU, mingo@...e.hu
Subject: [3.0-rc0 Regression]: legacy vsyscall emulation increases user CPU
 time by 20%

Hi folks,

It-s merge window again, which means I'm doing my usual "where did
the XFS performance go" bisects again. The usual workload:

$ time sudo mkfs.xfs -f -d agcount=32 -l size=131072b /dev/vda
meta-data=/dev/vda               isize=256    agcount=32, agsize=134217728 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=4096   blocks=4294967296, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal log           bsize=4096   blocks=131072, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

real    0m4.496s
user    0m0.012s
sys     0m0.196s
$ sudo mount -o nobarrier,logbsize=262144,inode64 /dev/vda /mnt/scratch
$ sudo chmod 777 /mnt/scratch
$ cd /home/dave/src/fs_mark-3.3/
$ /usr/bin/time ./fs_mark -D 10000 -S0 -n 100000 -s 0 -L 63 -d /mnt/scratch/0 -d /mnt/scratch/1 -d /mnt/scratch/2 -d /mnt/scratch/3 -d /mnt/scratch/4 -d /mnt/scratch/5 -d /mnt/scratch/6 -d /mnt/scratch/7

On a 8p 4GB RAM VM and a 16TB filesystem. On 3.0, this completes in:

     0     48000000            0     110068.2          7165149
     0     48800000            0     112275.1          7523326
     0     49600000            0     111725.3          9002077
     0     50400000            0     109074.4          7111665
246.67user 2903.42system 8:16.96elapsed 633%CPU (0avgtext+0avgdata 82544maxresident)k
0inputs+0outputs (1639major+2603202minor)pagefaults 0swaps
$

The completion time over multiple runs is ~8m10s +/-5s, and the user
CPU time is roughly 245s +/-5s

Enter 5cec93c2 ("x86-64: Emulate legacy vsyscalls") and the result
ends up at:

     0     48000000            0     108975.2          9507483
     0     48800000            0     114676.5          8604471
     0     49600000            0      98062.0          8921525
     0     50400000            0     103864.7          8218302
287.35user 2933.90system 8:33.11elapsed 627%CPU (0avgtext+0avgdata 82560maxresident)k
0inputs+0outputs (1664major+2603457minor)pagefaults 0swaps

Noticable slow wall time with more variance - it's at 8m30s +/-10s,
and the user CPU time is at 290s +/-5s. So the benchmark is slower to
complete and consumes 20% more CPU in userspace. The following commit
c971294 x86-64: ("Improve vsyscall emulation CS and RIP handling")
also contributes to the slowdown a bit.

FYI, fs_mark does a lot of gettimeofday() calls - one before and
after every syscall that does filesystem work so it can calculate
the syscall times and the amount of time spent not doing syscalls.
I'm assuming this is the problem based on the commit message.
Issuing hundreds of thousands of getimeofday calls per second spread
across multiple CPUs is not uncommon, especially in benchmark or
performance measuring software. If that is the cause, then these
commits add -significant- overhead to that process.

Assuming this is the problem, can this be fixed without requiring
the whole world having to wait for the current glibc dev tree to
filter down into distro repositories?

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ