Message-Id: <1210131712.3453.43.camel@ymzhang>
Date:	Wed, 07 May 2008 11:41:52 +0800
From:	"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Matthew Wilcox <matthew@....cx>,
	LKML <linux-kernel@...r.kernel.org>,
	Alexander Viro <viro@....linux.org.uk>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: AIM7 40% regression with 2.6.26-rc1


On Wed, 2008-05-07 at 10:11 +0800, Zhang, Yanmin wrote:
> On Tue, 2008-05-06 at 13:44 +0200, Ingo Molnar wrote:
> > * Zhang, Yanmin <yanmin_zhang@...ux.intel.com> wrote:
> > 
> > > Comparing with kernel 2.6.25, AIM7 (using tmpfs) shows a more than 40% 
> > > regression with 2.6.26-rc1 on my 8-core Stoakley, 16-core Tigerton, and 
> > > Itanium Montecito machines. Bisect located the patch below.
> > > 
> > > 64ac24e738823161693bf791f87adc802cf529ff is first bad commit
> > > commit 64ac24e738823161693bf791f87adc802cf529ff
> > > Author: Matthew Wilcox <matthew@....cx>
> > > Date:   Fri Mar 7 21:55:58 2008 -0500
> > > 
> > >     Generic semaphore implementation
> > > 
> > > After I manually reverted the patch against 2.6.26-rc1 while fixing 
> > > lots of conflicts/errors, the AIM7 regression became less than 2%.
> > 
> > hm, which exact semaphore would that be due to?
> > 
> > My first blind guess would be the BKL - there's not much other semaphore 
> > use left in the core kernel that would normally affect AIM7. 
> > The VFS still makes frequent use of the BKL and AIM7 is very 
> > VFS-intensive. Getting rid of that BKL use in the VFS might help 
> > performance anyway.
> > 
> > Could you try to check that it's indeed the BKL?
> > 
> > The easiest way to check would be to run AIM7 on 
> > sched-devel.git/latest and do scheduler tracing via:
> > 
> >    http://people.redhat.com/mingo/sched-devel.git/readme-tracer.txt
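
On the BKL guess: as far as I understand, after the generic semaphore patch
the BKL in 2.6.26-rc1 sits directly on top of the new semaphore, so every
contended lock_kernel() goes through the generic sleeping slow path. A
simplified sketch of lib/kernel_lock.c (from memory, not the verbatim
source):

/*
 * Simplified sketch of the 2.6.26-rc1 BKL: a plain generic semaphore,
 * taken recursively via task->lock_depth.
 */
static DECLARE_MUTEX(kernel_sem);	/* the BKL, count == 1 */

void __lockfunc lock_kernel(void)
{
	struct task_struct *task = current;
	int depth = task->lock_depth + 1;

	if (likely(!depth))
		/* outermost acquisition: may sleep in __down() if contended */
		down(&kernel_sem);

	task->lock_depth = depth;
}

void __lockfunc unlock_kernel(void)
{
	BUG_ON(current->lock_depth < 0);
	if (likely(--current->lock_depth < 0))
		up(&kernel_sem);
}
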
> One clearly odd behavior of AIM7 is that CPU idle is 0% with 2.6.25, but more than 50% with
> 2.6.26-rc1. I have a patch to collect scheduling info.
With my patch plus gprof, I collected some data. Below is the gprof output.

index % time    self  children    called     name
                0.00    0.00       2/223305376     __down_write_nested [22749]
                0.00    0.00       3/223305376     journal_commit_transaction [10526]
                0.00    0.00       6/223305376     __down_read [22745]
                0.00    0.00       8/223305376     start_this_handle [19167]
                0.00    0.00      15/223305376     sys_pause [19808]
                0.00    0.00      17/223305376     log_wait_commit [11047]
                0.00    0.00      20/223305376     futex_wait [8122]
                0.00    0.00      64/223305376     pdflush [14335]
                0.00    0.00      71/223305376     do_get_write_access [5367]
                0.00    0.00      84/223305376     pipe_wait [14460]
                0.00    0.00     111/223305376     kjournald [10726]
                0.00    0.00     116/223305376     int_careful [9634]
                0.00    0.00     224/223305376     do_nanosleep [5418]
                0.00    0.00    1152/223305376     watchdog [22065]
                0.00    0.00    4087/223305376     worker_thread [22076]
                0.00    0.00    5003/223305376     __mutex_lock_killable_slowpath [23305]
                0.00    0.00    7810/223305376     ksoftirqd [10831]
                0.00    0.00    9389/223305376     __mutex_lock_slowpath [23306]
                0.00    0.00   10642/223305376     io_schedule [9813]
                0.00    0.00   23544/223305376     migration_thread [11495]
                0.00    0.00   35319/223305376     __cond_resched [22673]
                0.00    0.00   49065/223305376     retint_careful [16146]
                0.00    0.00  119757/223305376     sysret_careful [20074]
                0.00    0.00  151717/223305376     do_wait [5545]
                0.00    0.00  250221/223305376     do_exit [5356]
                0.00    0.00  303836/223305376     cpu_idle [4350]
                0.00    0.00 222333093/223305376     schedule_timeout [2]
[1]      0.0    0.00    0.00 223305376         schedule [1]
-----------------------------------------------
                0.00    0.00       2/222333093     io_schedule_timeout [9814]
                0.00    0.00       4/222333093     journal_stop [10588]
                0.00    0.00       8/222333093     cifs_oplock_thread [3760]
                0.00    0.00      14/222333093     do_sys_poll [5513]
                0.00    0.00      20/222333093     cifs_dnotify_thread [3733]
                0.00    0.00      32/222333093     read_chan [15648]
                0.00    0.00      47/222333093     wait_for_common [22017]
                0.00    0.00     658/222333093     do_select [5479]
                0.00    0.00    2000/222333093     inet_stream_connect [9324]
                0.00    0.00 222330308/222333093     __down [22577]
[2]      0.0    0.00    0.00 222333093         schedule_timeout [2]
                0.00    0.00 222333093/223305376     schedule [1]
-----------------------------------------------
                0.00    0.00       1/165565      flock_lock_file_wait [7735]
                0.00    0.00       7/165565      __posix_lock_file [23371]
                0.00    0.00     203/165565      de_put [4665]
                0.00    0.00     243/165565      opost [13633]
                0.00    0.00     333/165565      proc_root_readdir [14982]
                0.00    0.00     358/165565      write_chan [22090]
                0.00    0.00    6222/165565      proc_lookup_de [14908]
                0.00    0.00   32081/165565      sys_fcntl [19687]
                0.00    0.00   36045/165565      vfs_ioctl [21822]
                0.00    0.00   42025/165565      tty_release [20818]
                0.00    0.00   48047/165565      chrdev_open [3702]
[3]      0.0    0.00    0.00  165565         lock_kernel [3]
                0.00    0.00  152987/153190      down [4]
-----------------------------------------------
                0.00    0.00     203/153190      __reacquire_kernel_lock [23420]
                0.00    0.00  152987/153190      lock_kernel [3]
[4]      0.0    0.00    0.00  153190         down [4]
                0.00    0.00  153190/153190      __down [22577]
-----------------------------------------------
                0.00    0.00  153190/153190      down [4]
[22577]  0.0    0.00    0.00  153190         __down [22577]
                0.00    0.00 222330308/222333093     schedule_timeout [2]


As system idle is more than 50%, the callers of schedule/schedule_timeout are the
important information:
1) lock_kernel causes most of the schedule/schedule_timeout calls;
2) lock_kernel calls down, which calls __down, and __down calls schedule_timeout
many times in a loop (see the sketch below);
3) The callers of lock_kernel are sys_fcntl/vfs_ioctl/tty_release/chrdev_open.
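
For reference, the generic __down() slow path in 2.6.26-rc1 looks roughly
like the sketch below (simplified from lib/semaphore.c, not the verbatim
source); each contended acquisition queues the task and sleeps in
schedule_timeout(MAX_SCHEDULE_TIMEOUT), which is exactly the
schedule_timeout <- __down path that dominates the profile above:

struct semaphore_waiter {
	struct list_head list;
	struct task_struct *task;
	int up;
};

/* Called with sem->lock held after the fast path saw count == 0. */
static noinline void __sched __down(struct semaphore *sem)
{
	struct task_struct *task = current;
	struct semaphore_waiter waiter;

	list_add_tail(&waiter.list, &sem->wait_list);
	waiter.task = task;
	waiter.up = 0;

	for (;;) {
		__set_task_state(task, TASK_UNINTERRUPTIBLE);
		spin_unlock_irq(&sem->lock);
		/* this is the schedule_timeout() the profile keeps hitting */
		schedule_timeout(MAX_SCHEDULE_TIMEOUT);
		spin_lock_irq(&sem->lock);
		if (waiter.up)	/* up() handed us the semaphore and woke us */
			return;
	}
}
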

-yanmin


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
