Date:	Wed, 26 Jan 2011 16:15:29 +0800
From:	Shaohua Li <shaohua.li@...el.com>
To:	Vivek Goyal <vgoyal@...hat.com>
Cc:	"Shi, Alex" <alex.shi@...el.com>, "jack@...e.cz" <jack@...e.cz>,
	"tytso@....edu" <tytso@....edu>,
	"czoccolo@...il.com" <czoccolo@...il.com>,
	"jaxboe@...ionio.com" <jaxboe@...ionio.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"Chen, Tim C" <tim.c.chen@...el.com>
Subject: Re: [performance bug] kernel building regression on 64 LCPUs
 machine

On Thu, Jan 20, 2011 at 11:16:56PM +0800, Vivek Goyal wrote:
> On Wed, Jan 19, 2011 at 10:03:26AM +0800, Shaohua Li wrote:
> > add Jan and Theodore to the loop.
> >
> > On Wed, 2011-01-19 at 09:55 +0800, Shi, Alex wrote:
> > > Shaohua and I tested kernel build performance on the latest kernel and
> > > found that it dropped by about 15% on our 64-LCPU NHM-EX machine with an
> > > ext4 file system. We found that this regression is due to commit
> > > 749ef9f8423054e326f. If we revert this patch, or just change
> > > WRITE_SYNC back to WRITE in jbd2/commit.c, the performance
> > > recovers.
> > >
> > > iostat reports show that, with the commit, the number of merged read
> > > requests increased and the number of merged write requests dropped. The
> > > total request size increased and the queue length dropped. So we tested
> > > another patch that only changes WRITE_SYNC to WRITE_SYNC_PLUG in
> > > jbd2/commit.c, but it had no effect.
> > since WRITE_SYNC_PLUG doesn't work, this isn't a simple no-write-merge issue.
> >
> 
> Yep, it does sound like reduced write merging. But if we move journal
> commits back to WRITE, fsync performance will drop, as idling will be
> introduced between the fsync thread and the journalling thread. So that
> does not sound like a good idea either.
> 
> Secondly, in the presence of a mixed workload (some other sync reads
> happening), WRITEs can get less bandwidth and the sync workload much more.
> So by marking journal commits as WRITE you might increase the completion
> delay there in the presence of other sync workloads.
> 
> So Jan Kara's approach makes sense: if somebody is waiting on the
> commit, make it WRITE_SYNC; otherwise make it WRITE. Not sure why
> it did not work for you. Is it possible to run some traces and do
> more debugging to figure out what's happening?
Sorry for the long delay.

It looks like Fedora enables ccache by default. Our kbuild test runs on an
ext4 disk, but the rootfs, where the ccache cache files live, is on ext3.
Jan's patch only covers ext4, so maybe this is the reason.
I changed jbd to use WRITE in journal_commit_transaction. With that change
plus Jan's patch, the test seems fine.
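
For reference, the policy discussed in the quoted text (use WRITE_SYNC only
when somebody is waiting on the commit, plain WRITE otherwise) can be sketched
in userspace C roughly as below. This is only an illustration under stated
assumptions: the enum, the struct and the function name are hypothetical
stand-ins, not the actual jbd/jbd2 code or Jan's patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's WRITE and WRITE_SYNC request types. */
enum commit_write_op {
    COMMIT_WRITE,       /* async write: no one is blocked on it */
    COMMIT_WRITE_SYNC   /* sync write: treated like a latency-sensitive request */
};

/* Hypothetical, heavily simplified view of a journal transaction. */
struct txn_sketch {
    bool has_sync_waiter;   /* true if some task (e.g. fsync) waits on this commit */
};

/*
 * Pick the request type for the commit I/O: sync only when a waiter
 * exists, so background commits do not compete as sync requests.
 */
static enum commit_write_op commit_write_op_for(const struct txn_sketch *t)
{
    return t->has_sync_waiter ? COMMIT_WRITE_SYNC : COMMIT_WRITE;
}
```

The point of keeping the decision per transaction is that an fsync-driven
commit still avoids idling between the fsync thread and the journalling
thread, while a periodic background commit stays an ordinary write.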

Jan,
can you send a patch with a similar change for ext3, so we can do more tests?

Thanks,
Shaohua