Date:	Thu, 21 Aug 2008 16:14:43 +1000
From:	Dave Chinner <david@...morbit.com>
To:	gus3 <musicman529@...oo.com>
Cc:	Szabolcs Szakacsits <szaka@...s-3g.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
	xfs@....sgi.com
Subject: Re: XFS vs Elevators (was Re: [PATCH RFC] nilfs2: continuous
	snapshotting file system)

On Wed, Aug 20, 2008 at 11:00:07PM -0700, gus3 wrote:
> --- On Wed, 8/20/08, Dave Chinner <david@...morbit.com> wrote:
> 
> > Ok, I thought it might be the tiny log, but it didn't improve
> > anything here when I increased the log size, or the log buffer
> > size.
> > 
> > Looking at the block trace, I think elevator merging is somewhat
> > busted. I'm seeing adjacent I/Os being dispatched without having
> > been merged.  e.g:
> 
> [snip]
> 
> > Also, CFQ appears to not be merging WRITE_SYNC bios or issuing
> > them with any urgency.  The result of this is that it stalls the
> > XFS transaction subsystem by capturing all the log buffers in
> > the elevator and not issuing them. e.g.:
> 
> [snip]
> 
> > The I/Os are merged, but there's still that 700ms delay before
> > dispatch.  I was looking at this a while back but didn't get around
> > to finishing it off.  i.e.:
> > 
> > http://oss.sgi.com/archives/xfs/2008-01/msg00151.html
> > http://oss.sgi.com/archives/xfs/2008-01/msg00152.html
> > 
> > I'll have a bit more of a look at this w.r.t. compilebench
> > performance, because it seems like a similar set of problems
> > that I was seeing back then...
> 
> I concur with your observation, esp. w.r.t. XFS and CFQ clashing:
> 
> http://gus3.typepad.com/i_am_therefore_i_think/2008/07/finding-the-fas.html
> 
> CFQ is the default on most Linux systems AFAIK; for decent XFS
> performance one needs to switch to "noop" or "deadline". I wasn't
> sure if it was broken code, or simply base assumptions in conflict
> (XFS vs. CFQ). Your log output sheds light on the matter for me,
> thanks.
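
For anyone wanting to check which elevator a device is using before
switching it: the block layer exposes the choice in
/sys/block/<dev>/queue/scheduler, with the active elevator shown in
square brackets. A minimal sketch of reading that file follows. The
function names are made up for illustration, and the device name passed
to read_scheduler() is whatever your system uses (e.g. "sda", which is
an assumption here).

```python
from pathlib import Path
from typing import List, Optional, Tuple


def parse_scheduler_line(line: str) -> Tuple[Optional[str], List[str]]:
    """Split a sysfs scheduler line into (active, available).

    The active elevator is the token wrapped in square brackets,
    e.g. "noop anticipatory deadline [cfq]". If no token is
    bracketed, active is None.
    """
    available: List[str] = []
    active: Optional[str] = None
    for token in line.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        available.append(token)
    return active, available


def read_scheduler(device: str) -> Tuple[Optional[str], List[str]]:
    """Read the scheduler line for a block device (hypothetical helper)."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_scheduler_line(path.read_text())
```

Switching is done by writing one of the available names back to the
same file as root, e.g. "echo deadline > /sys/block/sda/queue/scheduler".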

I'm wondering if these elevators are just getting too smart for
their own good. W.r.t. the above test, deadline was about twice
as slow as CFQ - it does immediate dispatch on WRITE_SYNC bios and
so caused more seeks than CFQ and hence went slower. noop had
similar dispatch latency problems to CFQ, so it wasn't any
faster either.
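
To illustrate the seek argument, here is a toy model (not kernel code,
and not how any real elevator is implemented). Requests are
(start_sector, sector_count) pairs. It compares dispatching in
submission order against dispatching after sorting and merging adjacent
requests. The function names and the sector numbers in the usage note
are invented for the example.

```python
from typing import List, Optional, Tuple

Request = Tuple[int, int]  # (start_sector, sector_count)


def merge_adjacent(requests: List[Request]) -> List[Request]:
    """Sort by start sector and merge requests that are exactly adjacent."""
    merged: List[Request] = []
    for start, count in sorted(requests):
        if merged and merged[-1][0] + merged[-1][1] == start:
            # This request begins where the previous one ends: extend it.
            prev_start, prev_count = merged[-1]
            merged[-1] = (prev_start, prev_count + count)
        else:
            merged.append((start, count))
    return merged


def seek_count(dispatched: List[Request]) -> int:
    """Count dispatches that do not start where the previous one ended."""
    seeks = 0
    position: Optional[int] = None
    for start, count in dispatched:
        if position is not None and start != position:
            seeks += 1
        position = start + count
    return seeks
```

With three 8-sector log writes at 1000, 1008 and 1016 interleaved with
two data writes at 5000 and 5008, dispatching in submission order costs
four seeks in this model, while the merged form is two requests and one
seek.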

I think that we need to issue explicit unplugs to get the log I/O
dispatched the way we want on all elevators and stop trying to
give elevators implicit hints by abusing the bio types and hoping
they do the right thing....
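
As a rough picture of what I mean, here is a toy model of a plugged
queue (again not kernel code; the class, its methods and the
millisecond clock are all invented for illustration). Requests sit in
the queue until either an implicit unplug fires after a delay, or the
submitter unplugs explicitly. The 700ms figure used in the usage note
is only borrowed from the delay seen in the trace, not a claim about
how CFQ arrives at it.

```python
from typing import List, Optional, Tuple


class PluggedQueue:
    """Toy request queue that holds I/O until it is unplugged."""

    def __init__(self, unplug_delay_ms: int) -> None:
        self.unplug_delay_ms = unplug_delay_ms
        self.pending: List[Tuple[str, int]] = []     # (name, submit_time_ms)
        self.dispatched: List[Tuple[str, int]] = []  # (name, latency_ms)
        self.plugged_at: Optional[int] = None

    def submit(self, name: str, now_ms: int) -> None:
        """Queue a request; the first request on an empty queue plugs it."""
        if not self.pending:
            self.plugged_at = now_ms
        self.pending.append((name, now_ms))

    def unplug(self, now_ms: int) -> None:
        """Explicit unplug: dispatch everything pending right now."""
        for name, submitted in self.pending:
            self.dispatched.append((name, now_ms - submitted))
        self.pending.clear()
        self.plugged_at = None

    def tick(self, now_ms: int) -> None:
        """Implicit unplug: fires only once the delay has expired."""
        if (self.plugged_at is not None
                and now_ms - self.plugged_at >= self.unplug_delay_ms):
            self.unplug(now_ms)
```

With unplug_delay_ms=700, a log buffer submitted at t=0 and left to the
implicit path is dispatched by tick(700) with 700ms latency. The same
buffer followed by an explicit unplug(1) goes out with 1ms latency,
whatever the elevator would otherwise have decided.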

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com