Message-ID: <20090325123744.GK23439@duck.suse.cz>
Date:	Wed, 25 Mar 2009 13:37:44 +0100
From:	Jan Kara <jack@...e.cz>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Ingo Molnar <mingo@...e.hu>, Alan Cox <alan@...rguk.ukuu.org.uk>,
	Arjan van de Ven <arjan@...radead.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Nick Piggin <npiggin@...e.de>, Theodore Tso <tytso@....edu>,
	Jens Axboe <jens.axboe@...cle.com>,
	David Rees <drees76@...il.com>, Jesper Krogh <jesper@...gh.cc>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Linux 2.6.29

On Tue 24-03-09 04:12:49, Andrew Morton wrote:
> On Tue, 24 Mar 2009 11:31:11 +0100 Ingo Molnar <mingo@...e.hu> wrote:
> > The thing is ... this is a _bad_ ext3 design bug affecting ext3 
> > users in the last decade or so of ext3 existence. Why is this issue 
> > not handled with the utmost high priority and why wasnt it fixed 5 
> > years ago already? :-)
> > 
> > It does not matter whether we have extents or htrees when there are 
> > _trivially reproducible_ basic usability problems with ext3.
> > 
> 
> It's all there in that Oct 2008 thread.
> 
> The proposed tweak to kjournald is a bad fix - partly because it will
> elevate the priority of vast amounts of IO whose priority we don't _want_
> elevated.
> 
> But mainly because the problem lies elsewhere - in an area of contention
> between the committing and running transactions which we knowingly and
> reluctantly added to fix a bug in 
> 
> commit 773fc4c63442fbd8237b4805627f6906143204a8
> Author:     akpm <akpm>
> AuthorDate: Sun May 19 23:23:01 2002 +0000
> Commit:     akpm <akpm>
> CommitDate: Sun May 19 23:23:01 2002 +0000
> 
>     [PATCH] fix ext3 buffer-stealing
>     
>     Patch from sct fixes a long-standing (I did it!) and rather complex
>     problem with ext3.
>     
>     The problem is to do with buffers which are continually being dirtied
>     by an external agent.  I had code in there (for easily-triggerable
>     livelock avoidance) which steals the buffer from checkpoint mode and
>     reattaches it to the running transaction.  This violates ext3 ordering
>     requirements - it can permit journal space to be reclaimed before the
>     relevant data has really been written out.
>     
>     Also, we do have to reliably get a lock on the buffer when moving it
>     between lists and inspecting its internal state.  Otherwise a competing
>     read from the underlying block device can trigger an assertion failure,
>     and a competing write to the underlying block device can confuse ext3
>     journalling state completely.
  I've looked at this a bit. I suppose you mean the contention arising from
us taking the buffer lock in do_get_write_access()? But it's not obvious
to me why we'd be contending there... We call this function only for
metadata buffers (unless in data=journal mode), so there aren't that many
of these blocks. The buffer should be locked for a longer time only when
we do writeout for a checkpoint (hmm, maybe you meant this one?). In
particular, note that we don't take the buffer lock when committing the
block to the journal - we lock only the BJ_IO buffer. In that case, though,
we later wait while the buffer is on the BJ_Shadow list, so there is some
contention there.
  Also, when I exchanged emails with a few people about these sync problems,
they wrote that switching to data=writeback mode helps considerably, which
would indicate that the handling of ordered-mode data buffers is causing
most of the slowdown...
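
  For anyone wanting to check which data mode their ext3 filesystems are
mounted with before comparing, a minimal sketch along these lines might
help. It assumes mount information in the /proc/mounts format (device,
mount point, type, options, ...); the function name ext3_data_modes and
the sample text are made up for illustration, and a mount with no explicit
data= option is reported as "unspecified" rather than guessing the default.

```python
def ext3_data_modes(mounts_text):
    """Map each ext3 mount point to its data= option.

    mounts_text is expected to be in /proc/mounts format. Mounts without
    an explicit data= option are reported as "unspecified".
    """
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Need at least: device, mount point, fs type, options.
        if len(fields) < 4 or fields[2] != "ext3":
            continue
        mode = "unspecified"
        for opt in fields[3].split(","):
            if opt.startswith("data="):
                mode = opt[len("data="):]
        modes[fields[1]] = mode
    return modes


# Hypothetical sample; on a real system the text would come from /proc/mounts.
SAMPLE = (
    "/dev/sda1 / ext3 rw,data=ordered 0 0\n"
    "/dev/sda2 /home ext3 rw,noatime,data=writeback 0 0\n"
    "/dev/sda3 /srv ext3 rw 0 0\n"
    "proc /proc proc rw 0 0\n"
)

for mountpoint, mode in ext3_data_modes(SAMPLE).items():
    print(mountpoint, mode)
```

On a live machine the same function could be fed the contents of
/proc/mounts instead of the sample string.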

								Honza
-- 
Jan Kara <jack@...e.cz>
SUSE Labs, CR
