Message-ID: <20090708223150.GB14005@mit.edu>
Date:	Wed, 8 Jul 2009 18:31:50 -0400
From:	Theodore Tso <tytso@....edu>
To:	Jan Kara <jack@...e.cz>
Cc:	linux-ext4@...r.kernel.org
Subject: Re: [PATCH] jbd2: Fix a race between checkpointing code and
	journal_get_write_access()

On Sun, Jul 05, 2009 at 10:53:19PM -0400, Theodore Tso wrote:
> On Wed, Jun 24, 2009 at 06:02:40PM +0200, Jan Kara wrote:
> > The following race can happen:
> > 
> >   CPU1                          CPU2
> >                                 checkpointing code checks the buffer, adds
> >                                   it to an array for writeback
> > do_get_write_access()
> >   ...
> >   lock_buffer()
> >   unlock_buffer()
> >                                   flush_batch() submits the buffer for IO
> >   __jbd2_journal_file_buffer()
> > 
> >   So a buffer under writeout is returned from do_get_write_access(). Since
> > the filesystem code relies on the fact that journaled buffers cannot be
> > written out, it does not take the buffer lock and so it can modify the
> > buffer while it is under writeout. That can lead to filesystem corruption
> > if we crash at the right moment.
> >   We fix the problem by clearing the buffer dirty bit under buffer_lock
> > even if the buffer is on the BJ_None list. Actually, we clear the dirty bit
> > regardless of the list the buffer is in and warn about the fact if
> > the buffer is already journaled.
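
For reference, the fix described above can be sketched as a small userspace
model. This is only an illustration under stated assumptions, not the actual
patch: struct model_buffer, model_get_write_access() and model_flush_one() are
hypothetical stand-ins, a pthread mutex plays the role of the buffer lock, and
the writeback path is assumed to re-check the dirty bit once it holds the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; NOT the kernel's struct buffer_head or jbd2 API. */
enum model_list { LIST_NONE, LIST_METADATA };

struct model_buffer {
    pthread_mutex_t lock;   /* plays the role of the buffer lock */
    bool dirty;             /* plays the role of the buffer dirty bit */
    enum model_list list;   /* journal list the buffer is currently on */
};

/*
 * Model of the fixed write-access path: the dirty bit is cleared while the
 * lock is held, regardless of which list the buffer is on. Returns true if
 * a warning was issued because the buffer was dirty while already journaled.
 */
static bool model_get_write_access(struct model_buffer *b)
{
    bool warned = false;

    assert(b != NULL);
    pthread_mutex_lock(&b->lock);
    if (b->dirty) {
        if (b->list != LIST_NONE) {
            fprintf(stderr, "warning: dirty buffer is already journaled\n");
            warned = true;
        }
        b->dirty = false;
    }
    pthread_mutex_unlock(&b->lock);
    return warned;
}

/*
 * Model of the writeback path (assumption): it submits the buffer only if
 * the dirty bit is still set once the lock is held. Returns true if the
 * buffer would have been submitted for IO.
 */
static bool model_flush_one(struct model_buffer *b)
{
    bool submitted = false;

    assert(b != NULL);
    pthread_mutex_lock(&b->lock);
    if (b->dirty) {
        b->dirty = false;
        submitted = true;
    }
    pthread_mutex_unlock(&b->lock);
    return submitted;
}
```

In this model, a flush that runs after model_get_write_access() finds the
dirty bit already cleared and submits nothing, which is the property the
patch description relies on.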

When running fsstress, we get the "Spotted dirty metadata buffer;
there's a risk of filesystem corruption in case of a system crash"
warning at least half a dozen times.  That sounds like we have a problem.
Were you expecting this to be a "this should never happen"
situation, or is there a known bug that we need to fix here?

	      	       	       	   	- Ted