linux-ext4 - Re: [BUG] ext4: cannot unfreeze a filesystem due to a deadlock


Message-ID: <20110215172954.GK17313@quack.suse.cz>
Date:	Tue, 15 Feb 2011 18:29:54 +0100
From:	Jan Kara <jack@...e.cz>
To:	Ted Ts'o <tytso@....edu>
Cc:	Jan Kara <jack@...e.cz>,
	Masayoshi MIZUMA <m.mizuma@...fujitsu.com>,
	Andreas Dilger <adilger.kernel@...ger.ca>,
	linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [BUG] ext4: cannot unfreeze a filesystem due to a deadlock

On Tue 15-02-11 12:03:52, Ted Ts'o wrote:
> On Tue, Feb 15, 2011 at 05:06:30PM +0100, Jan Kara wrote:
> > Thanks for detailed analysis. Indeed this is a bug. Whenever we do IO
> > under s_umount semaphore, we are prone to deadlock like the one you
> > describe above.
> 
> One of the fundamental problems here is that the freeze and thaw
> routines are using down_write(&sb->s_umount) for two purposes.  The
> first is to prevent the resume/thaw from racing with a umount (which
> it could do just as well by taking a read lock), but the second is to
> prevent the resume/thaw code from racing with itself.  That's the core
> fundamental problem here.
> 
> So I think we can solve this by introducing a new mutex, s_freeze, and
> having the resume/thaw first take the s_freeze mutex and then
> take a read lock on the s_umount.
  Sadly this does not quite work because even down_read(&sb->s_umount)
in thaw_super() can block if there is another process that tries to acquire
s_umount for writing - a situation like:
  TASK 1 (e.g. flusher)       TASK 2 (e.g. remount)       TASK 3 (unfreeze)
  down_read(&sb->s_umount)
    block on s_frozen
                              down_write(&sb->s_umount)
                                -blocked
                                                          down_read(&sb->s_umount)
                                                            -blocked behind the
                                                             write access...
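
To make the ordering above concrete, here is a small single-threaded toy model of a fair reader/writer semaphore. It is only a sketch, not kernel code: struct toy_rwsem and the toy_* helpers are hypothetical names, and the one rule it encodes (a new reader queues behind any waiting writer) is an assumption that mirrors the behaviour described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a fair rw-semaphore; NOT the kernel's struct rw_semaphore. */
struct toy_rwsem {
	int  readers;          /* tasks currently holding the lock for read */
	bool writer;           /* true while a task holds the lock for write */
	int  waiting_writers;  /* writers queued behind the current holders */
};

/* Returns true if the read lock is granted, false if the caller would block. */
static bool toy_down_read(struct toy_rwsem *sem)
{
	/* Fairness rule (assumed): a queued writer blocks all later readers. */
	if (sem->writer || sem->waiting_writers > 0)
		return false;
	sem->readers++;
	return true;
}

/* Returns true if the write lock is granted, false if the caller queues. */
static bool toy_down_write(struct toy_rwsem *sem)
{
	if (sem->writer || sem->readers > 0) {
		sem->waiting_writers++;
		return false;
	}
	sem->writer = true;
	return true;
}

/* Replays the three-task ordering; returns true if the unfreeze task blocks. */
static bool thaw_blocks_in_scenario(void)
{
	struct toy_rwsem s_umount = {0};

	/* TASK 1: takes the read lock, then sleeps on the frozen filesystem. */
	bool task1 = toy_down_read(&s_umount);
	/* TASK 2: remount wants the write lock and queues behind TASK 1. */
	bool task2 = toy_down_write(&s_umount);
	/* TASK 3: unfreeze asks for a read lock and lands behind TASK 2. */
	bool task3 = toy_down_read(&s_umount);

	assert(task1 && !task2);
	return !task3;
}
```

In this model thaw_blocks_in_scenario() returns true: TASK 1 never releases its read lock because it waits for the thaw, and the thaw never gets its read lock because it waits behind TASK 2.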

The only working solution I see is to check for a frozen filesystem before
taking the s_umount semaphore, which seems rather ugly (but might be bearable
if we did so in some well-described wrapper).
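
A rough sketch of what such a wrapper could look like, again as a single-threaded toy model rather than kernel code. Everything here is hypothetical: struct toy_sb, the toy_* helpers, and the status codes are made up for illustration, the frozen flag merely stands in for s_frozen, and a real wrapper would sleep until the thaw instead of returning a status.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_status { TOY_OK, TOY_FROZEN, TOY_LOCK_BUSY };

/* Toy superblock: a frozen flag plus a reader count modelling s_umount. */
struct toy_sb {
	bool frozen;   /* stands in for the s_frozen state */
	int  readers;  /* tasks holding the modelled s_umount for read */
};

/* Hypothetical wrapper: look at the frozen state BEFORE touching s_umount. */
static enum toy_status toy_down_read_unfrozen(struct toy_sb *sb)
{
	if (sb->frozen)
		return TOY_FROZEN;  /* real code would wait here, holding no lock */
	sb->readers++;
	return TOY_OK;
}

static void toy_up_read(struct toy_sb *sb)
{
	assert(sb->readers > 0);
	sb->readers--;
}

/* Thaw needs s_umount itself, so it cannot proceed while readers hold it. */
static enum toy_status toy_thaw(struct toy_sb *sb)
{
	if (sb->readers > 0)
		return TOY_LOCK_BUSY;
	sb->frozen = false;
	return TOY_OK;
}

/* Flusher backs off while frozen, thaw runs, flusher then gets the lock. */
static bool wrapper_scenario_makes_progress(void)
{
	struct toy_sb sb = { .frozen = true, .readers = 0 };

	if (toy_down_read_unfrozen(&sb) != TOY_FROZEN)
		return false;
	if (toy_thaw(&sb) != TOY_OK)
		return false;
	if (toy_down_read_unfrozen(&sb) != TOY_OK)
		return false;
	toy_up_read(&sb);
	return sb.readers == 0;
}
```

The point of the design is that a task waiting for the thaw holds no lock on s_umount, so the thaw path is never queued behind it. By contrast, a toy_sb that starts with frozen set and one reader already holding the lock makes toy_thaw() report TOY_LOCK_BUSY, which is the situation from the diagram.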

And in particular, ext4 has another deadlock of this kind because it does
IO from ext4_remount(), e.g. when doing an online resize (I know it's a bit
artificial but still ;).

								Honza
-- 
Jan Kara <jack@...e.cz>
SUSE Labs, CR