Message-ID: <20110215180435.GH4255@thunk.org>
Date:	Tue, 15 Feb 2011 13:04:35 -0500
From:	Ted Ts'o <tytso@....edu>
To:	Jan Kara <jack@...e.cz>
Cc:	Masayoshi MIZUMA <m.mizuma@...fujitsu.com>,
	Andreas Dilger <adilger.kernel@...ger.ca>,
	linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [BUG] ext4: cannot unfreeze a filesystem due to a deadlock

On Tue, Feb 15, 2011 at 06:29:54PM +0100, Jan Kara wrote:
>   Sadly this does not quite work because even down_read(&sb->s_umount)
> in thaw_super() can block if there is another process that tries to acquire
> s_umount for writing - a situation like:
>   TASK 1 (e.g. flusher)		TASK 2	(e.g. remount)		TASK 3 (unfreeze)
> down_read(&sb->s_umount)
>   block on s_frozen
> 				down_write(&sb->s_umount)
> 				  -blocked
> 								down_read(&sb->s_umount)
> 								  -blocked
> behind the write access...
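
The ordering in the quoted diagram can be sketched as a toy model. This is only an illustration, assuming a fair rwsem in which a queued writer blocks every reader that arrives after it; the class FairRWSem and the task names are hypothetical, and this is not the kernel's rwsem implementation.

```python
from collections import deque


class FairRWSem:
    """Toy model of a fair reader/writer semaphore (hypothetical, not kernel code)."""

    def __init__(self):
        self.readers = 0        # number of tasks holding the lock for read
        self.writer = None      # task holding the lock for write, if any
        self.waiters = deque()  # FIFO queue of (task, mode) that had to block

    def down_read(self, task):
        # A reader gets in only if no writer holds the lock and nobody is queued.
        if self.writer is None and not self.waiters:
            self.readers += 1
            return True
        self.waiters.append((task, "read"))
        return False

    def down_write(self, task):
        # A writer needs the lock completely idle.
        if self.writer is None and self.readers == 0 and not self.waiters:
            self.writer = task
            return True
        self.waiters.append((task, "write"))
        return False


sem = FairRWSem()
flusher_got = sem.down_read("flusher")   # TASK 1: granted, then blocks on s_frozen
remount_got = sem.down_write("remount")  # TASK 2: queued behind the reader
thaw_got = sem.down_read("unfreeze")     # TASK 3: queued behind the writer
print(flusher_got, remount_got, thaw_got)
```

In this model the unfreeze task is the only one that could let the flusher proceed and drop its read lock, yet it sits in the queue behind a writer that is itself waiting for the flusher.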

OK, sorry for being dense, but why does this cause a deadlock?  What
are you imagining TASK 3 doing that would impede the flusher from
eventually resuming?  Or how would TASK 3 prevent userspace from
completing whatever it needs to do (say, a device mapper ioctl)?

freeze_fs has always been inherently dangerous if userspace does
not know what it's doing.  If it freezes the root file system, and
then while the file system is frozen, userspace attempts to modify
/etc/mtab, it's going to lose.  In the past I've argued for some kind
of safety timeout that prevents the system from wedging, but the
argument I've gotten back is (a) it's too complex, (b) userspace
programmers aren't that stupid, and (c) it could cause the filesystem
to unfreeze when userspace wasn't expecting it.  Oh, and (d) if the
system wedges up due to userspace being stupid, that's acceptable.

Obviously, if the kernel does something to itself that causes a
deadlock, we need to fix it, but userspace doing something stupid has
been explicitly ruled out of scope, at least in previous
discussions...

> And in particular ext4 has another deadlock of this kind because it does
> IO from ext4_remount() e.g. when doing online resize (I know it's a bit
> artificial but still ;).

OK, I'm being dense again.  How do remount and online resize relate
to each other?  And it's not I/O in general which is a problem, it's
writeback activity which causes a problem because it takes a read lock
on s_umount, right?

						- Ted