Message-ID: <4BF7EFA4.4050901@kernel.org>
Date:	Sat, 22 May 2010 16:52:20 +0200
From:	Tejun Heo <tj@...nel.org>
To:	Andrew Morton <akpm@...ux-foundation.org>
CC:	Ciprian Docan <docan@...n.rutgers.edu>,
	linux-kernel@...r.kernel.org, Al Viro <viro@...iv.linux.org.uk>,
	Jens Axboe <jens.axboe@...cle.com>
Subject: [PATCH] vfs: don't hold s_umount over close_bdev_exclusive() call

This patch fixes an obscure AB-BA deadlock in get_sb_bdev().

When a superblock is mounted more than once, get_sb_bdev() calls
close_bdev_exclusive() to drop the extra bdev reference while holding
s_umount.  However, sb->s_umount nests inside bd_mutex during
__invalidate_device(), and close_bdev_exclusive() acquires bd_mutex
during blkdev_put(), thus creating an AB-BA deadlock.

This condition doesn't trigger frequently.  For it to be visible to
lockdep, the filesystem must occupy the whole device (as
__invalidate_device() only grabs bd_mutex for the whole device), the
FS must be mounted more than once, and a partition rescan must be
issued while the FS is still mounted.

Fix it by dropping s_umount over close_bdev_exclusive().
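
The lock ordering described above can be sketched as a small userspace
model.  This is only an illustration and not kernel code: the tracker,
the lock ids and every function name below are hypothetical, and the
two paths are reduced to the acquire/release sequence given in this
description.  The tracker records "B was acquired while A was held"
edges and reports an inversion once both A->B and B->A have been seen,
which is roughly the kind of dependency lockdep complains about.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical ids standing in for the two kernel locks. */
enum { LOCK_S_UMOUNT = 0, LOCK_BD_MUTEX = 1, NR_LOCKS = 2 };

struct order_tracker {
	bool held[NR_LOCKS];
	/* edge[a][b] is set once b was acquired while a was held */
	bool edge[NR_LOCKS][NR_LOCKS];
};

static void tracker_init(struct order_tracker *t)
{
	memset(t, 0, sizeof(*t));
}

static void tracker_acquire(struct order_tracker *t, int lock)
{
	/* Record an ordering edge from every lock currently held. */
	for (int i = 0; i < NR_LOCKS; i++)
		if (t->held[i])
			t->edge[i][lock] = true;
	t->held[lock] = true;
}

static void tracker_release(struct order_tracker *t, int lock)
{
	t->held[lock] = false;
}

static bool tracker_has_inversion(const struct order_tracker *t)
{
	for (int a = 0; a < NR_LOCKS; a++)
		for (int b = a + 1; b < NR_LOCKS; b++)
			if (t->edge[a][b] && t->edge[b][a])
				return true;
	return false;
}

/* Model of __invalidate_device() on a whole device:
 * bd_mutex first, s_umount nested inside it. */
static void model_invalidate_path(struct order_tracker *t)
{
	tracker_acquire(t, LOCK_BD_MUTEX);
	tracker_acquire(t, LOCK_S_UMOUNT);
	tracker_release(t, LOCK_S_UMOUNT);
	tracker_release(t, LOCK_BD_MUTEX);
}

/* Model of get_sb_bdev() finding an already-mounted superblock. */
static void model_mount_path(struct order_tracker *t, bool patched)
{
	/* s_umount is held when the extra bdev reference is dropped */
	tracker_acquire(t, LOCK_S_UMOUNT);
	if (patched)
		tracker_release(t, LOCK_S_UMOUNT);	/* up_write() */
	/* close_bdev_exclusive() takes bd_mutex in blkdev_put() */
	tracker_acquire(t, LOCK_BD_MUTEX);
	tracker_release(t, LOCK_BD_MUTEX);
	if (patched)
		tracker_acquire(t, LOCK_S_UMOUNT);	/* down_write() */
	tracker_release(t, LOCK_S_UMOUNT);
}

/* Returns true if the two modelled paths can form an AB-BA pair. */
bool ab_ba_possible(bool patched)
{
	struct order_tracker t;

	tracker_init(&t);
	model_invalidate_path(&t);
	model_mount_path(&t, patched);
	return tracker_has_inversion(&t);
}
```

In this model ab_ba_possible(false) reports the inversion and
ab_ba_possible(true) does not, because the patched path never holds
s_umount while bd_mutex is being acquired.  The model says nothing
about whether dropping s_umount is otherwise safe; that rests on the
active reference mentioned in the patch comment.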

Signed-off-by: Tejun Heo <tj@...nel.org>
Reported-by: Ciprian Docan <docan@...n.rutgers.edu>
Cc: Andrew Morton <akpm@...ux-foundation.org>
Cc: Al Viro <viro@...iv.linux.org.uk>
---
I think this fix is safe and it seems to work fine here, but I don't
know the locking too well, so it would be best not to push it w/o Al's
ack.

Thanks.

 fs/super.c |    9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/fs/super.c b/fs/super.c
index 1527e6a..667f706 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -821,7 +821,16 @@ int get_sb_bdev(struct file_system_type *fs_type,
 			goto error_bdev;
 		}

+		/*
+		 * s_umount nests inside bd_mutex during
+		 * __invalidate_device().  close_bdev_exclusive()
+		 * acquires bd_mutex and can't be called under
+		 * s_umount.  Drop s_umount temporarily.  This is safe
+		 * as we're holding an active reference.
+		 */
+		up_write(&s->s_umount);
 		close_bdev_exclusive(bdev, mode);
+		down_write(&s->s_umount);
 	} else {
 		char b[BDEVNAME_SIZE];
