Message-ID: <20151208072508.GM20997@ZenIV.linux.org.uk>
Date:	Tue, 8 Dec 2015 07:25:08 +0000
From:	Al Viro <viro@...IV.linux.org.uk>
To:	"Suzuki K. Poulose" <suzuki.poulose@....com>
Cc:	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	marc.zyngier@....com, torvalds@...ux-foundation.org,
	Tejun Heo <tj@...nel.org>, stable@...r.kernel.org
Subject: Re: [PATCH] blkdev: Fix blkdev_open to release the bdev on error

On Mon, Dec 07, 2015 at 06:05:03PM +0000, Suzuki K. Poulose wrote:
> blkdev_open() doesn't release the bdev, it attached to a given
> inode, if blkdev_get() fails (e.g, due to absence of a device).
> This can cause kernel crashes when the original filesystem
> tries to flush the data during evict_inode.
> 
> This can be triggered easily with virtio-9p fs using the following
> simple steps.

???
How can filesystem type affect the behaviour of block devices?

Having mknod /tmp/splat b 8 1; rm /tmp/splat try to evict the pagecache
of /dev/sda1 is simply wrong, no matter what type /tmp happens to have.
And they must share pagecache, or you'll get one hell of a cache coherency
problems.  As it is, that pagecache belongs to inode on bdevfs (see
fs/block_dev.c; not mountable anywhere visible, the one and only mount is
internal).  That inode is tied to struct bdev, ditto for its lifetime.

Block device inodes on anything else have their ->i_mapping pointing to
the corresponding (unique for given major/minor) inode on bdevfs; that
gives us the coherency, but that also means that their *own* pagecache
(->i_data) is empty.  Which is just fine, since inode eviction should
get rid of everything in its embedded struct address_space.  In case of
block device inodes on ext2, 9p, etc. that amounts to no pages at all.
In case of bdevfs, it contains the page cache of block device.
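
To make the aliasing concrete, here is a minimal userspace sketch of the
arrangement described above.  It is a toy model, not kernel code: the two
structs keep only the fields under discussion, and toy_inode_init(),
toy_alias_attach() and toy_cache_page() are hypothetical helpers invented
for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; only the fields discussed above. */
struct address_space {
	unsigned long nrpages;			/* pages currently cached */
};

struct inode {
	struct address_space *i_mapping;	/* where I/O on this inode goes */
	struct address_space i_data;		/* pagecache embedded in this inode */
};

/* A fresh inode starts with ->i_mapping pointing at its own ->i_data. */
void toy_inode_init(struct inode *inode)
{
	assert(inode != NULL);
	inode->i_data.nrpages = 0;
	inode->i_mapping = &inode->i_data;
}

/* Hypothetical helper: redirect an alias to the bdevfs inode's pagecache. */
void toy_alias_attach(struct inode *alias, struct inode *bdevfs_inode)
{
	alias->i_mapping = bdevfs_inode->i_mapping;
}

/* Hypothetical helper: cache one page through whatever ->i_mapping says. */
void toy_cache_page(struct inode *inode)
{
	inode->i_mapping->nrpages++;
}
```

With one bdevfs inode and two aliases attached to it (say, one on ext2 and
one on 9p), a page cached through either alias is counted in the bdevfs
inode's ->i_data, while each alias's own ->i_data stays at zero pages.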

<looks> 
Aha...
        truncate_inode_pages_final(inode->i_mapping);
        clear_inode(inode);
        filemap_fdatawrite(inode->i_mapping);

in there is obviously wrong - it should be

        truncate_inode_pages_final(&inode->i_data);
        clear_inode(inode);
        filemap_fdatawrite(&inode->i_data);

and if you check other filesystems' ->evict_inode() you'll see the same thing
there.
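
The difference between the two forms can be sketched in userspace as
follows.  Again a toy model under stated assumptions: toy_truncate() is a
made-up analogue of truncate_inode_pages_final() that just zeroes a page
count, and the two toy_evict_*() functions are hypothetical names for the
buggy and the fixed pattern.  The real failure is worse than what this
shows, since the object behind ->i_mapping may no longer be safe to touch
at all.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; only the fields that matter here. */
struct address_space {
	unsigned long nrpages;			/* pages currently cached */
};

struct inode {
	struct address_space *i_mapping;	/* may point into another inode */
	struct address_space i_data;		/* embedded in this inode */
};

/* Toy analogue of truncate_inode_pages_final(): drop every cached page. */
void toy_truncate(struct address_space *mapping)
{
	assert(mapping != NULL);
	mapping->nrpages = 0;
}

/* Buggy pattern: follows ->i_mapping, so an alias wipes the shared cache. */
void toy_evict_via_i_mapping(struct inode *inode)
{
	toy_truncate(inode->i_mapping);
}

/* Fixed pattern: touches only the address_space embedded in this inode. */
void toy_evict_via_i_data(struct inode *inode)
{
	toy_truncate(&inode->i_data);
}
```

For an alias whose ->i_mapping points at another inode's ->i_data, the
->i_data variant leaves the other inode's pages alone; the ->i_mapping
variant throws them away.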

We should not do bd_forget() upon failing open() - what for?  As long as
->i_rdev remains the same, the pointer to struct bdev is valid.  It
doesn't pin bdev down; having it (or any other alias) opened does.  When
we decide to evict bdev, *all* aliasing inodes are dissociated from it;
none of them is open at that point, so we are OK.  When an aliasing inode
gets evicted, we have it dissociated from its ->i_bdev (if any).  Since we
only access the ->i_mapping of aliasing inode while it's open, those places
are fine and anything that wants ->i_data of alias will simply find it empty.
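
The lifetime rules above can be modelled in userspace roughly like this.
Everything here is an assumption made for illustration: the toy_* types and
functions are hypothetical, the fixed-size alias table stands in for
whatever list the kernel really keeps, and the device_present flag is just
a way to make open fail.  It encodes only what the paragraph states: the
cached pointer does not pin the bdev, an open does, and a failed open
leaves the association in place.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_ALIASES 4

struct toy_bdev;

struct toy_inode {
	struct toy_bdev *i_bdev;	/* cached pointer; does NOT pin the bdev */
	int is_open;
};

struct toy_bdev {
	int openers;					/* opens are what pin the bdev */
	struct toy_inode *aliases[TOY_MAX_ALIASES];	/* inodes pointing at us */
};

/* Remember the bdev in an alias without taking any reference on it. */
int toy_associate(struct toy_inode *inode, struct toy_bdev *bdev)
{
	size_t i;

	for (i = 0; i < TOY_MAX_ALIASES; i++) {
		if (bdev->aliases[i] == NULL) {
			bdev->aliases[i] = inode;
			inode->i_bdev = bdev;
			return 0;
		}
	}
	return -1;	/* toy table is full */
}

/* Open: associate if needed; on failure the association is left alone. */
int toy_open(struct toy_inode *inode, struct toy_bdev *bdev, int device_present)
{
	if (inode->i_bdev == NULL && toy_associate(inode, bdev) != 0)
		return -1;
	if (!device_present)
		return -1;	/* no forget here: the pointer stays valid */
	bdev->openers++;
	inode->is_open = 1;
	return 0;
}

/* Close: drop the pin taken by a successful open. */
void toy_close(struct toy_inode *inode)
{
	if (inode->is_open) {
		inode->is_open = 0;
		inode->i_bdev->openers--;
	}
}

/* Alias eviction: drop this one inode's association, if it has any. */
void toy_forget(struct toy_inode *inode)
{
	struct toy_bdev *bdev = inode->i_bdev;
	size_t i;

	if (bdev == NULL)
		return;
	for (i = 0; i < TOY_MAX_ALIASES; i++)
		if (bdev->aliases[i] == inode)
			bdev->aliases[i] = NULL;
	inode->i_bdev = NULL;
}

/* bdev eviction: refused while anything is open; dissociates every alias. */
int toy_evict_bdev(struct toy_bdev *bdev)
{
	size_t i;

	if (bdev->openers > 0)
		return -1;
	for (i = 0; i < TOY_MAX_ALIASES; i++) {
		if (bdev->aliases[i] != NULL) {
			bdev->aliases[i]->i_bdev = NULL;
			bdev->aliases[i] = NULL;
		}
	}
	return 0;
}
```

In this model a failed toy_open() leaves ->i_bdev set and the opener count
at zero, so a later toy_evict_bdev() still succeeds and clears the pointer
in every alias; nothing is left dangling.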

AFAICS, the cause of your oopsen is that 9p evict_inode is accessing the
object it has no business to touch.

Could you confirm that the patch below fixes your problem?

diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 699941e..5110785 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -451,9 +451,9 @@ void v9fs_evict_inode(struct inode *inode)
 {
 	struct v9fs_inode *v9inode = V9FS_I(inode);
 
-	truncate_inode_pages_final(inode->i_mapping);
+	truncate_inode_pages_final(&inode->i_data);
 	clear_inode(inode);
-	filemap_fdatawrite(inode->i_mapping);
+	filemap_fdatawrite(&inode->i_data);
 
 	v9fs_cache_inode_put_cookie(inode);
 	/* clunk the fid stashed in writeback_fid */