lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 29 Feb 2016 13:09:24 +0000
From:	Al Viro <viro@...IV.linux.org.uk>
To:	Dmitry Vyukov <dvyukov@...gle.com>
Cc:	Ian Kent <raven@...maw.net>,
	Mickaël Salaün <mic@...ikod.net>,
	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	syzkaller <syzkaller@...glegroups.com>,
	Kostya Serebryany <kcc@...gle.com>,
	Alexander Potapenko <glider@...gle.com>,
	Sasha Levin <sasha.levin@...cle.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	David Howells <dhowells@...hat.com>
Subject: Re: fs: NULL deref in atime_needs_update

On Sun, Feb 28, 2016 at 08:01:01PM +0000, Al Viro wrote:
> On Sun, Feb 28, 2016 at 05:01:34PM +0000, Al Viro wrote:
> 
> > Erm...  What's to order ->d_inode and ->d_flags fetches there?  David?
> > Looks like the barrier in d_is_negative() is on the wrong side of fetch.
> > Confused...
> 
> OK, as per David's suggestion, let's flip them around, bringing the
> barrier in d_is_negative() between them.  Dmitry, could you try this on
> top of mainline?  Again, it's until the first warning.

Hmm...  Reordering is definitely wrong, but what I really wonder is if
dentry_rcuwalk_invalidate() is right outside of __d_drop().  IOW, is
it right in __d_instantiate() and dentry_unlink_inode()?  The code
dealing with ->d_flags in RCU mode is more interested in coherency between
->d_flags and ->d_inode and it looks like we'd need *two* increments -
even-to-odd before updating both and odd-to-even after both are in sync.
The more I look at the situation with d_is_...() wrt barriers and ->d_seq,
the less I understand it; outside of RCU mode we don't really need the
barriers for that stuff and in RCU mode ->d_flags handling had been
a serious headache all along...

I'm tempted to do as below; the amount of smp_wmb() remains the same and
so's the amount of stores (splitting that += 2 in two doesn't affect that -
we dirty the same cacheline before and after anyway).  OTOH, that would
mean that ->d_seq match guarantees ->d_flags and ->d_inode being in sync.  
And I suspect that we could drop _read_ barriers in d_is_...() after that;
in non-RCU mode we don't give a damn anyway and in RCU one ->d_seq check
would provide one; it doesn't really matter on x86, but smp_rmb() may be
costly.  Splitting ->d_seq increments *does* matter on x86 wrt correctness;
in-between state becomes guaranteed ->d_seq mismatch and that just might
be it.  Dmitry, could you add this on top of the previous patch?

David, Linus, do you see any problems with that?  To me it looks saner
that way and as cheap as the current code, but I might be missing something 
here...

diff --git a/fs/dcache.c b/fs/dcache.c
index 92d5140..2c08cce 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -279,7 +279,6 @@ static inline void __d_set_inode_and_type(struct dentry *dentry,
 	unsigned flags;
 
 	dentry->d_inode = inode;
-	smp_wmb();
 	flags = READ_ONCE(dentry->d_flags);
 	flags &= ~(DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU);
 	flags |= type_flags;
@@ -300,7 +299,6 @@ static inline void __d_clear_type_and_inode(struct dentry *dentry)
 
 	flags &= ~(DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU);
 	WRITE_ONCE(dentry->d_flags, flags);
-	smp_wmb();
 	dentry->d_inode = NULL;
 }
 
@@ -370,9 +368,11 @@ static void dentry_unlink_inode(struct dentry * dentry)
 	__releases(dentry->d_inode->i_lock)
 {
 	struct inode *inode = dentry->d_inode;
+
+	raw_write_seqcount_begin(&dentry->d_seq);
 	__d_clear_type_and_inode(dentry);
 	hlist_del_init(&dentry->d_u.d_alias);
-	dentry_rcuwalk_invalidate(dentry);
+	raw_write_seqcount_end(&dentry->d_seq);
 	spin_unlock(&dentry->d_lock);
 	spin_unlock(&inode->i_lock);
 	if (!inode->i_nlink)
@@ -1758,8 +1758,9 @@ static void __d_instantiate(struct dentry *dentry, struct inode *inode)
 	spin_lock(&dentry->d_lock);
 	if (inode)
 		hlist_add_head(&dentry->d_u.d_alias, &inode->i_dentry);
+	raw_write_seqcount_begin(&dentry->d_seq);
 	__d_set_inode_and_type(dentry, inode, add_flags);
-	dentry_rcuwalk_invalidate(dentry);
+	raw_write_seqcount_end(&dentry->d_seq);
 	spin_unlock(&dentry->d_lock);
 	fsnotify_d_instantiate(dentry, inode);
 }

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ