[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20140430195918.GS18016@ZenIV.linux.org.uk>
Date: Wed, 30 Apr 2014 20:59:18 +0100
From: Al Viro <viro@...IV.linux.org.uk>
To: Miklos Szeredi <miklos@...redi.hu>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Dave Chinner <david@...morbit.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: dcache shrink list corruption?
On Wed, Apr 30, 2014 at 08:02:27PM +0100, Al Viro wrote:
> On Wed, Apr 30, 2014 at 08:42:01PM +0200, Miklos Szeredi wrote:
>
> > Message-ID: <20140430154958.GC3113@...sk.piliscsaba.szeredi.hu>
>
> I see... Several points:
> * I still think that it's better to do handling of "we see
> DCACHE_DENTRY_KILLED already set" in dentry_kill() itself.
> * in dentry_kill(dentry, 0) we *know* that it's not on a shrink
> list - the caller has just removed it from there and we'd kept ->d_lock
> since that point. What's more, with that scheme we don't need to bother
> with can_free at all - just grab ->d_lock after ->d_release() call and check
> DCACHE_SHRINK_LIST. No sense trying to avoid that - in case where we
> could just go ahead and free the sucker, there's neither contention nor
> anybody else interested in that cacheline, so spin_lock(&dentry->d_lock)
> is pretty much free.
>
> IOW, I'd prefer to do the following (completely untested) on top of 1/6--4/6:
Sigh... One more problem:
/*
* We found an inuse dentry which was not removed from
* the LRU because of laziness during lookup. Do not free it.
*/
if (dentry->d_lockref.count) {
spin_unlock(&dentry->d_lock);
continue;
}
should become
if (dentry->d_lockref.count > 0) {
....
instead - lockref_mark_dead() slaps -128 into it, so we'll just leak all
dentries where dput() has come first and decided to leave the suckers
for us.
Another thing: I don't like what's going on with freeing vs. ->d_lock there.
Had that been a mutex, we'd definitely get a repeat of "vfs: fix subtle
use-after-free of pipe_inode_info". The question is, can spin_unlock(p)
dereference p after another CPU gets through spin_lock(p)? Linus?
It can be dealt with by setting DCACHE_RCUACCESS in "let another guy free
it" cases and playing with rcu_read_lock a bit, but I wonder whether we
need to bother - quick look through alpha/sparc/ppc shows that on those
we do not and the same is true for non-paravirt case on x86. I hadn't
checked what paravirt one does, though, and I certainly hadn't done
exhaustive search for architectures doing something weird...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists