Date:	Thu, 24 Sep 2009 09:10:59 -0700
From:	Sage Weil <sage@...dream.net>
To:	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Cc:	viro@...IV.linux.org.uk, hch@...radead.org, adilger@....com,
	yehuda@...dream.net, garlick@...l.gov, raven@...maw.net,
	Sage Weil <sage@...dream.net>,
	Al Viro <viro@...iv.linux.org.uk>
Subject: [PATCH] vfs: make real_lookup do dentry revalidation with i_mutex held

real_lookup() is called by do_lookup() if dentry revalidation fails.  If
the cache is re-populated while real_lookup() waits for i_mutex, its
d_lookup() may subsequently succeed (see the "Uhhuh! Nasty case" comment).

Previously, real_lookup() would drop i_mutex and do_revalidate() again. If
revalidate failed _again_, however, it would give up with -ENOENT.  The
problem here is that network file systems may be invalidating dentries via
server callbacks, e.g. due to concurrent access from another client, and
-ENOENT is frequently the wrong answer.

This problem has been seen with both Lustre and Ceph.  It seems possible
to hit this case with NFS as well if the cache lifetime is very short.

Instead, we should do_revalidate() while i_mutex is still held.  If
revalidation fails, we can move on to a ->lookup() and ensure a correct
result without worrying about any subsequent races.
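
As a rough illustration of the difference, here is a single-threaded
userspace sketch.  It is only a toy model, not kernel code: every toy_*
name is made up for this example, a bool stands in for i_mutex, and the
check for d_op->d_revalidate is left out.  It assumes the cached dentry
has already been invalidated by the time the lookup runs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; these are not the kernel's types or functions. */
struct toy_dentry {
	bool valid;	/* false once the "server" has invalidated the entry */
};

struct toy_dir {
	bool locked;			/* models dir->i_mutex being held */
	struct toy_dentry *cached;	/* what d_lookup() would find */
	struct toy_dentry fresh;	/* what ->lookup() would instantiate */
};

static void toy_lock(struct toy_dir *dir)
{
	assert(!dir->locked);
	dir->locked = true;
}

static void toy_unlock(struct toy_dir *dir)
{
	assert(dir->locked);
	dir->locked = false;
}

/* Models do_revalidate(): the dentry if still valid, otherwise NULL. */
static struct toy_dentry *toy_revalidate(struct toy_dentry *d)
{
	return d->valid ? d : NULL;
}

/* Models the filesystem ->lookup(); must run with the lock held. */
static struct toy_dentry *toy_fs_lookup(struct toy_dir *dir)
{
	assert(dir->locked);
	dir->fresh.valid = true;
	dir->cached = &dir->fresh;
	return dir->cached;
}

/* Old flow: drop the lock, revalidate, give up with -ENOENT on failure. */
static int toy_lookup_old(struct toy_dir *dir, struct toy_dentry **out)
{
	struct toy_dentry *result;

	toy_lock(dir);
	result = dir->cached;
	if (!result) {
		result = toy_fs_lookup(dir);
		toy_unlock(dir);
		*out = result;
		return 0;
	}
	toy_unlock(dir);
	result = toy_revalidate(result);
	if (!result)
		return -ENOENT;
	*out = result;
	return 0;
}

/* New flow: revalidate under the lock, fall back to ->lookup() on failure. */
static int toy_lookup_new(struct toy_dir *dir, struct toy_dentry **out)
{
	struct toy_dentry *result;

	toy_lock(dir);
	result = dir->cached;
	if (result)
		result = toy_revalidate(result);
	if (!result)
		result = toy_fs_lookup(dir);
	toy_unlock(dir);
	*out = result;
	return 0;
}

/* Runs one lookup against a directory whose cached dentry is stale. */
static int toy_demo_stale(bool use_new)
{
	struct toy_dentry stale = { .valid = false };
	struct toy_dir dir = { .locked = false, .cached = &stale };
	struct toy_dentry *out = NULL;

	return use_new ? toy_lookup_new(&dir, &out) : toy_lookup_old(&dir, &out);
}
```

In this model toy_demo_stale(false) takes the old path and fails with
-ENOENT, while toy_demo_stale(true) falls through to the lookup and
succeeds.  A real race needs a second thread or a server callback; the
sketch only fixes the stale state up front to show the two control flows.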

Note that do_revalidate() is called with i_mutex held elsewhere.  For
example, do_filp_open(), lookup_create(), do_unlinkat(), do_rmdir(),
and possibly others all take the directory i_mutex, and then

-> lookup_hash
        -> __lookup_hash
                -> cached_lookup
                        -> do_revalidate

so this does not introduce any new locking rules for d_revalidate
implementations.

Yes, the goto is ugly.  A cleanup patch follows.

CC: Ian Kent <raven@...maw.net>
CC: Christoph Hellwig <hch@...radead.org>
CC: Al Viro <viro@...iv.linux.org.uk>
CC: Andreas Dilger <adilger@....com>
Signed-off-by: Yehuda Sadeh <yehuda@...dream.net>
Signed-off-by: Sage Weil <sage@...dream.net>
---
 fs/namei.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index d11f404..f74ddb3 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -497,6 +497,7 @@ static struct dentry * real_lookup(struct dentry * parent, struct qstr * name, s
 	if (!result) {
 		struct dentry *dentry;
 
+do_the_lookup:
 		/* Don't create child dentry for a dead directory. */
 		result = ERR_PTR(-ENOENT);
 		if (IS_DEADDIR(dir))
@@ -520,12 +521,12 @@ out_unlock:
 	 * Uhhuh! Nasty case: the cache was re-populated while
 	 * we waited on the semaphore. Need to revalidate.
 	 */
-	mutex_unlock(&dir->i_mutex);
 	if (result->d_op && result->d_op->d_revalidate) {
 		result = do_revalidate(result, nd);
 		if (!result)
-			result = ERR_PTR(-ENOENT);
+			goto do_the_lookup;
 	}
+	mutex_unlock(&dir->i_mutex);
 	return result;
 }
 
-- 
1.5.6.5
