linux-kernel - Re: [PATCH] reconnect_one(): fix a missing error code


Message-ID: <87lgou6xqm.fsf@notabene.neil.brown.name>
Date:   Thu, 15 Jun 2017 07:54:57 +1000
From:   NeilBrown <neilb@...e.com>
To:     "J. Bruce Fields" <bfields@...ldses.org>,
        Dan Carpenter <dan.carpenter@...cle.com>
Cc:     "J. Bruce Fields" <bfields@...hat.com>,
        David Howells <dhowells@...hat.com>,
        Al Viro <viro@...iv.linux.org.uk>,
        Ingo Molnar <mingo@...nel.org>, linux-kernel@...r.kernel.org,
        kernel-janitors@...r.kernel.org
Subject: Re: [PATCH] reconnect_one(): fix a missing error code

On Wed, Jun 14 2017, J. Bruce Fields wrote:

> On Wed, Jun 14, 2017 at 12:30:02PM +0300, Dan Carpenter wrote:
>> I found this bug by reviewing places where we do ERR_PTR(0) (which is
>> NULL).
>> 
>> We used to return an error pointer if lookup_one_len() failed but we
>> moved this code into a helper function and accidentally removed that.
>> NULL is a valid return for this function but it's not what we intended.
>> 
>> Fixes: bbf7a8a3562f ("exportfs: move most of reconnect_path to helper function")
>> Signed-off-by: Dan Carpenter <dan.carpenter@...cle.com>
>
> ACK.  Agreed that the current code is wrong, and that this is the
> correct fix.
>
> What I don't quite understand yet is what the impact of the bug would
> be.
>

It is interesting that reconnect_path() handles the possibility of
reconnect_one() returning NULL, even though it will only do that if this
"bug" is triggered.
When that happens, the target_dir (a descendant of dentry) gets its
DCACHE_DISCONNECTED flag cleared.

The bug can presumably only be triggered by a race.
We look through a directory to find the name for an inode
(exportfs_get_name), then try to look up that name and it doesn't exist.

So presumably if you lose the race, some dentry will get
DCACHE_DISCONNECTED cleared, even though it is still disconnected.
This breaks a contract and can cause weirdness in dcache operations.
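
To make the failure mode concrete, here is a minimal userspace sketch
(not kernel code).  ERR_PTR/PTR_ERR/IS_ERR are simplified stand-ins for
the kernel helpers, and model_reconnect_one() with its "propagate" flag
is a hypothetical model of just the failing-lookup path, not the real
function.  It only illustrates that ERR_PTR(0) is NULL, so a caller
checking IS_ERR() cannot see the failure unless err is set first.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's err.h helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/*
 * Hypothetical model of the lookup-failure path in reconnect_one().
 * lookup_errno: positive errno the pretend lookup fails with.
 * propagate:    nonzero models the patched code (err = PTR_ERR(tmp)).
 */
static void *model_reconnect_one(int lookup_errno, int propagate)
{
	void *tmp = ERR_PTR(-lookup_errno);	/* pretend the lookup failed */
	int err = 0;

	if (IS_ERR(tmp)) {
		if (propagate)
			err = PTR_ERR(tmp);	/* the line added by the patch */
		goto out_err;
	}
	return tmp;

out_err:
	/* With err == 0 this is ERR_PTR(0), i.e. NULL, not an error pointer. */
	return ERR_PTR(err);
}
```

With propagate == 0 the model returns NULL for a failed lookup, which is
the value reconnect_path() treats as "nothing more to do"; with
propagate != 0 it returns an error pointer carrying -ENOENT.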

If lookup_one_len_unlocked() fails, we should probably retry, at
least once.  But if we do decide to give up, we shouldn't assume it all
worked.

So I suggest:
 - the fix as provided by Dan, plus
 - remove "if (!parent) break;" from reconnect_path(), plus
 - maybe retry the get_name/lookup_one operation once if the first
    attempt fails.
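
The "retry once" idea in the last item could look roughly like the
userspace sketch below.  All names here (attempt_fn,
name_then_lookup_with_retry, fail_n_times) are hypothetical and nothing
in it is taken from expfs.c; the callback stands in for one
get_name + lookup pass and returns 0 or a negative errno.  Whether a
retry should apply to every error or only to the "name vanished" case is
left open.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical callback: one get_name + lookup attempt; 0 or -errno. */
typedef int (*attempt_fn)(void *ctx);

/*
 * Run the attempt, and if it fails, run it exactly once more.
 * Returns 0 on success, otherwise the error from the last attempt,
 * so the caller never mistakes "gave up" for "it all worked".
 */
static int name_then_lookup_with_retry(attempt_fn attempt, void *ctx)
{
	int err = 0;

	for (int tries = 0; tries < 2; tries++) {
		err = attempt(ctx);
		if (err == 0)
			return 0;
	}
	return err;
}

/* Test double: fails with -ENOENT for the first *ctx calls, then succeeds. */
static int fail_n_times(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -ENOENT;
	}
	return 0;
}
```

A single lost race is absorbed by the second attempt; two failures in a
row are reported to the caller as an error rather than swallowed.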

NeilBrown


> --b.
>
>> 
>> diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
>> index 329a5d103846..451237745689 100644
>> --- a/fs/exportfs/expfs.c
>> +++ b/fs/exportfs/expfs.c
>> @@ -147,6 +147,7 @@ static struct dentry *reconnect_one(struct vfsmount *mnt,
>>  	tmp = lookup_one_len_unlocked(nbuf, parent, strlen(nbuf));
>>  	if (IS_ERR(tmp)) {
>>  		dprintk("%s: lookup failed: %d\n", __func__, PTR_ERR(tmp));
>> +		err = PTR_ERR(tmp);
>>  		goto out_err;
>>  	}
>>  	if (tmp != dentry) {
