lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 22 Apr 2019 23:02:28 +0800
From:   Hou Tao <houtao1@...wei.com>
To:     <viro@...iv.linux.org.uk>, <linux-fsdevel@...r.kernel.org>
CC:     <linux-kernel@...r.kernel.org>, <yi.zhang@...wei.com>
Subject: Re: [PATCH] dcache: ensure d_flags & d_inode are consistent in
 lookup_fast()

ping ?

On 2019/4/19 16:48, Hou Tao wrote:
> After extending the size of dentry from 192-bytes to 208-bytes
> under aarch64, we got oops during the running of xfstests generic/429:
> 
>   Unable to handle kernel NULL pointer dereference at virtual address 0000000000000002
>   CPU: 3 PID: 2725 Comm: t_encrypted_d_r Tainted: G      D           5.1.0-rc4
>   pc : inode_permission+0x28/0x160
>   lr : link_path_walk.part.11+0x27c/0x528
>   ......
>   Call trace:
>    inode_permission+0x28/0x160
>    link_path_walk.part.11+0x27c/0x528
>    path_lookupat+0x64/0x208
>    filename_lookup+0xa0/0x178
>    user_path_at_empty+0x58/0x70
>    vfs_statx+0x94/0x118
>    __se_sys_newfstatat+0x58/0x98
>    __arm64_sys_newfstatat+0x24/0x30
>    el0_svc_common+0x7c/0x148
>    el0_svc_handler+0x38/0x88
>    el0_svc+0x8/0xc
> 
> If we revert the size extension of dentry, the oops will be gone.
> However if we just move the d_inode field from the begin of dentry
> struct to the end of dentry struct (poorly simulate the way how
> __randomize_layout works), the oops will reoccur.
> 
> The following scenario illustrates the problem:
> 
> precondition:
> * dentry A has just been unlinked and becomes a negative dentry
> * dentry A is encrypted, so it has d_revalidate hook: fscrypt_d_revalidate()
> * lookup process is looking A/file, and creation process is creating A
> 
> lookup process:                 creation process:
> 
> lookup_fast
>     __d_lookup_rcu returns dentry A
> 
>     d_revalidate returns -ECHILD
> 
>     d_revalidate again succeed
> 				__d_set_inode_and_type
> 				dentry->d_inode = inode
> 				WRITE_ONCE(dentry->d_flags, flags)
> 
>     d_is_negative(dentry) return false
>     follow_managed doesn't nothing
>     // inconsistent with d_flags
>     d_backing_inode() return NULL
>     nd->inode = NULL
> 
> may_lookup()
>     // oops occurs
>     inode_permission(nd->inode
> 
> The root cause is the inconsistency between d_flags & d_inode
> during the REF-walk in lookup_fast(): d_is_negative(dentry)
> returns false, but d_backing_inode() still returns a NULL pointer.
> 
> The RCU-walk path in lookup_fast() uses d_seq to ensure d_flags & d_inode
> are consistent, and lookup_slow() use inode lock to ensure that, so only
> the REF-walk path in lookup_fast() is problematic.
> 
> Fixing it by adding a paired smp_rmb/smp_wmb between the reading/writing
> of d_inode & d_flags to ensure the consistency.
> 
> Signed-off-by: Hou Tao <houtao1@...wei.com>
> ---
>  fs/dcache.c | 2 ++
>  fs/namei.c  | 7 +++++++
>  2 files changed, 9 insertions(+)
> 
> diff --git a/fs/dcache.c b/fs/dcache.c
> index aac41adf4743..1eb85f9fcb0f 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -316,6 +316,8 @@ static inline void __d_set_inode_and_type(struct dentry *dentry,
>  	unsigned flags;
>  
>  	dentry->d_inode = inode;
> +	/* paired with smp_rmb() in lookup_fast() */
> +	smp_wmb();
>  	flags = READ_ONCE(dentry->d_flags);
>  	flags &= ~(DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU);
>  	flags |= type_flags;
> diff --git a/fs/namei.c b/fs/namei.c
> index dede0147b3f6..833f760c70b2 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -1628,6 +1628,13 @@ static int lookup_fast(struct nameidata *nd,
>  		return -ENOENT;
>  	}
>  
> +	/*
> +	 * Paired with smp_wmb() in __d_set_inode_and_type() to ensure
> +	 * d_backing_inode is not NULL after the checking of d_flags
> +	 * in d_is_negative() completes.
> +	 */
> +	smp_rmb();
> +
>  	path->mnt = mnt;
>  	path->dentry = dentry;
>  	err = follow_managed(path, nd);
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ