lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 26 Sep 2017 10:31:14 +1000
From:   Dave Chinner <david@...morbit.com>
To:     Ross Zwisler <ross.zwisler@...ux.intel.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        linux-kernel@...r.kernel.org,
        "Darrick J. Wong" <darrick.wong@...cle.com>,
        "J. Bruce Fields" <bfields@...ldses.org>,
        Christoph Hellwig <hch@....de>,
        Dan Williams <dan.j.williams@...el.com>,
        Jan Kara <jack@...e.cz>, Jeff Layton <jlayton@...chiereds.net>,
        linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
        linux-nvdimm@...ts.01.org, linux-xfs@...r.kernel.org
Subject: Re: [PATCH 7/7] xfs: re-enable XFS per-inode DAX

On Mon, Sep 25, 2017 at 05:14:04PM -0600, Ross Zwisler wrote:
> Re-enable the XFS per-inode DAX flag, preventing S_DAX from changing when
> any mappings are present.
> 
> Signed-off-by: Ross Zwisler <ross.zwisler@...ux.intel.com>
> ---
>  fs/xfs/xfs_ioctl.c | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index 386b437..7a24dd5 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -1012,12 +1012,10 @@ xfs_diflags_to_linux(
>  		inode->i_flags |= S_NOATIME;
>  	else
>  		inode->i_flags &= ~S_NOATIME;
> -#if 0	/* disabled until the flag switching races are sorted out */
>  	if ((xflags & FS_XFLAG_DAX) || (ip->i_mount->m_flags & XFS_MOUNT_DAX))
>  		inode->i_flags |= S_DAX;
>  	else
>  		inode->i_flags &= ~S_DAX;
> -#endif
>  }
>  
>  static bool
> @@ -1049,6 +1047,8 @@ xfs_ioctl_setattr_xflags(
>  {
>  	struct xfs_mount	*mp = ip->i_mount;
>  	uint64_t		di_flags2;
> +	struct address_space	*mapping = VFS_I(ip)->i_mapping;
> +	bool			dax_changing;
>  
>  	/* Can't change realtime flag if any extents are allocated. */
>  	if ((ip->i_d.di_nextents || ip->i_delayed_blks) &&
> @@ -1084,10 +1084,23 @@ xfs_ioctl_setattr_xflags(
>  	if (di_flags2 && ip->i_d.di_version < 3)
>  		return -EINVAL;
>  
> +	dax_changing = xfs_is_dax_state_changing(fa->fsx_xflags, ip);
> +	if (dax_changing) {
> +		i_mmap_lock_read(mapping);
> +		if (mapping_mapped(mapping)) {
> +			i_mmap_unlock_read(mapping);
> +			return -EBUSY;
> +		}
> +	}
> +
>  	ip->i_d.di_flags = xfs_flags2diflags(ip, fa->fsx_xflags);
>  	ip->i_d.di_flags2 = di_flags2;
>  
>  	xfs_diflags_to_linux(ip);
> +
> +	if (dax_changing)
> +		i_mmap_unlock_read(mapping);

Is this safe to be taking here under the ILOCK_EXCL? i.e. this is
the lock order here:

IOLOCK_EXCL -> MMAPLOCK_EXCL -> ILOCK_EXCL -> i_mmap_rwsem

The truncate path must run outside the ILOCK
context, and it does this order via unmap_mapping_range:

IOLOCK_EXCL -> MMAPLOCK_EXCL -> i_mmap_rwsem

On a page fault, we do:

	mmap_sem -> MMAPLOCK_EXCL -> page lock -> ILOCK_EXCL

Which gives the order

IOLOCK_EXCL
  -> mmap_sem
     -> MMAPLOCK_EXCL
	-> page lock
	   -> ILOCK_EXCL
	      -> i_mmap_rwsem

What I'm not clear on is what the orders between page locks and
pte locks and i_mapping_tree_lock and i_mmap_rwsem. If there's any
locks that the filesystem can take above the ILOCK that are also
taken under the i_mmap_rwsem, then we have a deadlock vector.

Historically we've avoided any mm/ level interactions under the
ILOCK_EXCL because of it's location in the page fault path locking
order (e.g. lockdep will go nuts if we take a page fault with the
ILOCK held). Hence I'm extremely wary of putting any other mm/ level
locks under the ILOCK like this without a clear explanation of the
locking orders and why it won't deadlock....

Cheers,

Dave.

-- 
Dave Chinner
david@...morbit.com

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ