[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6c5375da-882e-2063-8ebf-007d0e6aa7e9@sandeen.net>
Date: Tue, 26 Sep 2017 13:02:50 -0500
From: Eric Sandeen <sandeen@...deen.net>
To: Dave Chinner <david@...morbit.com>, Jan Kara <jack@...e.cz>
Cc: Ross Zwisler <ross.zwisler@...ux.intel.com>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org,
"Darrick J. Wong" <darrick.wong@...cle.com>,
"J. Bruce Fields" <bfields@...ldses.org>,
Christoph Hellwig <hch@....de>,
Dan Williams <dan.j.williams@...el.com>,
Jeff Layton <jlayton@...chiereds.net>,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
linux-nvdimm@...ts.01.org, linux-xfs@...r.kernel.org
Subject: Re: [PATCH 1/7] xfs: always use DAX if mount option is used
On 9/26/17 6:09 AM, Dave Chinner wrote:
> On Tue, Sep 26, 2017 at 11:35:48AM +0200, Jan Kara wrote:
>> On Tue 26-09-17 09:38:12, Dave Chinner wrote:
>>> On Mon, Sep 25, 2017 at 05:13:58PM -0600, Ross Zwisler wrote:
>>>> Before support for the per-inode DAX flag was disabled the XFS the code had
>>>> an issue where the user couldn't reliably tell whether or not DAX was being
>>>> used to service page faults and I/O when the DAX mount option was used. In
>>>> this case each inode within the mounted filesystem started with S_DAX set
>>>> due to the mount option, but it could be cleared if someone touched the
>>>> individual inode flag.
>>>>
>>>> For example (v4.13 and before):
>>>>
>>>> # mount | grep dax
>>>> /dev/pmem0 on /mnt type xfs
>>>> (rw,relatime,seclabel,attr2,dax,inode64,sunit=4096,swidth=4096,noquota)
>>>>
>>>> # touch /mnt/a /mnt/b # both files currently use DAX
>>>>
>>>> # xfs_io -c "lsattr" /mnt/* # neither has the DAX inode option set
>>>> ----------e----- /mnt/a
>>>> ----------e----- /mnt/b
>>>>
>>>> # xfs_io -c "chattr -x" /mnt/a # this clears S_DAX for /mnt/a
>>>>
>>>> # xfs_io -c "lsattr" /mnt/*
>>>> ----------e----- /mnt/a
>>>> ----------e----- /mnt/b
>>>
>>> That's really a bug in the lsattr code, yes? If we've cleared the
>>> S_DAX flag for the inode, then why is it being reported in lsattr?
>>> Or if we failed to clear the S_DAX flag in the 'chattr -x' call,
>>> then isn't that the bug that needs fixing?
>>>
>>> Remember, the whole point of the dax inode flag was to be able to
>>> override the mount option setting so that admins could turn off/on
>>> dax for the things that didn't/did work with DAX correctly so they
>>> didn't need multiple filesystems on pmem to segregate the apps that
>>> did/didn't work with DAX...
>>
>> So I think there is some confusion that is created by the fact that whether
>> DAX is used or not is controlled by both a mount option and an inode flag.
>> We could define that "Inode flag always wins" which is what you seem to
>> suggest above but then mount option has no practical effect since on-disk
>> S_DAX flag will always overrule it.
>
> Well, quite frankly, I never wanted the mount option for XFS. It was
> supposed to be for initial testing only, then we'd /always/ use the
> the inode flags. For a filesystem to default to using DAX, we
> set the DAX flag on the root inode at mkfs time, and then everything
> inode flag based just works.
>
> But it seems that we're now stuck with the stupid, blunt, brute
> force mount option because that's what the first commit on ext4
> used. Now we're just about stuck with this silly "but we can't turn
> it off" problem because of the mount option overriding everything.
I don't think the existence of a mount option in ext4 makes us any
more "stuck" than the mount option in xfs does.
fs/xfs/xfs_super.c: "DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
fs/ext4/super.c: "DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
so when^wif this argument ever gets settled, I think there is plenty
of latitude to do the right thing, potentially breaking the old thing.
> If we have to keep the mount option, then lets fix it to mean "mount
> option sets inheritable inode flag on directory creation" and
> /maybe/ "mount option sets inode flag on file creation".
>
> This then allows the inode flag to control everything else. i.e the
> mount option sets the initial flag value rather than the behaviour
> of the inode. The behaviour of the inode should be entirely
> controlled by the inode flag, hence after initial creation the
> chattr +/-x commands do what they advertise regardless of the mount
> option value.
>
> Yes, it means that existing users are going to have to run chattr -R
> +x on their pmem filesystems to get the inode flags on disk, but
> this is all tagged with EXPERIMENTAL and this is the sort of change
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> that is expected from experimental functionality.
Right.
-Eric
> Cheers,
>
> Dave.
>
Powered by blists - more mailing lists