lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 9 Apr 2020 09:59:27 -0700
From:   "Darrick J. Wong" <darrick.wong@...cle.com>
To:     Ira Weiny <ira.weiny@...el.com>
Cc:     Dave Chinner <david@...morbit.com>, linux-kernel@...r.kernel.org,
        Dan Williams <dan.j.williams@...el.com>,
        Christoph Hellwig <hch@....de>,
        "Theodore Y. Ts'o" <tytso@....edu>, Jan Kara <jack@...e.cz>,
        Jeff Moyer <jmoyer@...hat.com>, linux-ext4@...r.kernel.org,
        linux-xfs@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH V6 6/8] fs/xfs: Combine xfs_diflags_to_linux() and
 xfs_diflags_to_iflags()

On Thu, Apr 09, 2020 at 08:29:44AM -0700, Ira Weiny wrote:
> On Wed, Apr 08, 2020 at 05:30:21PM -0700, Darrick J. Wong wrote:
> 
> [snip]
> 
> > 
> > But you're right, this thing keeps swirling around and around and around
> > because we can't ever get to agreement on this.  Maybe I'll just become
> > XFS BOFH MAINTAINER and make a decision like this:
> > 
> >  1 Applications must call statx to discover the current S_DAX state.
> > 
> >  2 There exists an advisory file inode flag FS_XFLAG_DAX that is set based on
> >    the parent directory FS_XFLAG_DAX inode flag.  This advisory flag can be
> >    changed after file creation, but it does not immediately affect the S_DAX
> >    state.
> > 
> >    If FS_XFLAG_DAX is set and the fs is on pmem then it will enable S_DAX at
> >    inode load time; if FS_XFLAG_DAX is not set, it will not enable S_DAX.
> >    Unless overridden...
> > 
> >  3 There exists a dax= mount option.
> > 
> >    "-o dax=never" means "never set S_DAX, ignore FS_XFLAG_DAX"
> >    "-o dax=always" means "always set S_DAX (at least on pmem), ignore FS_XFLAG_DAX"
> >         "-o dax" by itself means "dax=always"
> >    "-o dax=iflag" means "follow FS_XFLAG_DAX" and is the default
> 
> per-Dave '-o dax=inode'

Ok.

> > 
> >  4 There exists an advisory directory inode flag FS_XFLAG_DAX that can be
> >    changed at any time.  The flag state is copied into any files or
> >    subdirectories when they are created within that directory.
> 
> Good.
> 
> >    If programs
> >    require file access runs in S_DAX mode, they must create those files
> >    inside a directory with FS_XFLAG_DAX set, or mount the fs with an
> >    appropriate dax mount option.
> 
> Why do we need this to be true?  If the FS_XFLAG_DAX flag can be cleared why
> not set it and allow the S_DAX change to occur later just like clearing it?
> The logic is exactly the same.

I think I'll just delete this sentence since I started pushing all that
verbiage towards (5).

To answer your question, yes, FS_XFLAG_DAX can be set on a file at any
time, the same as it can be cleared at any time.  Sorry that was
unclear, I'll fix that for the next draft (below).

> > 
> >  5 Programs that require a specific file access mode (DAX or not DAX) must
> 
> s/must/can/

Ok.

> >    do one of the following:
> > 
> >    (a) create files in directories with the FS_XFLAG_DAX flag set as needed;
> 
> Again if we allow clearing the flag why not setting?  So this is 1 option they
> 'can' do.
> 
> > 
> >    (b) have the administrator set an override via mount option;
> > 
> >    (c) if they need to change a file's FS_XFLAG_DAX flag so that it does not
> >        match the S_DAX state (as reported by statx), they must cause the
> >        kernel to evict the inode from memory.  This can be done by:
> > 
> >        i>   closing the file;
> >        ii>  re-opening the file and using statx to see if the fs has
> >             changed the S_DAX flag;
> 
> i and ii need to be 1 step the user must follow.

Ok, I'll combine these.

> >        iii> if not, either unmount and remount the filesystem, or
> >             closing the file and using drop_caches.
> > 
> >  6 I no longer think it's too wild to require that users who want to
> >    squeeze every last bit of performance out of the particular rough and
> >    tumble bits of their storage also be exposed to the difficulties of
> >    what happens when the operating system can't totally virtualize those
> >    hardware capabilities.  Your high performance sports car is not a
> >    Toyota minivan, as it were.
> 
> I'm good with this statement.  But I think we need to clean up the verbiage for
> the documentation...  ;-)

Heh. :)

> Thanks for the summary.  I like these to get everyone on the same page.  :-D

And today:

 1. There exists an in-kernel access mode flag S_DAX that is set when
    file accesses go directly to persistent memory, bypassing the page
    cache.  Applications must call statx to discover the current S_DAX
    state (STATX_ATTR_DAX).

 2. There exists an advisory file inode flag FS_XFLAG_DAX that is
    inherited from the parent directory FS_XFLAG_DAX inode flag at file
    creation time.  This advisory flag can be set or cleared at any
    time, but doing so does not immediately affect the S_DAX state.

    Unless overridden by mount options (see (3)), if FS_XFLAG_DAX is set
    and the fs is on pmem then it will enable S_DAX at inode load time;
    if FS_XFLAG_DAX is not set, it will not enable S_DAX.

 3. There exists a dax= mount option.

    "-o dax=never"  means "never set S_DAX, ignore FS_XFLAG_DAX."

    "-o dax=always" means "always set S_DAX (at least on pmem),
                    and ignore FS_XFLAG_DAX."

    "-o dax"        is an alias for "dax=always".

    "-o dax=inode"  means "follow FS_XFLAG_DAX" and is the default.

 4. There exists an advisory directory inode flag FS_XFLAG_DAX that can
    be set or cleared at any time.  The flag state is copied into any
    files or subdirectories when they are created within that directory.

 5. Programs that require a specific file access mode (DAX or not DAX)
    can do one of the following:

    (a) Create files in directories that the FS_XFLAG_DAX flag set as
        needed; or

    (b) Have the administrator set an override via mount option; or

    (c) Set or clear the file's FS_XFLAG_DAX flag as needed.  Programs
        must then cause the kernel to evict the inode from memory.  This
        can be done by:

        i>  Closing the file and re-opening the file and using statx to
            see if the fs has changed the S_DAX flag; and

        ii> If the file still does not have the desired S_DAX access
            mode, either unmount and remount the filesystem, or close
            the file and use drop_caches.

 6. It's not unreasonable that users who want to squeeze every last bit
    of performance out of the particular rough and tumble bits of their
    storage also be exposed to the difficulties of what happens when the
    operating system can't totally virtualize those hardware
    capabilities.  Your high performance sports car is not a Toyota
    minivan, as it were.

Given our overnight discussions, I don't think it'll be difficult to
hoist XFS_IDONTCACHE to the VFS so that 5.c.i is enough to change the
S_DAX state if nobody else is using the file.

--D

> > 
> > > Furthermore, if we did want an interface like that why not allow
> > > the on-disk flag to be set as well as cleared?
> > 
> > Well, why not - it's why I implemented the flag in the first place!
> > The only problem we have here is how to safely change the in-memory
> > DAX state, and that largely has nothing to do with setting/clearing
> > the on-disk flag....
> 
> With the above change to xfs_diflags_to_iflags() I think we are ok here.
> 
> Ira
> 
--D

> Ira
> 
> > 
> > I think (like Dave said) that if you set XFS_IDONTCACHE on the inode
> > when you change the DAX flag, the VFS will kill the inode the instant
> > the last user close()s the file.  Then 5.c.ii will actually work.
> > 
> > --D
> > 
> > > > 
> > > > > Furthermore, if we did want an interface like that why not allow
> > > > > the on-disk flag to be set as well as cleared?
> > > > 
> > > > Well, why not - it's why I implemented the flag in the first place!
> > > > The only problem we have here is how to safely change the in-memory
> > > > DAX state, and that largely has nothing to do with setting/clearing
> > > > the on-disk flag....
> > > 
> > > With the above change to xfs_diflags_to_iflags() I think we are ok here.
> > > 
> > > Ira
> > > 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ