lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201112151046.GD27697@quack2.suse.cz>
Date:   Thu, 12 Nov 2020 16:10:46 +0100
From:   Jan Kara <jack@...e.cz>
To:     Michal Suchánek <msuchanek@...e.de>
Cc:     Dave Chinner <david@...morbit.com>,
        "Darrick J. Wong" <darrick.wong@...cle.com>,
        linux-xfs@...r.kernel.org, linux-ext4@...r.kernel.org,
        Theodore Ts'o <tytso@....edu>,
        Andreas Dilger <adilger.kernel@...ger.ca>,
        Ira Weiny <ira.weiny@...el.com>, Jan Kara <jack@...e.cz>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] xfs: show the dax option in mount options.

On Thu 12-11-20 12:12:17, Michal Suchánek wrote:
> On Thu, Nov 12, 2020 at 12:49:52PM +1100, Dave Chinner wrote:
> > On Wed, Nov 11, 2020 at 11:28:48AM +0100, Michal Suchánek wrote:
> > > On Tue, Nov 10, 2020 at 08:08:23AM +1100, Dave Chinner wrote:
> > > > On Mon, Nov 09, 2020 at 09:27:05PM +0100, Michal Suchánek wrote:
> > > > > On Mon, Nov 09, 2020 at 11:24:19AM -0800, Darrick J. Wong wrote:
> > > > > > On Mon, Nov 09, 2020 at 08:10:08PM +0100, Michal Suchanek wrote:
> > > > > > > xfs accepts both dax and dax_enum but shows only dax_enum. Show both
> > > > > > > options.
> > > > > > > 
> > > > > > > Fixes: 8d6c3446ec23 ("fs/xfs: Make DAX mount option a tri-state")
> > > > > > > Signed-off-by: Michal Suchanek <msuchanek@...e.de>
> > > > > > > ---
> > > > > > >  fs/xfs/xfs_super.c | 2 +-
> > > > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > > 
> > > > > > > diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> > > > > > > index e3e229e52512..a3b00003840d 100644
> > > > > > > --- a/fs/xfs/xfs_super.c
> > > > > > > +++ b/fs/xfs/xfs_super.c
> > > > > > > @@ -163,7 +163,7 @@ xfs_fs_show_options(
> > > > > > >  		{ XFS_MOUNT_GRPID,		",grpid" },
> > > > > > >  		{ XFS_MOUNT_DISCARD,		",discard" },
> > > > > > >  		{ XFS_MOUNT_LARGEIO,		",largeio" },
> > > > > > > -		{ XFS_MOUNT_DAX_ALWAYS,		",dax=always" },
> > > > > > > +		{ XFS_MOUNT_DAX_ALWAYS,		",dax,dax=always" },
> > > > > > 
> > > > > > NAK, programs that require DAX semantics for files stored on XFS must
> > > > > > call statx to detect the STATX_ATTR_DAX flag, as outlined in "Enabling
> > > > > > DAX on xfs" in Documentation/filesystems/dax.txt.
> > > > > statx can be used to query S_DAX.  NOTE that only regular files will
> > > > > ever have S_DAX set and therefore statx will never indicate that S_DAX
> > > > > is set on directories.
> > > > 
> > > > Yup, by design.
> > > > 
> > > > The application doesn't need to do anything complex to make this
> > > > work. If the app wants to use DAX, then it should use
> > > > FS_IOC_FS{GS}ETXATTR to always set the on disk per inode DAX flags
> > > > for it's data dirs and files, and then STATX_ATTR_DAX will *always*
> > > > tell it whether DAX is actively in use at runtime. It's pretty
> > > > simple, really.
> > > > 
> > > > > The filesystem may not have any files so statx cannot be used.
> > > > 
> > > > Really?  The app or installer is about to *write to the fs* and has
> > > > all the permissions it needs to modify the contents of the fs. It's
> > > > pretty simple to create a tmpdir, set the DAX flag on the tmpdir,
> > > > then create a tmpfile in the tmpdir and run STATX_ATTR_DAX on it to
> > > > see if DAX is active or not.....
> > > 
> > > Have you ever seen a 'wizard' style installer?
> > 
> > I wrote my first one in 1995 on Windows NT 3.51 using Installshield.
> > 
> > > Like one that firsts asks what to install, and then presents a list of
> > > suitable locations that have enough space, supported filesystem features
> > > enabled, and whatnot?
> > 
> > Hold on, 1995 is calling me. The application I was packaging used
> > ACLs. But the NTFS version created by windows NT 3.1 was
> > incompatible as ACL support didn't arrive until NT 3.51 and service
> > pack 4(?) for NT 3.1. Yes, I had to write code to probe the
> > filesystems to detect whether ACL support was available or not by
> > -trying to create an ACL-.
> > 
> > I guess you could say "been there, done that, learnt the lesson".
> So we are trying to be as bad as Windows now?
> > 
> > > So to present a list of mountpoints that support DAX one has to scribble
> > > over every mountpoint on the system?
> > 
> > If you are filtering storage options presented to the user by
> > supported features, then you have to probe for them in some way.
> > And that means you have to consider that many option filesystem
> > features that applications use cannot be detected via mount options
> > checking the filesytem config. That is, there are features that can
> > only be discovered by actually testing whether they work or not.
> > 
> > > That sounds ridiculous.
> > 
> > Reality is a harsh mistress. :/
> > 
> > [snip the rest because you're being ridiculous]
> > 
> > Are you aware of ndctl?
> > 
> > $ ndctl list
> > [
> >   {
> >     "dev":"namespace1.0",
> >     "mode":"fsdax",
> >     "map":"mem",
> >     "size":8589934592,
> >     "sector_size":512,
> >     "blockdev":"pmem1"
> >   },
> >   {
> >     "dev":"namespace0.0",
> >     "mode":"fsdax",
> >     "map":"mem",
> >     "size":8589934592,
> >     "sector_size":512,
> >     "blockdev":"pmem0"
> >   }
> > ]
> Yes, that tells me that the device can be configured for dax. Not if the
> filesystem will use it.
> > 
> > Oh, look there are two block devices on this machine that are
> > configured for filesystem DAX (fsdax). They are /dev/pmem0 and
> > /dev/pmem1.
> > 
> > What filesytsems are on them?
> > 
> > $ lsblk -o NAME,SIZE,FSTYPE,MOUNTPOINT /dev/pmem0 /dev/pmem1
> > NAME  SIZE FSTYPE MOUNTPOINT
> > pmem1   8G ext4   /mnt/test
> > pmem0   8G xfs    /mnt/scratch
> > $
> > 
> > One XFs, one ext4, both of which will be using DAX capable unless
> > the dax=never mount option is set. Which:
> Or the bock size does not match page size. Or whatever other requirement
> the filesystem might have is not met.
> > 
> > $ mount  |grep pmem
> > /dev/pmem0 on /mnt/scratch type xfs (rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota)
> > /dev/pmem1 on /mnt/test type ext4 (rw,relatime)
> > $
> > 
> > is not set on either mount.
> > 
> > Hence both filesystems at DAX capable and enabled, and should be
> > presented as options to the user as such.
> No, it is not the case. That is why it would make sense for the kernel
> to make the information about DAX availability accessible somewhere.
> > 
> > And all this comes about because DAX is a property of the block
> > device, not the filesystem. Hence the only time a DAX capable
> > filesystem on a block device that is DAX capable will not be DAX
> > capable is if the dax=never is set...
> See, it is not property of the block device. It is property of the mount
> point. The availability on the device is one requirement but the
> filesystem options affect availability to the user in the end.

No, it is not really a property of the mountpoint either. If anything it is
a property of the inode. Two different inodes on the very same filesystem,
one may support DAX the other will not (think for example of XFS real-time
volumes, or simply inodes with / without S_DAX flag set). And we are back
at what Dave tries to get accross. As inconvenient as it is
statx(STATX_ATTR_DAX) is the only way to tell.

> > Of course, this is just encoding how existing filesystems behave -
> > it's not a requirement for future filesytsems so they may use other
> > mechanisms for enabling/disabling DAX. Which leaves you with the
> > only reliable mechanism of creating filesystem and checking
> > statx(STATX_ATTR_DAX)....
> Or the kernel could just tell the user. But right, information is power,
> and keeping the user in the dark is much more entertaining.

I think it would be more productive if you actually answered Ted's
question: Exactly which application got broken by the change? I know for a
fact that one large DB vendor was parsing mount options in /proc/mounts to
determine whether their DB can use DAX or not (and this was already a
"cleaned up" method because before this they were parsing VMA flags in
/proc/<pid>/smaps which is even worse). But in this case they also seemed
OK to switch to statx() once it is available...

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ