[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f35d25eb088df4d541672d39c1c4dbb3bf2ee749.camel@hammerspace.com>
Date: Wed, 16 Jun 2021 18:51:38 +0000
From: Trond Myklebust <trondmy@...merspace.com>
To: "hsiangkao@...ux.alibaba.com" <hsiangkao@...ux.alibaba.com>,
"tytso@....edu" <tytso@....edu>
CC: "linux-nfs@...r.kernel.org" <linux-nfs@...r.kernel.org>,
"joseph.qi@...ux.alibaba.com" <joseph.qi@...ux.alibaba.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"anna.schumaker@...app.com" <anna.schumaker@...app.com>
Subject: Re: [PATCH] nfs: set block size according to pnfs_blksize first
On Thu, 2021-06-17 at 01:51 +0800, Gao Xiang wrote:
> Hi Ted,
>
> On Wed, Jun 16, 2021 at 12:14:44PM -0400, Theodore Ts'o wrote:
> > On Wed, Jun 16, 2021 at 10:41:34PM +0800, Gao Xiang wrote:
> > > > > Yet my question is how to deal with generic/486, should we
> > > > > just skip
> > > > > the case directly? I cannot find some proper way to get
> > > > > underlayfs
> > > > > block size or real xattr value limit.
> >
> > Note that the block size does not necessarily have thing to do with
> > the xattr value limit. For example, in ext4, if you create the
> > file
> > system with the ea_inode feature enabled, you can create extended
> > attributes up to the maximum value of 64k defined by the xattr
> > interface --- unless, of course, there isn't enough space in the
> > file
> > system.
> >
> > (The ea_inode feature will also use shared space for large xattrs,
> > so
> > if you are storing hundreds of thousands of files that all have an
> > identical 20 kilbyte Windows security id or ACL, ea_inode is going
> > to
> > be much more efficient way of supporting that particular use case.)
>
> Thanks for your detailed explanation too.
>
> Yeah, yet it's not enabled in the issue setup I was assigned :(
>
> >
> > > > As noted above, the NFS server should really be returning
> > > > NFS4ERR_XATTR2BIG in this case, which the client, again, should
> > > > be
> > > > transforming into -E2BIG. Where does ENOSPC come from?
> > >
> > > Thanks for the detailed explanation...
> > >
> > > I think that is due to ext4 returning ENOSPC since I tested
> >
> > It's not just ext4. From inspection, under some circumstances f2fs
> > and btrfs can return ENOSPC.
>
> I did some test on ext4 only earlier since it's our test environment.
> I didn't mean the ENOSPC behavior was ext4 only ( :-) very sorry
> about
> that if some misunderstanding here )
>
> >
> > The distinction is that E2BIG is (from the attr_set man page):
> >
> > [E2BIG] The value of the given attribute is too
> > large,
> > it exceeds the maximum allowable size of an
> > attribute value.
> >
> > The maximum allowable size (enforced by the VFS) is XATTR_MAX_SIZE,
> > which is 65536 bytes. Some file systems may impose a smaller max
> > allowable size.
> >
> > In contrast ENOSPC means something different:
> >
> > ENOSPC No space left on device.
> >
> > This isn't necessarily just block space, BTW; it might some other
> > file
> > system space --- thre might not be a free inode in the case of
> > ext4's
> > ea_inode feature. Or it be the f2fs file system not being able to
> > allocate a node id via f2fs_alloc_nid().
> >
> > Note that generic/486 is testing a very specific case, which is
> > replacing a small xattr (16 bytes) with an xattr with a large
> > value.
> > This is would be a really interesting test for ext4 ea_inode, when
> > we
> > are replacing an xattr stored inline in the inode, or in a single
> > 4k
> > block, with an xattr stored in a separate inode. But not the way
> > src/attr_replace_test.c (which does all of the heavy lifting for
> > the
> > generic/486 test) is currently written.
> >
>
> Yeah, as for the original case, it tried to test when it turned into
> the XFS extents format, but I'm not sure if it's just the regression
> test for such report only (similiar to ext4 inline xattr to non-
> inline
> xattr case.) rather than test the XFS_DINODE_FMT_BTREE case or
> ea_inode
> case...
>
> > So what I would suggest is to have attr_replace_test.c try to
> > determine the maximum attr value size using binary search. Start
> > with
> > min=16, and max=65536, and try creating an xattr of a particular
> > size,
> > and then delete the xattr, and then retry creating the xattr with
> > the
> > next binary search size. This will allow you to create a function
> > which determines the maximum size attr for a particular file
> > system,
> > especially in those cases where it is dependent on how the file
> > system
> > is configured.
>
> Considering the original XFS regression report [1], I think
> underlayfs blksize may still be needed. And binary search to get the
> maximum attr value may be another new case for this as well. Or
> alternatively just add by block size to do a trip test.
>
> Although I have no idea if we can just skip the case when testing
> with
> NFS. If getting underlayfs blksize is unfeasible, I think we might
> skip such case for now since nfs blksize is not useful for
> generic/486.
>
> [1] https://bugzilla.kernel.org/show_bug.cgi?id=199119
I'm still not seeing why we care about block sizes, and I'm certainly
not seeing why we should care about them in the NFS case.
xattrs are a form of key-value storage with atomic semantics (i.e. when
you attempt to get/set/update/remove them, then the operation either
succeeds or it fails without any side-effects).
There is no interface for doing any form of block I/O to an xattr.
There is no requirement in the documentation that the user know
anything about block size in order to use them.
In other words, if this xfstest 'generic/486' is depending on knowledge
of the block size, then it is fundamentally a filesystem
implementation-specific test. It doesn't belong in the 'generic' test
category, because it is not testing anything that is generic to xattrs.
>
> Thanks,
> Gao Xiang
>
> >
> > > should we transform it to E2BIG instead (at least in NFS
> > > protocol)? but I'm still not sure that E2BIG is a valid return
> > > code for
> > > setxattr()...
> >
> > E2BIG is defined in the attr_set(3) man page. ENOSPC isn't
> > mentioned
> > in the attr_set man page, but given that multiple file systems
> > return
> > ENOSPC, and ENOSPC has a well-defined meaning in POSIX.1 which very
> > much makes sense when creating extended attributes, we should fix
> > that
> > by adding ENOSPC to the attr_set(3) man page (which is shipped as
> > part
> > of the libattr library sources).
> >
> > Cheers,
> >
> > - Ted
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@...merspace.com
Powered by blists - more mailing lists