Message-ID: <20220323225843.GI1609613@dread.disaster.area>
Date: Thu, 24 Mar 2022 09:58:43 +1100
From: Dave Chinner <david@...morbit.com>
To: Miklos Szeredi <mszeredi@...hat.com>
Cc: linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-api@...r.kernel.org, linux-man@...r.kernel.org,
linux-security-module@...r.kernel.org, Karel Zak <kzak@...hat.com>,
Ian Kent <raven@...maw.net>,
David Howells <dhowells@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Al Viro <viro@...iv.linux.org.uk>,
Christian Brauner <christian@...uner.io>,
Amir Goldstein <amir73il@...il.com>,
James Bottomley <James.Bottomley@...senpartnership.com>
Subject: Re: [RFC PATCH] getvalues(2) prototype
On Tue, Mar 22, 2022 at 08:27:12PM +0100, Miklos Szeredi wrote:
> Add a new userspace API that allows getting multiple short values in a
> single syscall.
>
> This would be useful for the following reasons:
>
> - Calling open/read/close for many small files is inefficient. E.g. on my
> desktop invoking lsof(1) results in ~60k open + read + close calls under
> /proc and 90% of those are 128 bytes or less.
How does doing the open/read/close in a single syscall make this any
more efficient? All it saves is the overhead of a couple of
syscalls; it doesn't reduce any of the setup or teardown overhead
needed to read the data itself.
> - Interfaces for getting various attributes and statistics are fragmented.
> For files we have basic stat, statx, extended attributes, file attributes
> (for which there are two overlapping ioctl interfaces). For mounts and
> superblocks we have stat*fs as well as /proc/$PID/{mountinfo,mountstats}.
> The latter also has the problem of not allowing queries on a specific
> mount.
https://xkcd.com/927/
> - Some attributes are cheap to generate, some are expensive. Allowing
> userspace to select which ones it needs should allow optimizing queries.
>
> - Adding an ascii namespace should allow easy extension and self
> description.
>
> - The values can be text or binary, whichever fits best.
>
> The interface definition is:
>
> struct name_val {
> 	const char *name;	/* in */
> 	struct iovec value_in;	/* in */
> 	struct iovec value_out;	/* out */
> 	uint32_t error;		/* out */
> 	uint32_t reserved;
> };
Ahhh, XFS_IOC_ATTRMULTI_BY_HANDLE reborn. This is how xfsdump gets
and sets attributes efficiently when dumping and restoring files -
it's an interface that allows batches of xattr operations to be run
on a file in a single syscall.
I've said in the past when discussing things like statx() that maybe
everything should be addressable via the xattr namespace and
set/queried via xattr names regardless of how the filesystem stores
the data. The VFS/filesystem simply translates the name to the
storage location of the information. It might be held in xattrs, but
it could just be a flag bit in an inode field.
Then we just get named xattrs in batches from an open fd.
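As a rough illustration of that idea, here is a userspace stand-in built on the existing fgetxattr(2) call. The struct xattr_req descriptor and the fget_xattrs() function are hypothetical names, loosely modelled on the name_val layout quoted above; no such batch call exists today. A real batch interface would run this loop inside the kernel in a single syscall:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Hypothetical request descriptor, one per attribute to fetch. */
struct xattr_req {
	const char *name;	/* in: full xattr name, e.g. "user.comment" */
	void *value;		/* in: caller-supplied buffer */
	size_t size;		/* in: buffer size; out: bytes returned */
	int error;		/* out: 0 on success, else an errno value */
};

/*
 * Userspace emulation of a batched get on an open fd: one fgetxattr(2)
 * call per entry. Failures are recorded per entry rather than aborting
 * the batch. Returns the number of entries filled successfully.
 */
static int fget_xattrs(int fd, struct xattr_req *vec, size_t num)
{
	int filled = 0;

	for (size_t i = 0; i < num; i++) {
		ssize_t n = fgetxattr(fd, vec[i].name, vec[i].value,
				      vec[i].size);

		if (n < 0) {
			vec[i].error = errno;
			vec[i].size = 0;
			continue;
		}
		vec[i].error = 0;
		vec[i].size = (size_t)n;
		filled++;
	}
	return filled;
}
```

Per-entry error reporting means one missing attribute does not fail the whole batch, which matches the per-descriptor error field in the quoted proposal.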
> int getvalues(int dfd, const char *path, struct name_val *vec, size_t num,
> unsigned int flags);
>
> @dfd and @path are used to lookup object $ORIGIN. @vec contains @num
> name/value descriptors. @flags contains lookup flags for @path.
>
> The syscall returns the number of values filled or an error.
>
> A single name/value descriptor has the following fields:
>
> @name describes the object whose value is to be returned. E.g.
>
> mnt - list of mount parameters
> mnt:mountpoint - the mountpoint of the mount of $ORIGIN
> mntns - list of mount IDs reachable from the current root
> mntns:21:parentid - parent ID of the mount with ID of 21
> xattr:security.selinux - the security.selinux extended attribute
> data:foo/bar - the data contained in file $ORIGIN/foo/bar
How are these different from just declaring new xattr namespaces for
these things? E.g. open any file and list the xattrs in the
xattr:mount.mnt namespace to get the list of mount parameters for
that mount.
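A sketch of what listing one namespace could look like from userspace, assuming the names come back in the NUL-separated layout that flistxattr(2) already returns. The "mount.mnt." prefix is the namespace suggested above and does not exist in current kernels; the function name for_each_in_namespace() is likewise made up for illustration:

```c
#include <stddef.h>
#include <string.h>

/*
 * Walk a NUL-separated name list (the layout flistxattr(2) returns) and
 * count the names that start with prefix. If cb is non-NULL it is called
 * once per matching name.
 */
static size_t for_each_in_namespace(const char *list, size_t len,
				    const char *prefix,
				    void (*cb)(const char *name, void *arg),
				    void *arg)
{
	size_t plen = strlen(prefix), matches = 0;
	const char *p = list, *end = list + len;

	while (p < end) {
		size_t n = strnlen(p, (size_t)(end - p));

		/* Stop at a trailing name that lacks its terminator. */
		if (n == (size_t)(end - p))
			break;

		if (n >= plen && memcmp(p, prefix, plen) == 0) {
			if (cb)
				cb(p, arg);
			matches++;
		}
		p += n + 1;
	}
	return matches;
}
```

A caller would fill the list with flistxattr(fd, buf, size) and pass the returned length as len, so no new syscall is needed just to enumerate a namespace.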
Why do we need a new "xattr in everything but name" interface when
we could just extend the one we've already got and formalise a new,
cleaner version of xattr batch APIs that have been around for 20-odd
years already?
Cheers,
Dave.
--
Dave Chinner
david@...morbit.com