Message-ID: <17eb79fc-ccd9-4c85-bd23-e08380825c41@ijzerbout.nl>
Date: Fri, 15 Nov 2024 21:32:02 +0100
From: Kees Bakker <kees@...erbout.nl>
To: David Howells <dhowells@...hat.com>,
Christian Brauner <christian@...uner.io>, Steve French <smfrench@...il.com>,
Matthew Wilcox <willy@...radead.org>
Cc: Jeff Layton <jlayton@...nel.org>, Gao Xiang
<hsiangkao@...ux.alibaba.com>, Dominique Martinet <asmadeus@...ewreck.org>,
Marc Dionne <marc.dionne@...istor.com>, Paulo Alcantara <pc@...guebit.com>,
Shyam Prasad N <sprasad@...rosoft.com>, Tom Talpey <tom@...pey.com>,
Eric Van Hensbergen <ericvh@...nel.org>, Ilya Dryomov <idryomov@...il.com>,
netfs@...ts.linux.dev, linux-afs@...ts.infradead.org,
linux-cifs@...r.kernel.org, linux-nfs@...r.kernel.org,
ceph-devel@...r.kernel.org, v9fs@...ts.linux.dev,
linux-erofs@...ts.ozlabs.org, linux-fsdevel@...r.kernel.org,
linux-mm@...ck.org, netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 23/33] afs: Use netfslib for directories
On 08-11-2024 at 18:32, David Howells wrote:
> In the AFS ecosystem, directories are just a special type of file that is
> downloaded and parsed locally. Download is done by the same mechanism as
> ordinary files and the data can be cached. There is one important semantic
> restriction on directories over files: the client must download the entire
> directory in one go because, for example, the server could fabricate the
> contents of the blob on the fly with each download and give a different
> image each time.
>
> So that we can cache the directory download, switch AFS directory support
> over to using the netfslib single-object API, thereby allowing directory
> content to be stored in the local cache.
>
> To make this work, the following changes are made:
>
> (1) A directory's contents are now stored in a folio_queue chain attached
> to the afs_vnode (inode) struct rather than its associated pagecache,
> though multipage folios are still used to hold the data. The folio
> queue is discarded when the directory inode is evicted.
>
> This also helps with the phasing out of ITER_XARRAY.
>
> (2) Various directory operations are made to use and unuse the cache
> cookie.
>
> (3) The content checking, content dumping and content iteration are now
> performed with a standard iov_iter iterator over the contents of the
> folio queue.
>
> (4) Iteration and modification must be done with the vnode's validate_lock
> held. In conjunction with (1), this means that the iteration can be
> done without the need to lock pages or take extra refs on them, unlike
> when accessing ->i_pages.
>
> (5) Convert to using netfs_read_single() to read data.
>
> (6) Provide a ->writepages() to call netfs_writeback_single() to save the
> data to the cache according to the VM's scheduling whilst holding the
> validate_lock read-locked as (4).
>
> (7) Change local directory image editing functions:
>
> (a) Provide a function to get a specific block by number from the
> folio_queue as we can no longer use the i_pages xarray to locate
> folios by index. This uses a cursor to remember the current
> position as we need to iterate through the directory contents.
> The block is kmapped before being returned.
>
> (b) Make the function in (a) extend the directory by an extra folio if
> we run out of space.
>
> (c) Raise the check of the block free space counter, for those blocks
> that have one, higher in the function to eliminate a call to get a
> block.
>
> (d) Remove the page unlocking and putting done during the editing
> loops. This is no longer necessary as the folio_queue holds the
> references and the pages are no longer in the pagecache.
>
> (e) Mark the inode dirty and pin the cache usage till writeback at the
> end of a successful edit.
>
> (8) Don't set the large_folios flag on the inode as we do the allocation
> ourselves rather than the VM doing it automatically.
>
> (9) Mark the inode as being a single object that isn't uploaded to the
> server.
>
> (10) Enable caching on directories.
>
> (11) Only set the upload key for writeback for regular files.
>
> Notes:
>
> (*) We keep the ->release_folio(), ->invalidate_folio() and
> ->migrate_folio() ops as we set the mapping pointer on the folio.
>
> Signed-off-by: David Howells <dhowells@...hat.com>
> cc: Marc Dionne <marc.dionne@...istor.com>
> cc: Jeff Layton <jlayton@...nel.org>
> cc: linux-afs@...ts.infradead.org
> cc: netfs@...ts.linux.dev
> cc: linux-fsdevel@...r.kernel.org
> ---
> fs/afs/dir.c | 742 +++++++++++++++++++------------------
> fs/afs/dir_edit.c | 183 ++++-----
> fs/afs/file.c | 8 +
> fs/afs/inode.c | 21 +-
> fs/afs/internal.h | 16 +
> fs/afs/super.c | 2 +
> fs/afs/write.c | 4 +-
> include/trace/events/afs.h | 6 +-
> 8 files changed, 512 insertions(+), 470 deletions(-)
>
> [...]
> +/*
> + * Iterate through the directory folios under RCU conditions.
> + */
> +static int afs_dir_iterate_contents(struct inode *dir, struct dir_context *ctx)
> +{
> + struct afs_vnode *dvnode = AFS_FS_I(dir);
> + struct iov_iter iter;
> + unsigned long long i_size = i_size_read(dir);
> + int ret = 0;
>
> - do {
> - dblock = kmap_local_folio(folio, offset);
> - ret = afs_dir_iterate_block(dvnode, ctx, dblock,
> - folio_pos(folio) + offset);
> - kunmap_local(dblock);
> - if (ret != 1)
> - goto out;
> + /* Round the file position up to the next entry boundary */
> + ctx->pos = round_up(ctx->pos, sizeof(union afs_xdr_dirent));
>
> - } while (offset += sizeof(*dblock), offset < size);
> + if (i_size <= 0 || ctx->pos >= i_size)
> + return 0;
>
> - ret = 0;
> - }
> + iov_iter_folio_queue(&iter, ITER_SOURCE, dvnode->directory, 0, 0, i_size);
> + iov_iter_advance(&iter, round_down(ctx->pos, AFS_DIR_BLOCK_SIZE));
> +
> + iterate_folioq(&iter, iov_iter_count(&iter), dvnode, ctx,
> + afs_dir_iterate_step);
> +
> + if (ret == -ESTALE)
This is dead code: `ret` is initialised to 0 and never assigned again, so
the -ESTALE check below can never trigger.  It looks like an error from the
iteration step needs to be propagated into `ret` somehow.
> + afs_invalidate_dir(dvnode, afs_dir_invalid_iter_stale);
> + return ret;
> +}
> [...]