Message-ID: <17eb79fc-ccd9-4c85-bd23-e08380825c41@ijzerbout.nl>
Date: Fri, 15 Nov 2024 21:32:02 +0100
From: Kees Bakker <kees@...erbout.nl>
To: David Howells <dhowells@...hat.com>,
Christian Brauner <christian@...uner.io>, Steve French <smfrench@...il.com>,
Matthew Wilcox <willy@...radead.org>
Cc: Jeff Layton <jlayton@...nel.org>, Gao Xiang
<hsiangkao@...ux.alibaba.com>, Dominique Martinet <asmadeus@...ewreck.org>,
Marc Dionne <marc.dionne@...istor.com>, Paulo Alcantara <pc@...guebit.com>,
Shyam Prasad N <sprasad@...rosoft.com>, Tom Talpey <tom@...pey.com>,
Eric Van Hensbergen <ericvh@...nel.org>, Ilya Dryomov <idryomov@...il.com>,
netfs@...ts.linux.dev, linux-afs@...ts.infradead.org,
linux-cifs@...r.kernel.org, linux-nfs@...r.kernel.org,
ceph-devel@...r.kernel.org, v9fs@...ts.linux.dev,
linux-erofs@...ts.ozlabs.org, linux-fsdevel@...r.kernel.org,
linux-mm@...ck.org, netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 23/33] afs: Use netfslib for directories
On 08-11-2024 at 18:32, David Howells wrote:
> In the AFS ecosystem, directories are just a special type of file that is
> downloaded and parsed locally. Download is done by the same mechanism as
> ordinary files and the data can be cached. There is one important semantic
> restriction on directories over files: the client must download the entire
> directory in one go because, for example, the server could fabricate the
> contents of the blob on the fly with each download and give a different
> image each time.
>
> So that we can cache the directory download, switch AFS directory support
> over to using the netfslib single-object API, thereby allowing directory
> content to be stored in the local cache.
>
> To make this work, the following changes are made:
>
> (1) A directory's contents are now stored in a folio_queue chain attached
> to the afs_vnode (inode) struct rather than its associated pagecache,
> though multipage folios are still used to hold the data. The folio
> queue is discarded when the directory inode is evicted.
>
> This also helps with the phasing out of ITER_XARRAY.
>
> (2) Various directory operations are made to use and unuse the cache
> cookie.
>
> (3) The content checking, content dumping and content iteration are now
> performed with a standard iov_iter iterator over the contents of the
> folio queue.
>
> (4) Iteration and modification must be done with the vnode's validate_lock
> held. In conjunction with (1), this means that the iteration can be
> done without the need to lock pages or take extra refs on them, unlike
> when accessing ->i_pages.
>
> (5) Convert to using netfs_read_single() to read data.
>
> (6) Provide a ->writepages() to call netfs_writeback_single() to save the
> data to the cache according to the VM's scheduling whilst holding the
> validate_lock read-locked as (4).
>
> (7) Change local directory image editing functions:
>
> (a) Provide a function to get a specific block by number from the
> folio_queue as we can no longer use the i_pages xarray to locate
> folios by index. This uses a cursor to remember the current
> position as we need to iterate through the directory contents.
> The block is kmapped before being returned.
>
> (b) Make the function in (a) extend the directory by an extra folio if
> we run out of space.
>
> (c) Raise the check of the block free space counter, for those blocks
> that have one, higher in the function to eliminate a call to get a
> block.
>
> (d) Remove the page unlocking and putting done during the editing
> loops. This is no longer necessary as the folio_queue holds the
> references and the pages are no longer in the pagecache.
>
> (e) Mark the inode dirty and pin the cache usage till writeback at the
> end of a successful edit.
>
> (8) Don't set the large_folios flag on the inode as we do the allocation
> ourselves rather than the VM doing it automatically.
>
> (9) Mark the inode as being a single object that isn't uploaded to the
> server.
>
> (10) Enable caching on directories.
>
> (11) Only set the upload key for writeback for regular files.
>
> Notes:
>
> (*) We keep the ->release_folio(), ->invalidate_folio() and
> ->migrate_folio() ops as we set the mapping pointer on the folio.
>
> Signed-off-by: David Howells <dhowells@...hat.com>
> cc: Marc Dionne <marc.dionne@...istor.com>
> cc: Jeff Layton <jlayton@...nel.org>
> cc: linux-afs@...ts.infradead.org
> cc: netfs@...ts.linux.dev
> cc: linux-fsdevel@...r.kernel.org
> ---
> fs/afs/dir.c | 742 +++++++++++++++++++------------------
> fs/afs/dir_edit.c | 183 ++++-----
> fs/afs/file.c | 8 +
> fs/afs/inode.c | 21 +-
> fs/afs/internal.h | 16 +
> fs/afs/super.c | 2 +
> fs/afs/write.c | 4 +-
> include/trace/events/afs.h | 6 +-
> 8 files changed, 512 insertions(+), 470 deletions(-)
>
> [...]
> +/*
> + * Iterate through the directory folios under RCU conditions.
> + */
> +static int afs_dir_iterate_contents(struct inode *dir, struct dir_context *ctx)
> +{
> + struct afs_vnode *dvnode = AFS_FS_I(dir);
> + struct iov_iter iter;
> + unsigned long long i_size = i_size_read(dir);
> + int ret = 0;
>
> - do {
> - dblock = kmap_local_folio(folio, offset);
> - ret = afs_dir_iterate_block(dvnode, ctx, dblock,
> - folio_pos(folio) + offset);
> - kunmap_local(dblock);
> - if (ret != 1)
> - goto out;
> + /* Round the file position up to the next entry boundary */
> + ctx->pos = round_up(ctx->pos, sizeof(union afs_xdr_dirent));
>
> - } while (offset += sizeof(*dblock), offset < size);
> + if (i_size <= 0 || ctx->pos >= i_size)
> + return 0;
>
> - ret = 0;
> - }
> + iov_iter_folio_queue(&iter, ITER_SOURCE, dvnode->directory, 0, 0, i_size);
> + iov_iter_advance(&iter, round_down(ctx->pos, AFS_DIR_BLOCK_SIZE));
> +
> + iterate_folioq(&iter, iov_iter_count(&iter), dvnode, ctx,
> + afs_dir_iterate_step);
> +
> + if (ret == -ESTALE)
This is dead code: `ret` is initialised to 0 and never assigned again, so
the -ESTALE check below can never trigger.  It looks like an error from the
iteration step needs to be propagated into `ret` somehow.
> + afs_invalidate_dir(dvnode, afs_dir_invalid_iter_stale);
> + return ret;
> +}
> [...]