netdev - Re: [RFC PATCH 02/53] netfs: Track the fpos above which the server has no data

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <11ec6f637698feb04963c6a7c39a5ca80af95464.camel@kernel.org>
Date: Mon, 16 Oct 2023 11:43:05 -0400
From: Jeff Layton <jlayton@...nel.org>
To: David Howells <dhowells@...hat.com>, Steve French <smfrench@...il.com>
Cc: Matthew Wilcox <willy@...radead.org>, Marc Dionne
 <marc.dionne@...istor.com>,  Paulo Alcantara <pc@...guebit.com>, Ronnie
 Sahlberg <lsahlber@...hat.com>, Shyam Prasad N <sprasad@...rosoft.com>, Tom
 Talpey <tom@...pey.com>, Dominique Martinet <asmadeus@...ewreck.org>, Ilya
 Dryomov <idryomov@...il.com>, Christian Brauner <christian@...uner.io>,
 linux-afs@...ts.infradead.org,  linux-cifs@...r.kernel.org,
 linux-nfs@...r.kernel.org,  ceph-devel@...r.kernel.org,
 v9fs@...ts.linux.dev, linux-fsdevel@...r.kernel.org,  linux-mm@...ck.org,
 netdev@...r.kernel.org, linux-kernel@...r.kernel.org, 
 linux-cachefs@...hat.com
Subject: Re: [RFC PATCH 02/53] netfs: Track the fpos above which the server
 has no data

On Fri, 2023-10-13 at 16:56 +0100, David Howells wrote:
> Track the file position above which the server is not expected to have any
> data and preemptively assume that we can simply fill blocks with zeroes
> locally rather than attempting to download them - even if we've written
> data back to the server.  Assume that any data that was written back above
> that position is held in the local cache.  Call this the "zero point".
> 
> Make use of this to optimise away some reads from the server.  We need to
> set the zero point in the following circumstances:
> 
>  (1) When we see an extant remote inode and have no cache for it, we set
>      the zero_point to i_size.
> 
>  (2) On local inode creation, we set zero_point to 0.
> 
>  (3) On local truncation down, we reduce zero_point to the new i_size if
>      the new i_size is lower.
> 
>  (4) On local truncation up, we don't change zero_point.
> 
>  (5) On local modification, we don't change zero_point.
> 
>  (6) On remote invalidation, we set zero_point to the new i_size.
> 
>  (7) If stored data is culled from the local cache, we must set zero_point
>      above that if the data also got written to the server.
> 

When you say culled here, it sounds like you're just throwing out the
dirty cache without writing the data back. That shouldn't be allowed
though, so I must be misunderstanding what you mean here. Can you
explain?

>  (8) If dirty data is written back to the server, but not the local cache,
>      we must set zero_point above that.
> 

How do you write back without writing to the local cache? I'm guessing
this means you're doing a non-buffered write?

> Assuming the above, any read from the server at or above the zero_point
> position will return all zeroes.
> 
> The zero_point value can be stored in the cache, provided the above rules
> are applied to it by any code that culls part of the local cache.
> 
> Signed-off-by: David Howells <dhowells@...hat.com>
> cc: Jeff Layton <jlayton@...nel.org>
> cc: linux-cachefs@...hat.com
> cc: linux-fsdevel@...r.kernel.org
> cc: linux-mm@...ck.org
> ---
>  fs/afs/inode.c           | 13 +++++++------
>  fs/netfs/buffered_read.c | 40 +++++++++++++++++++++++++---------------
>  include/linux/netfs.h    |  5 +++++
>  3 files changed, 37 insertions(+), 21 deletions(-)
> 
> diff --git a/fs/afs/inode.c b/fs/afs/inode.c
> index 1c794a1896aa..46bc5574d6f5 100644
> --- a/fs/afs/inode.c
> +++ b/fs/afs/inode.c
> @@ -252,6 +252,7 @@ static void afs_apply_status(struct afs_operation *op,
>  		vnode->netfs.remote_i_size = status->size;
>  		if (change_size) {
>  			afs_set_i_size(vnode, status->size);
> +			vnode->netfs.zero_point = status->size;
>  			inode_set_ctime_to_ts(inode, t);
>  			inode->i_atime = t;
>  		}
> @@ -865,17 +866,17 @@ static void afs_setattr_success(struct afs_operation *op)
>  static void afs_setattr_edit_file(struct afs_operation *op)
>  {
>  	struct afs_vnode_param *vp = &op->file[0];
> -	struct inode *inode = &vp->vnode->netfs.inode;
> +	struct afs_vnode *vnode = vp->vnode;
>  
>  	if (op->setattr.attr->ia_valid & ATTR_SIZE) {
>  		loff_t size = op->setattr.attr->ia_size;
>  		loff_t i_size = op->setattr.old_i_size;
>  
> -		if (size < i_size)
> -			truncate_pagecache(inode, size);
> -		if (size != i_size)
> -			fscache_resize_cookie(afs_vnode_cache(vp->vnode),
> -					      vp->scb.status.size);
> +		if (size != i_size) {
> +			truncate_pagecache(&vnode->netfs.inode, size);
> +			netfs_resize_file(&vnode->netfs, size);
> +			fscache_resize_cookie(afs_vnode_cache(vnode), size);
> +		}

Isn't this an existing bug? AFS is not setting remote_i_size in the
setattr path currently? I think this probably ought to be done in a
preliminary AFS patch.

>
>  	}
>  }
>  
> diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c
> index 2cd3ccf4c439..a2852fa64ad0 100644
> --- a/fs/netfs/buffered_read.c
> +++ b/fs/netfs/buffered_read.c
> @@ -147,6 +147,22 @@ static void netfs_rreq_expand(struct netfs_io_request *rreq,
>  	}
>  }
>  
> +/*
> + * Begin an operation, and fetch the stored zero point value from the cookie if
> + * available.
> + */
> +static int netfs_begin_cache_operation(struct netfs_io_request *rreq,
> +				       struct netfs_inode *ctx)
> +{
> +	int ret = -ENOBUFS;
> +
> +	if (ctx->ops->begin_cache_operation) {
> +		ret = ctx->ops->begin_cache_operation(rreq);
> +		/* TODO: Get the zero point value from the cache */
> +	}
> +	return ret;
> +}
> +
>  /**
>   * netfs_readahead - Helper to manage a read request
>   * @ractl: The description of the readahead request
> @@ -180,11 +196,9 @@ void netfs_readahead(struct readahead_control *ractl)
>  	if (IS_ERR(rreq))
>  		return;
>  
> -	if (ctx->ops->begin_cache_operation) {
> -		ret = ctx->ops->begin_cache_operation(rreq);
> -		if (ret == -ENOMEM || ret == -EINTR || ret == -ERESTARTSYS)
> -			goto cleanup_free;
> -	}
> +	ret = netfs_begin_cache_operation(rreq, ctx);
> +	if (ret == -ENOMEM || ret == -EINTR || ret == -ERESTARTSYS)
> +		goto cleanup_free;
>  
>  	netfs_stat(&netfs_n_rh_readahead);
>  	trace_netfs_read(rreq, readahead_pos(ractl), readahead_length(ractl),
> @@ -238,11 +252,9 @@ int netfs_read_folio(struct file *file, struct folio *folio)
>  		goto alloc_error;
>  	}
>  
> -	if (ctx->ops->begin_cache_operation) {
> -		ret = ctx->ops->begin_cache_operation(rreq);
> -		if (ret == -ENOMEM || ret == -EINTR || ret == -ERESTARTSYS)
> -			goto discard;
> -	}
> +	ret = netfs_begin_cache_operation(rreq, ctx);
> +	if (ret == -ENOMEM || ret == -EINTR || ret == -ERESTARTSYS)
> +		goto discard;
>  
>  	netfs_stat(&netfs_n_rh_readpage);
>  	trace_netfs_read(rreq, rreq->start, rreq->len, netfs_read_trace_readpage);
> @@ -390,11 +402,9 @@ int netfs_write_begin(struct netfs_inode *ctx,
>  	rreq->no_unlock_folio	= folio_index(folio);
>  	__set_bit(NETFS_RREQ_NO_UNLOCK_FOLIO, &rreq->flags);
>  
> -	if (ctx->ops->begin_cache_operation) {
> -		ret = ctx->ops->begin_cache_operation(rreq);
> -		if (ret == -ENOMEM || ret == -EINTR || ret == -ERESTARTSYS)
> -			goto error_put;
> -	}
> +	ret = netfs_begin_cache_operation(rreq, ctx);
> +	if (ret == -ENOMEM || ret == -EINTR || ret == -ERESTARTSYS)
> +		goto error_put;
>  
>  	netfs_stat(&netfs_n_rh_write_begin);
>  	trace_netfs_read(rreq, pos, len, netfs_read_trace_write_begin);
> diff --git a/include/linux/netfs.h b/include/linux/netfs.h
> index b447cb67f599..282511090ead 100644
> --- a/include/linux/netfs.h
> +++ b/include/linux/netfs.h
> @@ -129,6 +129,8 @@ struct netfs_inode {
>  	struct fscache_cookie	*cache;
>  #endif
>  	loff_t			remote_i_size;	/* Size of the remote file */
> +	loff_t			zero_point;	/* Size after which we assume there's no data
> +						 * on the server */

While I understand the concept, I'm not yet sure I understand how this
new value will be used. It might be better to merge this patch in with
the patch that adds the first user of this data.

>  };
>  
>  /*
> @@ -330,6 +332,7 @@ static inline void netfs_inode_init(struct netfs_inode *ctx,
>  {
>  	ctx->ops = ops;
>  	ctx->remote_i_size = i_size_read(&ctx->inode);
> +	ctx->zero_point = ctx->remote_i_size;
>  #if IS_ENABLED(CONFIG_FSCACHE)
>  	ctx->cache = NULL;
>  #endif
> @@ -345,6 +348,8 @@ static inline void netfs_inode_init(struct netfs_inode *ctx,
>  static inline void netfs_resize_file(struct netfs_inode *ctx, loff_t new_i_size)
>  {
>  	ctx->remote_i_size = new_i_size;
> +	if (new_i_size < ctx->zero_point)
> +		ctx->zero_point = new_i_size;
>  }
>  
>  /**
> 

-- 
Jeff Layton <jlayton@...nel.org>