Message-ID: <CAKPOu+8LSKtGmtjwRpY9tMnt=1Y7RvrhDxVsfSRQW02_g5-6XA@mail.gmail.com>
Date: Mon, 9 Dec 2024 18:12:15 +0100
From: Max Kellermann <max.kellermann@...os.com>
To: David Howells <dhowells@...hat.com>
Cc: Trond Myklebust <trondmy@...nel.org>, Anna Schumaker <anna@...nel.org>,
Dave Wysochanski <dwysocha@...hat.com>, Jeff Layton <jlayton@...nel.org>,
Christian Brauner <brauner@...nel.org>, netfs@...ts.linux.dev, linux-nfs@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] nfs: Fix oops in nfs_netfs_init_request() when copying to cache
On Mon, Dec 9, 2024 at 3:46 PM David Howells <dhowells@...hat.com> wrote:
> Does this fix the issue?
The issue is with 6.11, but this patch fails to build with 6.11 and
I'm not sure how to backport that part:
fs/nfs/fscache.c: In function ‘nfs_netfs_init_request’:
fs/nfs/fscache.c:267:50: error: ‘NETFS_PGPRIV2_COPY_TO_CACHE’ undeclared (first use in this function); did you mean ‘NETFS_RREQ_COPY_TO_CACHE’?
  267 |                 if (WARN_ON_ONCE(rreq->origin != NETFS_PGPRIV2_COPY_TO_CACHE))
      |                                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~
Our production machines are all 6.11, because 6.12 has that other
netfs regression that freezes all transfers immediately
(https://lore.kernel.org/netfs/CAKPOu+_4m80thNy5_fvROoxBm689YtA0dZ-=gcmkzwYSY4syqw@mail.gmail.com/).
I guess this other bug only affects Ceph and not NFS, but after
experiencing so many kernel regressions recently, I have had to become
more cautious with kernel updates (the past 2 months had more
netfs/NFS/Ceph regressions than the last 20 years combined).
>
> David
> ---
> nfs: Fix oops in nfs_netfs_init_request() when copying to cache
>
> When netfslib wants to copy some data that has just been read on behalf of
> nfs, it creates a new write request and calls nfs_netfs_init_request() to
> initialise it, but with a NULL file pointer. This causes
> nfs_file_open_context() to oops - however, we don't actually need the nfs
> context as we're only going to write to the cache.
>
> Fix this by just returning if we aren't given a file pointer and emit a
> warning if the request was for something other than copy-to-cache.
>
> Further, fix nfs_netfs_free_request() so that it doesn't try to free the
> context if the pointer is NULL.
>
> Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
> Reported-by: Max Kellermann <max.kellermann@...os.com>
> Closes: https://lore.kernel.org/r/CAKPOu+986mTt1i9xGBXiQPVOmu4ZJTskrCt6f-99EL_s0rhz_A@mail.gmail.com/
> Signed-off-by: David Howells <dhowells@...hat.com>
> cc: Trond Myklebust <trondmy@...nel.org>
> cc: Anna Schumaker <anna@...nel.org>
> cc: Dave Wysochanski <dwysocha@...hat.com>
> cc: Jeff Layton <jlayton@...nel.org>
> cc: linux-nfs@...r.kernel.org
> cc: netfs@...ts.linux.dev
> cc: linux-fsdevel@...r.kernel.org
> ---
> fs/nfs/fscache.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
> index 810269ee0a50..d49e4ce27999 100644
> --- a/fs/nfs/fscache.c
> +++ b/fs/nfs/fscache.c
> @@ -263,6 +263,12 @@ int nfs_netfs_readahead(struct readahead_control *ractl)
> static atomic_t nfs_netfs_debug_id;
> static int nfs_netfs_init_request(struct netfs_io_request *rreq, struct file *file)
> {
> + if (!file) {
> + if (WARN_ON_ONCE(rreq->origin != NETFS_PGPRIV2_COPY_TO_CACHE))
> + return -EIO;
> + return 0;
> + }
> +
> rreq->netfs_priv = get_nfs_open_context(nfs_file_open_context(file));
> rreq->debug_id = atomic_inc_return(&nfs_netfs_debug_id);
> /* [DEPRECATED] Use PG_private_2 to mark folio being written to the cache. */
> @@ -274,7 +280,8 @@ static int nfs_netfs_init_request(struct netfs_io_request *rreq, struct file *fi
>
> static void nfs_netfs_free_request(struct netfs_io_request *rreq)
> {
> - put_nfs_open_context(rreq->netfs_priv);
> + if (rreq->netfs_priv)
> + put_nfs_open_context(rreq->netfs_priv);
> }
>
> static struct nfs_netfs_io_data *nfs_netfs_alloc(struct netfs_io_subrequest *sreq)
>
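
As a rough userspace illustration of the guard described in the quoted
commit message (not kernel code, and not a backport for 6.11): the sketch
below returns early when no file is supplied, rejects that case for
anything other than a copy-to-cache request, and skips the context release
when no context was taken. All mock_* names are hypothetical stand-ins,
not the real netfs or NFS definitions, and the refcount is a plain int
rather than a kernel refcount.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel types; NOT the real definitions. */
enum mock_origin { MOCK_READAHEAD, MOCK_COPY_TO_CACHE };

struct mock_ctx { int refcount; };
struct mock_file { struct mock_ctx *ctx; };
struct mock_request {
	enum mock_origin origin;
	struct mock_ctx *priv;	/* plays the role of rreq->netfs_priv */
};

/* Mirrors the shape of the patched init hook: tolerate a NULL file. */
static int mock_init_request(struct mock_request *rreq, struct mock_file *file)
{
	if (!file) {
		/* Only a copy-to-cache request may legitimately lack a file. */
		if (rreq->origin != MOCK_COPY_TO_CACHE) {
			fprintf(stderr, "warning: NULL file for non-cache request\n");
			return -EIO;
		}
		return 0;	/* no context needed, priv stays NULL */
	}

	/* Normal path: take a reference on the open context. */
	file->ctx->refcount++;
	rreq->priv = file->ctx;
	return 0;
}

/* Mirrors the patched free hook: only drop a reference that was taken. */
static void mock_free_request(struct mock_request *rreq)
{
	if (rreq->priv) {
		rreq->priv->refcount--;
		rreq->priv = NULL;
	}
}
```

Calling mock_init_request() with a NULL file on a MOCK_COPY_TO_CACHE
request succeeds and leaves priv unset, so the matching
mock_free_request() is a no-op instead of dereferencing a NULL context.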