lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241210A1712306692debd.njha@janestreet.com>
Date: Tue, 10 Dec 2024 12:12:30 -0500
From: Nikhil Jha <njha@...estreet.com>
To: Benjamin Coddington <bcodding@...hat.com>
Cc: Trond Myklebust <trondmy@...nel.org>, Anna Schumaker <anna@...nel.org>,
 	linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] nfs: propagate fileid changed errors back to syscall

On Tue, Dec 10, 2024 at 07:11:43AM -0500, Benjamin Coddington wrote:
> On 9 Dec 2024, at 12:39, Nikhil Jha wrote:
> 
> > Hello! This is the first kernel patch I have tried to upstream. I'm
> > following along with the kernel newbies guide but apologies if I got
> > anything wrong.
> >
> > Currently, if there is a mismatch in the request and response fileids in
> > an NFS request, the kernel logs an error and attempts to return ESTALE.
> > However, this error is currently dropped before it makes it all the way
> > to userspace. This appears to be a mistake, since as far as I can tell
> > that ESTALE value is never consumed from anywhere.
> >
> > Callstack for async NFS write, at time of error:
> >
> >         nfs_update_inode <- returns -ESTALE
> >         nfs_refresh_inode_locked
> >         nfs_writeback_update_inode <- error is dropped here
> >         nfs3_write_done
> >         nfs_writeback_done
> >         nfs_pgio_result <- other errors are collected here
> >         rpc_exit_task
> >         __rpc_execute
> >         rpc_async_schedule
> >         process_one_work
> >         worker_thread
> >         kthread
> >         ret_from_fork
> >
> > We ran into this issue ourselves, and seeing the -ESTALE in the kernel
> > source code but not from userspace was surprising.
> >
> > I tested a rebased version of this patch on an el8 kernel (v6.1.114),
> > and it seems to correctly propagate this error.
> >
> >> 8------------------------------------------------------8<
> >
> > If an NFS server returns a response with a different file id to the
> > response, the kernel currently prints out an error and attempts to
> > return -ESTALE. However, this -ESTALE value is never surfaced anywhere.
> 
> Hi Nikhil Jha,
> 
> Will this cause us to return -ESTALE to the application even if the WRITE
> was successful?
> 
> Ben
> 

Hi Ben,

Hmm.. I'm not sure how to answer that question exactly. A fileid is only
mismatched between a request RPC and response RPC when something really
weird is going on (i.e. a bug in the NFS server), so it's hard to reason
about if a WRITE was successful or not.

The `return -ESTALE` was already in the kernel code, but this particular
codepath seems to have accidentally dropped that error.

Nikhil


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ