linux-kernel - Re: [PATCH] nfsd: remove unsafe BUG_ON from set_change

Message-ID: <016b04630ce7e168cbaacb1a27bd95b966b8c64e.camel@kernel.org>
Date:   Thu, 20 Jul 2023 12:38:56 -0400
From:   Jeff Layton <jlayton@...nel.org>
To:     Chuck Lever III <chuck.lever@...cle.com>
Cc:     Neil Brown <neilb@...e.de>, Olga Kornievskaia <kolga@...app.com>,
        Dai Ngo <dai.ngo@...cle.com>, Tom Talpey <tom@...pey.com>,
        Linux NFS Mailing List <linux-nfs@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Boyang Xue <bxue@...hat.com>
Subject: Re: [PATCH] nfsd: remove unsafe BUG_ON from set_change_info

On Thu, 2023-07-20 at 15:37 +0000, Chuck Lever III wrote:
> 
> > On Jul 20, 2023, at 11:33 AM, Jeff Layton <jlayton@...nel.org> wrote:
> > 
> > On Thu, 2023-07-20 at 15:15 +0000, Chuck Lever III wrote:
> > > 
> > > > On Jul 20, 2023, at 10:59 AM, Jeff Layton <jlayton@...nel.org> wrote:
> > > > 
> > > > At one time, nfsd would scrape inode information directly out of struct
> > > > inode in order to populate the change_info4. At that time, the BUG_ON in
> > > > set_change_info made some sense, since having it unset meant a coding
> > > > error.
> > > > 
> > > > More recently, it calls vfs_getattr to get this information, which can
> > > > fail. If that fails, fh_pre_saved can end up not being set. While this
> > > > situation is unfortunate, we don't need to crash the box.
> > > 
> > > I'm always happy to get rid of a BUG_ON(). But I'm not sure even
> > > a warning is necessary in this case. It's not likely that it's
> > > a software bug or something that the server administrator can
> > > do something about.
> > > 
> > > Can you elaborate on why the vfs_getattr() might fail? Eg, how
> > > was it failing in 2223560 ?
> > > 
> > 
> > I'm fine with dropping the WARN_ON. You are correct that there is
> > probably little the admin can do about it.
> > 
> > vfs_getattr can fail for all sorts of reasons. It really depends on the
> > underlying filesystem. In 2223560, I don't know for sure, but just prior
> > to the oops, there were these messages in the log:
> > 
> > [51935.482019] XFS (vda3): Filesystem has been shut down due to log error (0x2). 
> > [51935.482020] XFS (vda3): Please unmount the filesystem and rectify the problem(s). 
> > [51935.482550] vda3: writeback error on inode 25320400, offset 2097152, sector 58684120 
> > 
> > My assumption was that the fs being shut down caused some VFS operations
> > to start returning errors (including getattr) and that is why
> > fh_pre_saved ultimately didn't get set.
> 
> I'm wondering if the operation should just fail in this case
> rather than return a cobbled-up changeinfo4. Maybe for another
> day.
> 

Actually, this doesn't look too hard to do. We should be able to just
unwind and return an error in all cases if collecting pre_op_attrs
fails.

The trickier bit is what to do if collecting post_op_attrs fails after
pre_op_attrs were collected and the operation itself succeeded. What
should go into the after_change value? 0? Should we just copy the
before_change value?
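
The two cases above can be sketched in a small user-space C model. This
is only an illustration under stated assumptions, not the nfsd code:
struct fh_model, struct change_info_model, enum after_policy and
fill_change_info() are hypothetical names, -EIO is an arbitrary choice of
error, and clearing the atomic flag when the post-op attributes are
missing is an assumption. The policy argument simply encodes the two
candidate answers to the open question (0, or a copy of before_change).

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-filehandle saved attribute state. */
struct fh_model {
	bool     pre_saved;    /* pre-op attributes were collected */
	bool     post_saved;   /* post-op attributes were collected */
	uint64_t pre_change;   /* change attribute before the operation */
	uint64_t post_change;  /* change attribute after the operation */
};

/* Hypothetical stand-in for the change_info4 reply fields. */
struct change_info_model {
	bool     atomic;
	uint64_t before_change;
	uint64_t after_change;
};

/* The two fallbacks floated above for a missing after_change value. */
enum after_policy {
	AFTER_ZERO,         /* report 0 */
	AFTER_COPY_BEFORE,  /* reuse the before_change value */
};

/*
 * Fill *ci from *fh. Returns 0 on success, or -EIO (an arbitrary error
 * chosen for this sketch) when the pre-op attributes were never saved,
 * so the caller can unwind instead of crashing.
 */
static int fill_change_info(const struct fh_model *fh,
			    struct change_info_model *ci,
			    enum after_policy policy)
{
	if (!fh->pre_saved)
		return -EIO;

	ci->before_change = fh->pre_change;

	if (fh->post_saved) {
		ci->after_change = fh->post_change;
		ci->atomic = true;
		return 0;
	}

	/* Post-op collection failed after the operation succeeded. */
	ci->after_change = (policy == AFTER_COPY_BEFORE) ? fh->pre_change : 0;
	ci->atomic = false;  /* assumption: do not claim atomicity here */
	return 0;
}
```

In this model a caller checks the return value before doing the
operation's work, so a failed pre-op collection turns into an error
reply rather than a BUG_ON.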

-- 
Jeff Layton <jlayton@...nel.org>