linux-kernel - Re: [PATCH 2/2] nfs/blocklayout: Limit repeat device registration on failure

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <Zz9SiWqFBPL6RdJt@tissot.1015granger.net>
Date: Thu, 21 Nov 2024 10:32:25 -0500
From: Chuck Lever <chuck.lever@...cle.com>
To: Benjamin Coddington <bcodding@...hat.com>
Cc: Trond Myklebust <trondmy@...nel.org>, Anna Schumaker <anna@...nel.org>,
        linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org,
        Christoph Hellwig <hch@....de>
Subject: Re: [PATCH 2/2] nfs/blocklayout: Limit repeat device registration on
 failure

On Thu, Nov 21, 2024 at 10:11:33AM -0500, Benjamin Coddington wrote:
> On 20 Nov 2024, at 10:22, Chuck Lever wrote:
> 
> > On Wed, Nov 20, 2024 at 09:09:35AM -0500, Benjamin Coddington wrote:
> >> If we're unable to register a SCSI device, ensure we mark the device as
> >> unavailable so that it will timeout and be re-added via GETDEVINFO.  This
> >> avoids repeated doomed attempts to register a device in the IO path.
> >>
> >> Add some clarifying comments as well.
> >>
> >> Fixes: d869da91cccb ("nfs/blocklayout: Fix premature PR key unregistration")
> >> Signed-off-by: Benjamin Coddington <bcodding@...hat.com>
> >> Reviewed-by: Christoph Hellwig <hch@....de>
> >> ---
> >>  fs/nfs/blocklayout/blocklayout.c | 12 +++++++++++-
> >>  1 file changed, 11 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/fs/nfs/blocklayout/blocklayout.c b/fs/nfs/blocklayout/blocklayout.c
> >> index 0becdec12970..b36bc2f4f7e2 100644
> >> --- a/fs/nfs/blocklayout/blocklayout.c
> >> +++ b/fs/nfs/blocklayout/blocklayout.c
> >> @@ -571,19 +571,29 @@ bl_find_get_deviceid(struct nfs_server *server,
> >>  	if (!node)
> >>  		return ERR_PTR(-ENODEV);
> >>
> >> +	/*
> >> +	 * Devices that are marked unavailable are left in the cache with a
> >> +	 * timeout to avoid sending GETDEVINFO after every LAYOUTGET, or
> >> +	 * constantly attempting to register the device.  Once marked as
> >> +	 * unavailable they must be deleted and never reused.
> >> +	 */
> >>  	if (test_bit(NFS_DEVICEID_UNAVAILABLE, &node->flags)) {
> >>  		unsigned long end = jiffies;
> >>  		unsigned long start = end - PNFS_DEVICE_RETRY_TIMEOUT;
> >>
> >>  		if (!time_in_range(node->timestamp_unavailable, start, end)) {
> >> +			/* Force a new GETDEVINFO for this LAYOUT */
> >
> > Or perhaps: "Uncork subsequent GETDEVINFO operations for this device"
> > <shrug>
> 
> Sure, ok!
> 
> >>  			nfs4_delete_deviceid(node->ld, node->nfs_client, id);
> >>  			goto retry;
> >>  		}
> >>  		goto out_put;
> >>  	}
> >>
> >> -	if (!bl_register_dev(container_of(node, struct pnfs_block_dev, node)))
> >> +	/* If we cannot register, treat this device as transient */
> >
> > How about "Make a negative cache entry for this device"
> 
> Hmm - that's closer to the dentry language rather than how we refer to
> temporary error cases in device land.  For me the "transient" has some
> hopeful meaning as in we expect this might work in the future - but I'm ok
> changing this comment.  There will be some NFS clients that might try to do
> pNFS SCSI but will never actually have the devices locally, and so that's
> not a "transient" situation.  This can only fixed today with export policy.

No big deal!


> >> +	if (!bl_register_dev(container_of(node, struct pnfs_block_dev, node))) {
> >> +		nfs4_mark_deviceid_unavailable(node);
> >>  		goto out_put;
> >> +	}
> >>
> >>  	return node;
> >>
> >> -- 
> >> 2.47.0
> >>
> >
> > It took me a bit to understand what this patch does. It is like
> > setting up a negative dentry so the local device cache absorbs
> > bursts of checks for the device. OK.
> 
> Yes, its like the layout error handling, but for devices.
> 
> Its not obvious at this layer, but every IO wants to do LAYOUTGET, then
> figure out which device GETDEVINFO, then here we need to prep the device
> with a reservation.  Its a lot of slow work that makes a mess of IO
> latencies if one of the later steps is going to fail for awhile.

Thanks! It would be nice to add this bit of context to the patch
description.


> > Just an observation: Negative caching has some consequences too.
> > For instance, there will now be a period where, if the device
> > happens to become available, the layout is still unusable. I wonder
> > if that's going to have some undesirable operational effects.
> 
> It sure does, but I don't think there's a simple way to get notified that a
> SCSI device has re-appeared or has started supporting persistent
> reservations.

Reviewed-by: Chuck Lever <chuck.lever@...cle.com>


-- 
Chuck Lever