lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <MWHPR21MB159382417618D56A90F70E5BD7379@MWHPR21MB1593.namprd21.prod.outlook.com>
Date:   Tue, 8 Jun 2021 14:48:44 +0000
From:   Michael Kelley <mikelley@...rosoft.com>
To:     Long Li <longli@...rosoft.com>, KY Srinivasan <kys@...rosoft.com>,
        "martin.petersen@...cle.com" <martin.petersen@...cle.com>,
        "wei.liu@...nel.org" <wei.liu@...nel.org>,
        "jejb@...ux.ibm.com" <jejb@...ux.ibm.com>,
        "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>
Subject: RE: [PATCH 3/3] scsi: storvsc: Correctly handle multiple flags in
 srb_status

From: Long Li <longli@...rosoft.com> Sent: Monday, June 7, 2021 3:25 PM
> >
> > Hyper-V is observed to sometimes set multiple flags in the srb_status, such
> > as ABORTED and ERROR. Current code in
> > storvsc_handle_error() handles only a single flag being set, and does nothing
> > when multiple flags are set.  Fix this by changing the case statement into a
> > series of "if" statements testing individual flags. The functionality for handling
> > each flag is unchanged.
> >
> > Signed-off-by: Michael Kelley <mikelley@...rosoft.com>
> > ---
> >  drivers/scsi/storvsc_drv.c | 61 +++++++++++++++++++++++++----------------
> > -----
> >  1 file changed, 33 insertions(+), 28 deletions(-)
> >
> > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index
> > fff9441..e96d2aa 100644
> > --- a/drivers/scsi/storvsc_drv.c
> > +++ b/drivers/scsi/storvsc_drv.c
> > @@ -1009,17 +1009,40 @@ static void storvsc_handle_error(struct
> > vmscsi_request *vm_srb,
> >  	struct storvsc_scan_work *wrk;
> >  	void (*process_err_fn)(struct work_struct *work);
> >  	struct hv_host_device *host_dev = shost_priv(host);
> > -	bool do_work = false;
> >
> > -	switch (SRB_STATUS(vm_srb->srb_status)) {
> > -	case SRB_STATUS_ERROR:
> > +	/*
> > +	 * In some situations, Hyper-V sets multiple bits in the
> > +	 * srb_status, such as ABORTED and ERROR. So process them
> > +	 * individually, with the most specific bits first.
> > +	 */
> > +
> > +	if (vm_srb->srb_status & SRB_STATUS_INVALID_LUN) {
> > +		set_host_byte(scmnd, DID_NO_CONNECT);
> > +		process_err_fn = storvsc_remove_lun;
> > +		goto do_work;
> > +	}
> > +
> > +	if (vm_srb->srb_status & SRB_STATUS_ABORTED) {
> > +		if (vm_srb->srb_status & SRB_STATUS_AUTOSENSE_VALID &&
> > +		    /* Capacity data has changed */
> > +		    (asc == 0x2a) && (ascq == 0x9)) {
> > +			process_err_fn = storvsc_device_scan;
> > +			/*
> > +			 * Retry the I/O that triggered this.
> > +			 */
> > +			set_host_byte(scmnd, DID_REQUEUE);
> > +			goto do_work;
> > +		}
> > +	}
> > +
> > +	if (vm_srb->srb_status & SRB_STATUS_ERROR) {
> 
> I'm curious on SRB_STATUS_INVALID_LUN and SRB_STATUS_ABORTED, the actions on
> those two codes are different.
> 
> It doesn't look like we can get those two in the same status code.

Agreed.  I've only observed having ERROR and ABORTED in the same status code.
That happens when a bad PFN is passed on a I/O request, which can be easily
tested.  With the old code, having ERROR and ABORTED together resulted in doing
nothing.

The behavior of INVALID_LUN is unchanged by this patch.  But with this patch,
if INVALID_LUN and ABORTED should ever occur together, we would process the
INVALID_LUN error, and ignore ABORTED, which I think makes sense. 
INVALID_LUN is the most serious error, so we process it first.  Then ABORTED is
next in priority for the specific case of "Capacity data has changed".  Finally we
handle ERROR.

Michael

> 
> >  		/*
> >  		 * Let upper layer deal with error when
> >  		 * sense message is present.
> >  		 */
> > -
> >  		if (vm_srb->srb_status & SRB_STATUS_AUTOSENSE_VALID)
> > -			break;
> > +			return;
> > +
> >  		/*
> >  		 * If there is an error; offline the device since all
> >  		 * error recovery strategies would have already been @@ -
> > 1032,37 +1055,19 @@ static void storvsc_handle_error(struct vmscsi_request *vm_srb,
> >  			set_host_byte(scmnd, DID_PASSTHROUGH);
> >  			break;
> >  		/*
> > -		 * On Some Windows hosts TEST_UNIT_READY command can
> > return
> > -		 * SRB_STATUS_ERROR, let the upper level code deal with it
> > -		 * based on the sense information.
> > +		 * On some Hyper-V hosts TEST_UNIT_READY command can
> > +		 * return SRB_STATUS_ERROR. Let the upper level code
> > +		 * deal with it based on the sense information.
> >  		 */
> >  		case TEST_UNIT_READY:
> >  			break;
> >  		default:
> >  			set_host_byte(scmnd, DID_ERROR);
> >  		}
> > -		break;
> > -	case SRB_STATUS_INVALID_LUN:
> > -		set_host_byte(scmnd, DID_NO_CONNECT);
> > -		do_work = true;
> > -		process_err_fn = storvsc_remove_lun;
> > -		break;
> > -	case SRB_STATUS_ABORTED:
> > -		if (vm_srb->srb_status & SRB_STATUS_AUTOSENSE_VALID &&
> > -		    (asc == 0x2a) && (ascq == 0x9)) {
> > -			do_work = true;
> > -			process_err_fn = storvsc_device_scan;
> > -			/*
> > -			 * Retry the I/O that triggered this.
> > -			 */
> > -			set_host_byte(scmnd, DID_REQUEUE);
> > -		}
> > -		break;
> >  	}
> > +	return;
> >
> > -	if (!do_work)
> > -		return;
> > -
> > +do_work:
> >  	/*
> >  	 * We need to schedule work to process this error; schedule it.
> >  	 */
> > --
> > 1.8.3.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ