linux-kernel - Re: [PATCH] scsi: sd: add runtime pm to open / release

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <6f0c530f-4309-ab1e-393b-83bf8367f59e@puri.sm>
Date:   Tue, 11 Aug 2020 09:55:54 +0200
From:   Martin Kepplinger <martin.kepplinger@...i.sm>
To:     Alan Stern <stern@...land.harvard.edu>
Cc:     James Bottomley <James.Bottomley@...senpartnership.com>,
        Bart Van Assche <bvanassche@....org>,
        Can Guo <cang@...eaurora.org>, martin.petersen@...cle.com,
        linux-scsi@...r.kernel.org, linux-kernel@...r.kernel.org,
        kernel@...i.sm
Subject: Re: [PATCH] scsi: sd: add runtime pm to open / release

On 10.08.20 16:13, Alan Stern wrote:
> On Mon, Aug 10, 2020 at 02:03:17PM +0200, Martin Kepplinger wrote:
>> On 09.08.20 17:26, Alan Stern wrote:
>>> This is a somewhat fragile approach.  You don't know for certain that 
>>> scsi_noretry_cmd will be called.  Also, scsi_noretry_cmd can be called 
>>> from other places.
>>>
>>> It would be better to clear the expecting_media_change flag just before 
>>> returning from scsi_decide_disposition.  That way its use is localized 
>>> to one routine, not spread out between two.
>>>
>>> Alan Stern
>>>
>>
>> Hi Alan,
>>
>> maybe you're right. I initially just thought that I'd allow for specific
>> error codes in scsi_noretry_cmd() to return non-NULL (BUS_BUSY, PARITY,
>> ERROR) despite having the flag set.
>>
>> The below version works equally fine for me but I'm not sure if it's
>> actually more safe.
>>
>> James, when exposing a new writable sysfs option like
>> "suspend_no_media_change"(?) that drivers can check before setting the
>> new "expecting_media_change" flag (during resume), would this addition
>> make sense to you?
>>
>> thanks,
>>
>>                       martin
>>
>>
>>
>> --- a/drivers/scsi/scsi_error.c
>> +++ b/drivers/scsi/scsi_error.c
>> @@ -565,6 +565,18 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
>>  				return NEEDS_RETRY;
>>  			}
>>  		}
>> +		if (scmd->device->expecting_media_change) {
>> +			if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) {
>> +				/*
>> +				 * clear the expecting_media_change in
>> +				 * scsi_decide_disposition() because we
>> +				 * need to catch possible "fail fast" overrides
>> +				 * that block readahead can cause.
>> +				 */
>> +				return NEEDS_RETRY;
>> +			}
>> +		}
>> +
>>  		/*
>>  		 * we might also expect a cc/ua if another LUN on the target
>>  		 * reported a UA with an ASC/ASCQ of 3F 0E -
>> @@ -1944,9 +1956,19 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
>>  	 * the request was not marked fast fail.  Note that above,
>>  	 * even if the request is marked fast fail, we still requeue
>>  	 * for queue congestion conditions (QUEUE_FULL or BUSY) */
>> -	if ((++scmd->retries) <= scmd->allowed
>> -	    && !scsi_noretry_cmd(scmd)) {
>> -		return NEEDS_RETRY;
>> +	if ((++scmd->retries) <= scmd->allowed) {
>> +		/*
>> +		 * but scsi_noretry_cmd() cannot override the
>> +		 * expecting_media_change flag.
>> +		 */
>> +		if (!scsi_noretry_cmd(scmd) ||
>> +		    scmd->device->expecting_media_change) {
>> +			scmd->device->expecting_media_change = 0;
>> +			return NEEDS_RETRY;
>> +		} else {
>> +			/* marked fast fail and not expected. */
>> +			return SUCCESS;
>> +		}
>>  	} else {
> 
> This may not matter...  but it's worth pointing out that 
> expecting_media_change doesn't get cleared if ++scmd->retries > 
> scmd->allowed.

absolutely worth pointing out and I'm not yet sure about that one.

> 
> It also doesn't get cleared in cases where the device _doesn't_ 
> report a Unit Attention.

true. but don't we set the flag for a future UA we don't yet know of? If
we would want to clear it outside of a UA, I think we'd need to keep
track of a suspend/resume cycle and if we see that we *had* successfully
"done requests" after resuming, we could clear it...

> 
> Alan Stern
>