lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 8 Aug 2020 08:59:09 +0200
From:   Martin Kepplinger <martin.kepplinger@...i.sm>
To:     Alan Stern <stern@...land.harvard.edu>
Cc:     James Bottomley <James.Bottomley@...senpartnership.com>,
        Bart Van Assche <bvanassche@....org>,
        Can Guo <cang@...eaurora.org>, martin.petersen@...cle.com,
        linux-scsi@...r.kernel.org, linux-kernel@...r.kernel.org,
        kernel@...i.sm
Subject: Re: [PATCH] scsi: sd: add runtime pm to open / release

On 07.08.20 16:30, Alan Stern wrote:
> On Fri, Aug 07, 2020 at 11:51:21AM +0200, Martin Kepplinger wrote:
>> it's really strange: below is the change I'm trying. Of course that's
>> only for testing the functionality, nothing how a patch could look like.
>>
>> While I remember it had worked, now (weirdly since I tried that mounting
>> via fstab) it doesn't anymore!
>>
>> What I understand (not much): I handle the error with "retry" via the
>> new flag, but scsi_decide_disposition() returns SUCCESS because of "no
>> more retries"; but it's the first and only time it's called.
> 
> Are you saying that scmd->allowed is set to 0?  Or is scsi_notry_cmd() 
> returning a nonzero value?  Whichever is true, why does it happen that 
> way?

scsi_notry_cmd() is returning 1. (it's retry 1 of 5 allowed).

why is it returning 1? REQ_FAILFAST_DEV is set. It's DID_OK, then "if
(status_byte(scmd->result) != CHECK_CONDITION)" appearently is not true,
then at the end it returns 1 because of REQ_FAILFAST_DEV.

that seems to come from the block layer. why and when? could I change
that so that the scsi error handling stays in control?

> 
> What is the failing command?  Is it a READ(10)?

Not sure how I'd answer that, but here's the test to trigger the error:

mount /dev/sda1 /mnt
cd /mnt
ls
cp file ~/ (if ls "works" and doesn't yet trigger the error)

and that's the (familiar looking) logs when doing so. again: despite the
mentioned workaround in scsi_error and the new expected_media_change
flag *is* set and gets cleared, as it should be. REQ_FAILFAST_DEV seems
to override what I want to do is scsi_error:

[   55.557629] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result:
hostbyte=0x00 driverbyte=0x08 cmd_age=0s
[   55.557639] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current]
[   55.557646] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0
[   55.557657] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 08 fc
e0 00 00 01 00
[   55.557666] blk_update_request: I/O error, dev sda, sector 589024 op
0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[   55.568899] sd 0:0:0:0: [sda] tag#0 device offline or changed
[   55.574691] blk_update_request: I/O error, dev sda, sector 589025 op
0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[   55.585756] sd 0:0:0:0: [sda] tag#0 device offline or changed
[   55.591562] blk_update_request: I/O error, dev sda, sector 589026 op
0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[   55.602274] sd 0:0:0:0: [sda] tag#0 device offline or changed
(... goes on with the same)


> 
>> How can this be? What am I missing?
> 
> It's kind of hard to tell without seeing the error messages or system 
> log or any debugging information.

thanks a lot for getting back to me,

                           martin


> 
> Alan Stern
> 
>> --- a/drivers/scsi/scsi_error.c
>> +++ b/drivers/scsi/scsi_error.c
>> @@ -565,6 +565,13 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
>>  				return NEEDS_RETRY;
>>  			}
>>  		}
>> +		if (scmd->device->expecting_media_change) {
>> +			if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) {
>> +				scmd->device->expecting_media_change = 0;
>> +				return NEEDS_RETRY;
>> +			}
>> +		}
>> +
>>  		/*
>>  		 * we might also expect a cc/ua if another LUN on the target
>>  		 * reported a UA with an ASC/ASCQ of 3F 0E -
>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>> index d90fefffe31b..bb583e403b81 100644
>> --- a/drivers/scsi/sd.c
>> +++ b/drivers/scsi/sd.c
>> @@ -3642,6 +3642,8 @@ static int sd_resume(struct device *dev)
>>  	if (!sdkp)	/* E.g.: runtime resume at the start of sd_probe() */
>>  		return 0;
>>
>> +	sdkp->device->expecting_media_change = 1;
>> +
>>  	if (!sdkp->device->manage_start_stop)
>>  		return 0;
>>
>> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
>> index bc5909033d13..f5fc1af68e00 100644
>> --- a/include/scsi/scsi_device.h
>> +++ b/include/scsi/scsi_device.h
>> @@ -169,6 +169,7 @@ struct scsi_device {
>>  				 * this device */
>>  	unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN
>>  				     * because we did a bus reset. */
>> +	unsigned expecting_media_change:1;
>>  	unsigned use_10_for_rw:1; /* first try 10-byte read / write */
>>  	unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */
>>  	unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */

Powered by blists - more mailing lists