Message-ID: <823057f0-95cf-bfcf-c39f-ca5d7abe2372@puri.sm>
Date:   Tue, 30 Jun 2020 05:33:25 +0200
From:   Martin Kepplinger <martin.kepplinger@...i.sm>
To:     Alan Stern <stern@...land.harvard.edu>
Cc:     Bart Van Assche <bvanassche@....org>, jejb@...ux.ibm.com,
        Can Guo <cang@...eaurora.org>, martin.petersen@...cle.com,
        linux-scsi@...r.kernel.org, linux-kernel@...r.kernel.org,
        kernel@...i.sm
Subject: Re: [PATCH] scsi: sd: add runtime pm to open / release

On 29.06.20 18:15, Alan Stern wrote:
> On Mon, Jun 29, 2020 at 11:42:59AM +0200, Martin Kepplinger wrote:
>>
>>
>> On 26.06.20 17:44, Alan Stern wrote:
>>> Martin's best approach would be to add some debugging code to find out why 
>>> blk_queue_enter() isn't calling blk_pm_request_resume(), or why that call 
>>> doesn't lead to pm_request_resume().
>>>
>>> Alan Stern
>>>
>>
>> Hi Alan,
>>
>> blk_queue_enter() always - especially when sd is runtime suspended and I
>> try to mount as above - sets success to be true for me, so never
>> continues down to blk_pm_request_resume(). All I see is "PM: Removing
>> info for No Bus:sda1".
> 
> Aha.  Looking at this more closely, it's apparent that the code in 
> blk-core.c contains a logic bug: It assumes that if the BLK_MQ_REQ_PREEMPT 
> flag is set then the request can be issued regardless of the queue's 
> runtime status.  That is not correct when the queue is suspended.
> 
> Below is my attempt to fix this up.  I'm not sure that the patch is 
> entirely correct, but it should fix this logic bug.  I would appreciate a 
> critical review.
> 
> Martin, does this fix the problem?
> 
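
As a rough illustration of the logic bug Alan describes above, here is a
simplified user-space model of the admission check. It is only a sketch: the
struct, enum and function names are hypothetical and are not the kernel's, and
since the patch itself is not quoted in this message, the second function is
one assumed way to take the runtime status into account, not Alan's actual fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the queue's runtime PM status. */
enum rpm_model { RPM_MODEL_ACTIVE, RPM_MODEL_SUSPENDED };

/* Hypothetical, heavily reduced model of a request queue. */
struct queue_model {
	bool pm_only;        /* only PM (preempt) requests may enter */
	enum rpm_model rpm;  /* current runtime PM status */
};

/*
 * Behaviour described as buggy: a request carrying the preempt flag is
 * admitted regardless of the queue's runtime status.
 */
static bool admit_as_described(const struct queue_model *q, bool preempt)
{
	assert(q != NULL);
	return preempt || !q->pm_only;
}

/*
 * Assumed correction for illustration only: a preempt request is admitted
 * only while the queue is not suspended.
 */
static bool admit_checking_status(const struct queue_model *q, bool preempt)
{
	assert(q != NULL);
	return (preempt && q->rpm != RPM_MODEL_SUSPENDED) || !q->pm_only;
}
```

For a suspended queue with pm_only set and a preempt request, the first check
admits the request, while the second refuses it until the queue is no longer
suspended.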

Not quite: mounting works, and the device does now resume when I copy a
file, but the I/O itself still fails with "device offline or changed":

[  167.167615] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=0s
[  167.167630] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current]
[  167.167638] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0
[  167.167648] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 24 c2 00 00 01 00
[  167.167658] blk_update_request: I/O error, dev sda, sector 9410 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[  167.178327] FAT-fs (sda1): FAT read failed (blocknr 1218)
[  167.183895] sd 0:0:0:0: [sda] tag#0 device offline or changed
[  167.189695] blk_update_request: I/O error, dev sda, sector 5101888 op 0x0:(READ) flags 0x80700 phys_seg 8 prio class 0
[  167.200510] sd 0:0:0:0: [sda] tag#0 device offline or changed
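
To help read the log above: sense key 0x6 is UNIT ATTENTION, and ASC 0x28 /
ASCQ 0x0 is "not ready to ready change, medium may have changed" in the SCSI
code tables. The CDB is a READ(10). The following is a minimal sketch of how
those bytes can be decoded; the struct and function names are hypothetical
helpers for illustration, not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the fields of a READ(10) CDB. */
struct read10 {
	uint32_t lba;     /* logical block address, CDB bytes 2..5 */
	uint16_t blocks;  /* transfer length in blocks, CDB bytes 7..8 */
};

/* Decode a 10-byte READ(10) CDB (opcode 0x28); returns 0 on success. */
static int parse_read10(const uint8_t *cdb, size_t len, struct read10 *out)
{
	if (cdb == NULL || out == NULL || len < 10 || cdb[0] != 0x28)
		return -1;

	/* Multi-byte CDB fields are big-endian. */
	out->lba = ((uint32_t)cdb[2] << 24) | ((uint32_t)cdb[3] << 16) |
		   ((uint32_t)cdb[4] << 8) | (uint32_t)cdb[5];
	out->blocks = (uint16_t)(((unsigned int)cdb[7] << 8) | cdb[8]);
	return 0;
}

/* Name a few common sense keys; only a subset is listed here. */
static const char *sense_key_name(uint8_t key)
{
	switch (key & 0x0f) {
	case 0x0: return "NO SENSE";
	case 0x2: return "NOT READY";
	case 0x5: return "ILLEGAL REQUEST";
	case 0x6: return "UNIT ATTENTION";
	default:  return "other";
	}
}
```

Feeding it the bytes from the log (28 00 00 00 24 c2 00 00 01 00) gives LBA
0x24c2, i.e. 9410, with a transfer length of 1 block, which matches the
"sector 9410" in the blk_update_request line if the logical block size is
512 bytes.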


A later attempt to copy a file yields only the following (mostly my own
debug prints):


[  371.110798] blk_queue_enter: wait_event: pm=0
[  371.300666] scsi_runtime_resume
[  371.303834] scsi_runtime_resume
[  371.307007] scsi_runtime_resume
[  371.310213] sd 0:0:0:0: [sda] tag#0 device offline or changed
[  371.316011] blk_update_request: I/O error, dev sda, sector 5101888 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[  372.560690] scsi_runtime_suspend
[  372.563968] scsi_runtime_suspend
[  372.567237] scsi_runtime_suspend

Thanks, Alan, for taking the time to work on this! You're close.
What is missing?

                                martin
