Date:   Thu, 21 Jul 2022 11:14:55 -0700
From:   Bart Van Assche <bvanassche@....org>
To:     Geert Uytterhoeven <geert@...ux-m68k.org>
Cc:     "Martin K . Petersen" <martin.petersen@...cle.com>,
        Jaegeuk Kim <jaegeuk@...nel.org>,
        scsi <linux-scsi@...r.kernel.org>,
        Ming Lei <ming.lei@...hat.com>, Hannes Reinecke <hare@...e.de>,
        John Garry <john.garry@...wei.com>, ericspero@...oud.com,
        jason600.groome@...il.com,
        Linux-Renesas <linux-renesas-soc@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support

On 7/21/22 01:07, Geert Uytterhoeven wrote:
> On Wed, Jul 20, 2022 at 8:04 PM Bart Van Assche <bvanassche@....org> wrote:
>> That's surprising. Is there anything unusual about the test setup that I
>> should know, e.g. very small number of CPU cores or a very small queue
>> depth of the SATA device? How about adding pr_info() statements at the
>> start and end of the following functions and also before the return
>> statements in these functions to determine where execution of the START
>> command hangs?
>> * sd_start_done().
>> * sd_start_done_work().
> 
> None of these functions seem to be called at all?
That's weird. It means that either sd_submit_start() hangs or the START 
command never completes. The latter is unlikely, since the SCSI error 
handler is supposed to abort commands that hang. It would also be odd for 
sd_submit_start() to hang before the START command is submitted, since 
the code flow for submitting the START command with this patch is very 
similar to the code flow without patch "scsi: sd: Rework asynchronous 
resume support" (calling scsi_execute()).

What is also weird is that there are at least two SATA setups on which 
this code works fine, including my Qemu setup.

Although it is possible to enable tracing at boot time, adding the 
following parameters to the kernel command line would generate too much 
logging data:

tp_printk 
trace_event=block_rq_complete,block_rq_error,block_rq_insert,block_rq_issue,block_rq_merge,block_rq_remap,block_rq_requeue,scsi_dispatch_cmd_done,scsi_dispatch_cmd_start,scsi_eh_wakeup,scsi_dispatch_cmd_error,scsi_dispatch_cmd_timeout 
scsi_mod.scsi_logging_level=32256
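
For reference, the scsi_logging_level value above can be decoded with a short script. This is only a sketch: it assumes the bitfield layout from drivers/scsi/scsi_logging.h (ten facilities, three bits each, starting with ERROR at bit 0), and the helper name decode_scsi_logging_level() is made up for illustration.

```python
# Assumed layout (drivers/scsi/scsi_logging.h): each logging facility
# occupies a 3-bit field (levels 0-7), in this order from bit 0 upward.
FACILITIES = [
    "ERROR", "TIMEOUT", "SCAN", "MLQUEUE", "MLCOMPLETE",
    "LLQUEUE", "LLCOMPLETE", "HLQUEUE", "HLCOMPLETE", "IOCTL",
]
BITS_PER_FACILITY = 3


def decode_scsi_logging_level(value):
    """Hypothetical helper: split the bitmask into per-facility levels."""
    mask = (1 << BITS_PER_FACILITY) - 1
    return {
        name: (value >> (index * BITS_PER_FACILITY)) & mask
        for index, name in enumerate(FACILITIES)
    }


levels = decode_scsi_logging_level(32256)
# Show only the facilities that have logging enabled.
enabled = {name: level for name, level in levels.items() if level}
print(enabled)  # {'MLQUEUE': 7, 'MLCOMPLETE': 7}
```

Under that assumption, 32256 (0x7e00) sets the mid-level queueing and mid-level completion facilities to their maximum level and leaves the others off, which is consistent with the concern about log volume.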

I'm not sure what the best way is to proceed since I cannot reproduce 
this issue.

Bart.
