linux-kernel - Re: 4.15.14 crash with iscsi target and dvd

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <595a10cfb387e6b2ab4d2053b84fed9b3da9e079.camel@wdc.com>
Date:   Tue, 3 Apr 2018 17:03:24 +0000
From:   Bart Van Assche <Bart.VanAssche@....com>
To:     "wakko@...mx.eu.org" <wakko@...mx.eu.org>
CC:     "linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "richard.weinberger@...il.com" <richard.weinberger@...il.com>,
        "linux-block@...r.kernel.org" <linux-block@...r.kernel.org>
Subject: Re: 4.15.14 crash with iscsi target and dvd

On Sun, 2018-04-01 at 14:27 -0400, Wakko Warner wrote:
> Wakko Warner wrote:
> > Wakko Warner wrote:
> > > I tested 4.14.32 last night with the same oops.  4.9.91 works fine.
> > > From the initiator, if I do cat /dev/sr1 > /dev/null it works.  If I mount
> > > /dev/sr1 and then do find -type f | xargs cat > /dev/null the target
> > > crashes.  I'm using the builtin iscsi target with pscsi.  I can burn from
> > > the initiator with out problems.  I'll test other kernels between 4.9 and
> > > 4.14.
> > 
> > So I've tested 4.x.y where x one of 10 11 12 14 15 and y is the latest patch
> > (except for 4.15 which was 1 behind)
> > Each of these kernels crash within seconds or immediate of doing find -type
> > f | xargs cat > /dev/null from the initiator.
> 
> I tried 4.10.0.  It doesn't completely lockup the system, but the device
> that was used hangs.  So from the initiator, it's /dev/sr1 and from the
> target it's /dev/sr0.  Attempting to read /dev/sr0 after the oops causes the
> process to hang in D state.

Hello Wakko,

Thank you for having narrowed down this further. I think that you encountered
a regression either in the block layer core or in the SCSI core. Unfortunately
the number of changes between kernel versions v4.9 and v4.10 in these two
subsystems is huge. I see two possible ways forward:
- Either that you perform a bisect to identify the patch that introduced this
  regression. However, I'm not sure whether you are familiar with the bisect
  process.
- Or that you identify the command that triggers this crash such that others
  can reproduce this issue without needing access to your setup.

How about reproducing this crash with the below patch applied on top of
kernel v4.15.x? The additional output sent by this patch to the system log
should allow us to reproduce this issue by submitting the same SCSI command
with sg_raw.

Thanks,

Bart.


Subject: [PATCH] Report commands with no physical segments in the system log

---
 drivers/scsi/scsi_lib.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 6b6a6705f6e5..74a39db57d49 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1093,8 +1093,10 @@ int scsi_init_io(struct scsi_cmnd *cmd)
 	bool is_mq = (rq->mq_ctx != NULL);
 	int error = BLKPREP_KILL;
 
-	if (WARN_ON_ONCE(!blk_rq_nr_phys_segments(rq)))
+	if (WARN_ON_ONCE(!blk_rq_nr_phys_segments(rq))) {
+		scsi_print_command(cmd);
 		goto err_exit;
+	}
 
 	error = scsi_init_sgtable(rq, &cmd->sdb);
 	if (error)