Date:	Sun, 16 Nov 2014 19:30:59 +0100
From:	Barto <mister.freeman@...oste.net>
To:	Christoph Hellwig <hch@...radead.org>
CC:	"Elliott, Robert (Server Storage)" <Elliott@...com>,
	Guenter Roeck <linux@...ck-us.net>,
	Bjorn Helgaas <bhelgaas@...gle.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
	Joe Perches <joe@...ches.com>
Subject: Re: BUG in scsi_lib.c due to a bad commit

Hello everyone,

> Also, SCSI_QUEUE_DELAY seems like an arbitrary magic number;
> maybe that value isn't working correctly anymore?

This is an excellent remark from Robert Elliott, and it gives me an
idea: manually set a value in the if() statement (line 1779 of
drivers/scsi/scsi_lib.c).

By default SCSI_QUEUE_DELAY is 3 ms, which might be inappropriate for
some slow hard disks when combined with the changes made by commit
74665016086615bbaa3fa6f83af410a0a4e029ee ( scsi: convert host_busy to
atomic_t ).

After further tests I discovered that a value of 40 ms solves my problem;
the bug is gone with this value.

Here is the patch that sets 40 ms in the if() statement:

--- a/drivers/scsi/scsi_lib.c	2014-10-05 21:23:04.000000000 +0200
+++ b/drivers/scsi/scsi_lib.c	2014-11-16 17:39:16.819674725 +0100
@@ -1776,7 +1776,7 @@ static void scsi_request_fn(struct reque
 	atomic_dec(&sdev->device_busy);
 out_delay:
 	if (!atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev))
-		blk_delay_queue(q, SCSI_QUEUE_DELAY);
+		blk_delay_queue(q, 40);
 }

 static inline int prep_to_mq(int ret)

With this patch SCSI_QUEUE_DELAY itself is still 3 ms; 40 ms is used
only in one specific place in scsi_lib.c (line 1779). That is where the
bug seems to be triggered, which is why I set 40 ms in that
blk_delay_queue() call.
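
To make the effect of the hunk concrete, here is a small userspace sketch
of the decision taken at the out_delay label. It is not kernel code: the
struct, the function sketch_out_delay() and the SKETCH_* constants are
hypothetical names made up for illustration. Only the condition (device
not busy and not blocked) and the 3 ms and 40 ms values come from the
patch above.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the fields the if() statement looks at. */
struct sketch_device {
	atomic_int device_busy;    /* commands currently outstanding */
	atomic_int device_blocked; /* non-zero while the device is blocked */
};

#define SKETCH_QUEUE_DELAY_MS 3  /* mirrors the default SCSI_QUEUE_DELAY */
#define SKETCH_LOCAL_DELAY_MS 40 /* value hard-coded by the patch */

/*
 * Models the out_delay branch: a delayed queue re-run is requested only
 * when the device has no outstanding commands and is not blocked.
 * Returns the delay in milliseconds, or -1 when no re-run is requested.
 */
static int sketch_out_delay(struct sketch_device *sdev, int delay_ms)
{
	if (!atomic_load(&sdev->device_busy) &&
	    !atomic_load(&sdev->device_blocked))
		return delay_ms;
	return -1;
}
```

In this model the patch changes only the delay_ms argument passed at this
one call site; the condition and the global default stay the same.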

After applying this patch I don't see any problems with I/O performance
or speed; all is OK.

The question now is: why does putting a higher value at line 1779 solve
the bug?
And why, before commit 74665016086615bbaa3fa6f83af410a0a4e029ee, did I
have no problems even with a value of 3 ms for SCSI_QUEUE_DELAY?



On 14/11/2014 08:32, Christoph Hellwig wrote:
> On Thu, Nov 13, 2014 at 11:55:38PM +0100, Barto wrote:
>> it's interesting, with this commit
>> 74665016086615bbaa3fa6f83af410a0a4e029ee I have the bug :
>>
>> scsi: convert host_busy to atomic_t :
> 
> At this point we'll need a bisection between v3.16 as the last good
> point, and 74665016086615bbaa3fa6f83af410a0a4e029ee as the known bad
> point.
> 