Date:	Fri, 4 Oct 2013 11:12:34 -0700
From:	Eric Seppanen <eric@...estorage.com>
To:	emilne@...hat.com
Cc:	"Nicholas A. Bellinger" <nab@...ux-iscsi.org>,
	KY Srinivasan <kys@...rosoft.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"devel@...uxdriverproject.org" <devel@...uxdriverproject.org>,
	"linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>
Subject: Re: Drivers: scsi: FLUSH timeout

On Fri, Oct 4, 2013 at 5:18 AM, Ewan Milne <emilne@...hat.com> wrote:
> On Thu, 2013-10-03 at 13:48 -0700, Eric Seppanen wrote:
>> Do I/O timeouts and flush timeouts need to be independently adjusted?
>> If you're having trouble with slow operations, it seems likely to be
>> across the board.
>>
>> Flush timeout could be defined as 2x the read/write timeout.  Any
>> other command-specific timeouts could be scaled the same way.
>
> It seems to me that there isn't any reason to expect that the maximum
> amount of time a device might take to perform various operations are
> related by any coefficient.  And, an HBA (particularly iSCSI or FC)
> could very well have different device types connected at different
> target IDs.  So I think the flush timeout should be adjustable on
> a per-device basis.  It's probably related more to the cache size
> on the device than anything else...

There are two possible sources of delay: how long the device itself
might take, and how long the storage fabric might take.

On a local device, only the first matters.  But there are environments
where the second dominates (e.g. a virtual machine, where the
hypervisor's storage uses multipath with a long failover delay).

If somebody wants to set flush timeouts > 60 seconds, I would like to
know if they're trying to address a slow device or a slow fabric.  If
it's the fabric, then it's kind of silly to make them set three
different timeouts to address the same problem.

An alternative way of handling long fabric delays would be to have a
fabric_timeout that gets added to all the other timeouts. It could be a
scsi_host parameter, but that's probably overengineering the problem.
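
To make the additive idea concrete, here is a rough sketch in Python
(not kernel code). The table DEVICE_TIMEOUTS, the function
effective_timeout, and the values in it are made up for illustration;
only the name fabric_timeout comes from the suggestion above.

```python
# Hypothetical per-command device timeouts in seconds (illustrative values only).
DEVICE_TIMEOUTS = {"rw": 30, "flush": 60}


def effective_timeout(command, fabric_timeout=0):
    """Return the device timeout for `command` plus one shared fabric allowance."""
    if fabric_timeout < 0:
        raise ValueError("fabric_timeout must be non-negative")
    # The fabric delay is set once and added to every command-specific
    # timeout, so a slow fabric needs only one knob.
    return DEVICE_TIMEOUTS[command] + fabric_timeout
```

With a fabric allowance of 120 seconds, effective_timeout("flush", 120)
gives 180 and effective_timeout("rw", 120) gives 150, without touching
the per-command values.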

There are already VM vendors that tell customers to adjust the current
sysfs timeout, so the least amount of work would be to make all of the
other timeouts track that one in some way (additive or
multiplicative).
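
A rough Python sketch of what "tracking" could look like, assuming the
existing per-device timeout is read from a sysfs-style file. The
function names, the default factor of 2 (taken from the 2x suggestion
quoted earlier in the thread), and the `extra` parameter are
assumptions for illustration, not an existing interface.

```python
from pathlib import Path


def read_base_timeout(path):
    """Read an integer timeout in seconds from a sysfs-style text file."""
    return int(Path(path).read_text().strip())


def derived_flush_timeout(base, scale=2, extra=None):
    """Derive a flush timeout from the base read/write timeout.

    If `extra` is given, apply the additive rule (base + extra);
    otherwise apply the multiplicative rule (base * scale).
    """
    if extra is not None:
        return base + extra
    return base * scale
```

For example, if a vendor has told the customer to raise the base
timeout to 180, derived_flush_timeout(180) yields 360 under the
multiplicative rule and derived_flush_timeout(180, extra=30) yields 210
under the additive rule. In both cases the customer adjusts only the
one value they already know about.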