linux-kernel - Re: [PATCHSET block#for-2.6.36-post] block: replace barrier with sequenced flush

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <4C740BEF.2010806@kernel.org>
Date:	Tue, 24 Aug 2010 20:14:07 +0200
From:	Tejun Heo <tj@...nel.org>
To:	Mike Snitzer <snitzer@...hat.com>
CC:	Kiyoshi Ueda <k-ueda@...jp.nec.com>,
	Hannes Reinecke <hare@...e.de>, tytso@....edu,
	linux-scsi@...r.kernel.org, jaxboe@...ionio.com, jack@...e.cz,
	linux-kernel@...r.kernel.org, swhiteho@...hat.com,
	linux-raid@...r.kernel.org, linux-ide@...r.kernel.org,
	James.Bottomley@...e.de, konishi.ryusuke@....ntt.co.jp,
	linux-fsdevel@...r.kernel.org, vst@...b.net, rwheeler@...hat.com,
	Christoph Hellwig <hch@....de>, chris.mason@...cle.com,
	dm-devel@...hat.com
Subject: Re: [PATCHSET block#for-2.6.36-post] block: replace barrier with
 sequenced flush

Hello,

On 08/24/2010 07:52 PM, Mike Snitzer wrote:
>> If it can't be done quickly enough the retry logic can be kept around
>> to keep the old behavior but that already was a broken behavior, so...
>> :-(
> 
> I'll have to review this thread again to understand why mpath's existing
> retry logic is broken behavior.  mpath is used with more capable SCSI
> devices so I'm missing why a failed FLUSH implies data loss.

SBC doesn't specify the failure behavior, so it could be that retrying
flush could be safe.  But for most disk type devices, flush failure
usually indicates that the device exhausted all the options to commit
some of pending data to NV media - ie. even remapping failed for
whatever reason.  Even if retry is safe, it's more likely to simply
delay notification of failure.

In ATA, the situation is clearer, when a device actively fails a
flush, the drive reports the first failed sector it failed to commit
and the next flush will continue _after_ the sector - IOW, data is
already lost.

<speculation>
I think there's no reason mpath should be tasked with retrying flush
failure.  That's upto the SCSI EH.  If the command failed in 'safe'
transient way - ie. device busy or whatnot, SCSI EH can and does retry
the command.  There are several FAILFAST bits already and SCSI EH can
avoid retrying transport errors for mpath (maybe it already does
that?) and just need to be able to tell upper layer that the failure
was a fast one and upper layer is responsible for retrying?  Is there
any reason to pass the whole sense information upwards?
</speculation>

Anyways, flush failure is different from read/write failures.
Read/writes can always be retried cleanly.  They are stateless.  I
don't know how SCSI devices would actually behavior but it's a bit
scary to retry SYNCHRONIZE_CACHE a device failed and report success
upwards.

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/