Message-ID: <4C6B7F4A.2040807@kernel.org>
Date: Wed, 18 Aug 2010 08:35:54 +0200
From: Tejun Heo <tj@...nel.org>
To: Christoph Hellwig <hch@....de>
CC: jaxboe@...ionio.com, linux-fsdevel@...r.kernel.org,
linux-scsi@...r.kernel.org, linux-ide@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-raid@...r.kernel.org,
James.Bottomley@...e.de, tytso@....edu, chris.mason@...cle.com,
swhiteho@...hat.com, konishi.ryusuke@....ntt.co.jp,
dm-devel@...hat.com, vst@...b.net, jack@...e.cz,
rwheeler@...hat.com, hare@...e.de
Subject: Re: [PATCHSET block#for-2.6.36-post] block: replace barrier with
sequenced flush
Hello,
On 08/17/2010 06:59 PM, Christoph Hellwig wrote:
> I think we really need all the conversions in one tree, block layer,
> remapping drivers and filesystems.
I don't know. If filesystem changes are really trivial maybe, but
md/dm changes seem a bit too invasive to go through the block tree.
> Btw, I've done the conversion for all filesystems and I'm running tests
> over them now. Expect the series late today or tomorrow.
Cool. :-)
>> I might just resequence it to finish this part of discussion but what
>> does that really buy us? It's not really gonna help bisection.
>> Bisection won't be able to tell anything in higher resolution than
>> "the new implementation doesn't work". If you show me how it would
>> actually help, I'll happily reshuffle the patches.
>
> It's not bisecting to find bugs in the barrier conversion. We can't
> easily bisect it down anyway. The problem is when we try to bisect
> other problems and get into the middle of the series barriers suddenly
> are gone. Which is not very helpful for things like data integrity
> problems in filesystems.
Ah, okay, hmmm.... alright, I'll resequence the patches. If the
filesystem changes can be put into a single tree somehow, we can keep
things mostly working at least for direct devices.
>> IIUC, when any of flushes get DM_ENDIO_REQUEUE (which tells the dm
>> core layer to retry the whole bio later), it trumps all other failures
>> and the bio is retried later. That was why DM_ENDIO_REQUEUE was
>> prioritized over other error codes, which actually is sort of
>> incorrect in that once a FLUSH fails, it _MUST_ be reported to upper
>> layers as FLUSH failure implies data already lost. So,
>> DM_ENDIO_REQUEUE actually should have lower priority than other
>> failures. But, then again, the error codes still need to be
>> prioritized.
>
> I think that's something we better leave to the DM team.
Sure, but we shouldn't be ripping out the code to do that.
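To make the priority ordering argued above concrete, here is a minimal sketch of how results from several flush clones could be merged so that a hard failure is never masked by a requeue request. This is not dm's actual code: SKETCH_ENDIO_REQUEUE, sketch_rank(), sketch_merge_flush_result() and sketch_combine() are hypothetical names used only for illustration, and the numeric value chosen for the requeue code is an assumption.

```c
#include <errno.h>

/* Hypothetical stand-in for DM_ENDIO_REQUEUE; the value is an assumption. */
#define SKETCH_ENDIO_REQUEUE 2

/* Rank a completion result: a higher rank wins when results are merged. */
static int sketch_rank(int result)
{
	if (result == 0)
		return 0;	/* success */
	if (result == SKETCH_ENDIO_REQUEUE)
		return 1;	/* retry the whole bio later */
	return 2;		/* hard failure such as -EIO: data may be lost */
}

/* Fold one flush clone's result into the accumulated result. */
static int sketch_merge_flush_result(int acc, int result)
{
	/* Strictly greater, so the first hard failure seen is the one kept. */
	return sketch_rank(result) > sketch_rank(acc) ? result : acc;
}

/* Combine the results of n flush clones into the value to report upward. */
static int sketch_combine(const int *results, int n)
{
	int acc = 0;
	int i;

	for (i = 0; i < n; i++)
		acc = sketch_merge_flush_result(acc, results[i]);
	return acc;
}
```

With this ordering, a mix of success, requeue and -EIO reports -EIO, while a mix of only success and requeue still asks for a retry. The point is only that some ranking of error codes has to stay in place, whichever order the DM team settles on.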
Thanks.
--
tejun