[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090331171900.GU5178@kernel.dk>
Date: Tue, 31 Mar 2009 19:19:00 +0200
From: Jens Axboe <jens.axboe@...cle.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Ric Wheeler <rwheeler@...hat.com>,
Fernando Luis Vázquez Cao
<fernando@....ntt.co.jp>, Jeff Garzik <jeff@...zik.org>,
Christoph Hellwig <hch@...radead.org>,
Theodore Tso <tytso@....edu>, Ingo Molnar <mingo@...e.hu>,
Alan Cox <alan@...rguk.ukuu.org.uk>,
Arjan van de Ven <arjan@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Nick Piggin <npiggin@...e.de>, David Rees <drees76@...il.com>,
Jesper Krogh <jesper@...gh.cc>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
chris.mason@...cle.com, david@...morbit.com, tj@...nel.org
Subject: Re: [PATCH 1/7] block: Add block_flush_device()
On Tue, Mar 31 2009, Linus Torvalds wrote:
>
>
> On Tue, 31 Mar 2009, Jens Axboe wrote:
> >
> > So here's a test patch that attempts to just ignore such a failure to
> > flush the caches.
>
> I suspect you should not do it like this.
>
> > diff --git a/fs/bio.c b/fs/bio.c
> > index a040cde..79e3cec 100644
> > --- a/fs/bio.c
> > +++ b/fs/bio.c
> > @@ -1380,7 +1380,17 @@ void bio_check_pages_dirty(struct bio *bio)
> > **/
> > void bio_endio(struct bio *bio, int error)
> > {
> > - if (error)
> > + /*
> > + * Special case here - hide the -EOPNOTSUPP from the driver or
> > + * block layer, dump a warning the first time this happens so that
> > + * the admin knows that we may not provide the ordering guarantees
> > + * that are needed. Don't clear the uptodate bit.
> > + */
> > + if (error == -EOPNOTSUPP && bio_barrier(bio)) {
> > + set_bit(BIO_EOPNOTSUPP, &bio->bi_flags);
> > + blk_queue_set_noflush(bio->bi_bdev);
> > + error = 0;
> > + } else if (error)
>
> I suspect this part is just wrong.
>
> I could easily imagine a driver that returns EOPNOTSUPP only for a certain
> _kind_ of bio.
>
> For example, if the drive doesn't support FUA, then you cannot do a
> serialized IO operation, but you can still mostly do a serialized op
> without any IO attached to it.
FUA we should be able to reliably detect, it's really the cache flush
operation itself that has caused headaches in the past. The -EOPNOTSUPP
really comes from the block layer, not from the device driver. That's
mainly due to the fact that we only send down the actual barrier, if the
driver already said it supported them. If they do fail them, we probably
need to pick up the -EIO bits and pieces and pretend it didn't happen as
well. So it definitely needs more looking into, auditing, and testing.
I'll do that tomorrow.
> IOW, the "empty flush" really _is_ special. An this check should not be in
> the generic "bio_endio()" case, it should only be in the special
> blkdev_issue_flush() case.
>
> I think. No?
The empty flush is special and it is easy to fix that by itself. That
should probably be the first patch in the series. But the retry logic
and such for actual write barriers are the majority of the problems
involved with supporting barriers, and those I want to get rid of.
I think it'll be more clear when I post a real patch series with the
individual steps outlined.
--
Jens Axboe
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists