[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070118021306.GA22842@kernel.dk>
Date: Thu, 18 Jan 2007 13:13:06 +1100
From: Jens Axboe <jens.axboe@...cle.com>
To: linux-kernel@...r.kernel.org, KudOS <kudos@...ts.ucla.edu>
Subject: Re: block_device usage and incorrect block writes
On Wed, Jan 17 2007, Chris Frost wrote:
> We are working on a kernel module which uses the linux block device
> interface as part of a larger project, are seeing unexpected block
> write behavior from our usage of the noop scheduler, and were
> wondering whether anyone might have feedback on what the behavior we
> see?
>
> We would like to send block writes such that they are written to the
> drive controller in fifo order, so we are using the noop scheduler.
> However, a small percentage (1-5 of ~50,000) of block writes end up
> with incorrect data on the disk. We have determined that for each of
> these incorrect blocks, the last write for the block was issued while
> a previous write to the block was still queued (that is, the bio end
> function had not yet been called) and that the next to last issued
> write (that is, the generic_make_request function call) for the block
> contains the data that ends up on the disk.
noop doesn't guarentee that IO will be queued with the device in the
order in which they are submitted, and it definitely doesn't guarentee
that the device will process them in the order in which they are
dispatched. noop being FIFO basically means that it will not sort
requests. You can still have reordering if one request gets merged with
another, for instance.
The block layer in general provides no guarentees about ordering of
requests, unless you use barriers. So if you require ordering across a
given write request, it needs to be a write barrier.
> A possibly related (and unexpected) behavior we have noticed is that
> the bio end function is not always called in the same order as our
> calls to generic_make_request(). We are not sure whether this
> indicates that the requests are being written to disk in the callback
> order, but would like to fix this if so (since we want the writes made
> in the order of our requests).
The drive could complete requests in any order it sees fit, within the
depth level of the drive. If write caching is enabled, it can reorder
writes easily.
--
Jens Axboe
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists