Message-ID: <49D1206E.7090809@garzik.org>
Date: Mon, 30 Mar 2009 15:41:34 -0400
From: Jeff Garzik <jeff@...zik.org>
To: Rik van Riel <riel@...hat.com>
CC: Linus Torvalds <torvalds@...ux-foundation.org>,
Ric Wheeler <rwheeler@...hat.com>,
"Andreas T.Auer" <andreas.t.auer_lkml_73537@...us.ath.cx>,
Alan Cox <alan@...rguk.ukuu.org.uk>,
Theodore Tso <tytso@....edu>, Mark Lord <lkml@....ca>,
Stefan Richter <stefanr@...6.in-berlin.de>,
Matthew Garrett <mjg59@...f.ucam.org>,
Andrew Morton <akpm@...ux-foundation.org>,
David Rees <drees76@...il.com>, Jesper Krogh <jesper@...gh.cc>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Linux 2.6.29

Rik van Riel wrote:
> Linus Torvalds wrote:
>> And my point is, IT MAKES SENSE to just do the elevator barrier,
>> _without_ the drive command.
>
> No argument there. I have seen NCQ starvation on SATA disks,
> with some requests sitting in the drive for seconds, while
> the drive was busy handling hundreds of requests/second
> elsewhere...

If certain requests are hanging out in the drive's wbcache longer than
others, that increases the probability that OS filesystem-required,
elevator-provided ordering becomes skewed once requests are passed to
drive firmware.

The sad, sucky fact is that NCQ starvation implies FLUSH CACHE is more
important than ever, if filesystems want to get ordering correct.
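
A toy model of the ordering problem, as a sketch only. The function name `persist_order`, the lowest-LBA-first commit policy, and the block labels are all assumptions made for illustration; real drive firmware is far less predictable. The sketch shows how a write-back cache can commit a later write ahead of an earlier one unless a flush (standing in for FLUSH CACHE) separates them.

```python
def persist_order(writes, flush_after=None):
    """Return the order in which a toy drive commits cached writes to media.

    writes: list of (label, lba) tuples in the order the OS issued them.
    flush_after: set of labels after which the OS issues a cache flush
                 (a stand-in for FLUSH CACHE); None means no flushes.

    Assumption: the toy firmware may reorder anything still sitting in
    its cache, and here it simply commits the lowest LBA first.
    """
    flush_after = flush_after or set()
    cache, media = [], []
    for label, lba in writes:
        cache.append((lba, label))
        if label in flush_after:
            # Flush: everything cached so far must reach media before
            # any later write is accepted.
            media.extend(name for _, name in sorted(cache))
            cache.clear()
    # Whatever is left drains in firmware-chosen order.
    media.extend(name for _, name in sorted(cache))
    return media


issued = [("journal_block", 900), ("commit_record", 100)]

print(persist_order(issued))
# ['commit_record', 'journal_block']

print(persist_order(issued, flush_after={"journal_block"}))
# ['journal_block', 'commit_record']
```

In the first call the commit record reaches media before the journal block it depends on. The flush in the second call restores the order the filesystem asked for.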

IDEALLY, according to the SATA protocol spec, we could issue up to 32
NCQ commands to a SATA drive, each marked with the "FUA" bit to force
the command to hit permanent media before returning.

In theory, this NCQ+FUA mode gives the drive maximum ability to optimize
parallel in-progress commands, decoupling command completion and command
issue -- while also giving the OS complete control of ordering by virtue
of emptying the SATA tagged command queue.
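
A minimal sketch of what one NCQ write with FUA set might look like at the register level. The helper `build_ncq_write` and the dictionary keys are hypothetical. The field layout (opcode 0x61 for WRITE FPDMA QUEUED, tag in bits 7:3 of the count field, sector count carried in the feature field, FUA as bit 7 of the device field) reflects a reading of the ATA command set and should be checked against the spec before being relied on.

```python
ATA_CMD_FPDMA_WRITE = 0x61  # WRITE FPDMA QUEUED opcode
ATA_DEV_LBA = 1 << 6        # device field: LBA addressing
ATA_DEV_FUA = 1 << 7        # device field: Force Unit Access for FPDMA commands
NCQ_MAX_TAGS = 32           # queue depth limit mentioned in the text


def build_ncq_write(tag, lba, n_sectors, fua=True):
    """Build the taskfile fields for one queued write (illustrative only)."""
    if not 0 <= tag < NCQ_MAX_TAGS:
        raise ValueError("NCQ tag must be in 0..31")
    if not 0 <= lba < (1 << 48):
        raise ValueError("LBA must fit in 48 bits")
    if not 1 <= n_sectors <= 0xFFFF:
        raise ValueError("sector count must be in 1..65535 for this sketch")
    return {
        "command": ATA_CMD_FPDMA_WRITE,
        # For FPDMA commands the sector count travels in the feature field.
        "feature": n_sectors & 0xFF,
        "hob_feature": (n_sectors >> 8) & 0xFF,
        # The count field carries the queue tag in bits 7:3.
        "nsect": tag << 3,
        # 48-bit LBA, least significant byte first.
        "lba": [(lba >> (8 * i)) & 0xFF for i in range(6)],
        "device": ATA_DEV_LBA | (ATA_DEV_FUA if fua else 0),
    }


tf = build_ncq_write(tag=5, lba=0x1000, n_sectors=8)
print(hex(tf["device"]), tf["nsect"])
# 0xc0 40
```

With `fua=False` the same command may complete as soon as the data is in the drive's cache, which is the case where a separate FLUSH CACHE is needed to enforce ordering.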

In practice, NCQ+FUA flat out did not work on early drives, and
performance was way under what you would expect for parallel write-thru
command execution. I haven't benchmarked NCQ+FUA in a few years; it
might be worth revisiting.

Jeff