[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5609BF00.5000502@codeaurora.org>
Date: Mon, 28 Sep 2015 15:28:16 -0700
From: Nikhilesh Reddy <reddyn@...eaurora.org>
To: Theodore Ts'o <tytso@....edu>
CC: linux-ext4@...r.kernel.org
Subject: Re: Using Cache barriers in lieu of REQ_FLUSH | REQ_FUA for emmc
5.1 (jdec spec JESD84-B51)
On Sat 19 Sep 2015 08:42:48 PM PDT, Theodore Ts'o wrote:
> On Tue, Sep 15, 2015 at 04:17:46PM -0700, Nikhilesh Reddy wrote:
>>
>> The eMMC 5.1 spec defines cache "barrier" capability of the eMMC device as
>> defined in JESD84-B51
>>
>> I was wondering if there were any downsides to replacing the
>> WRITE_FLUSH_FUA with the cache barrier?
>>
>> I understand that REQ_FLUSH is used to ensure that the current cache be
>> flushed to prevent any reordering but I dont seem to be clear on why
>> REQ_FUA is used.
>> Can someone please help me understand this part?
>>
>> I know there there was a big decision in 2010
>> https://lwn.net/Articles/400541/
>> and http://lwn.net/Articles/399148/
>> to remove the software based barrier support... but with the hardware
>> supporting "barriers" is there a downside to using them to replace the
>> flushes?
>
> OK, so a couple of things here.
>
> There is queuing happening at two different layers in the system;
> once at the block device layer, and one at the storage device layer.
> (Possibly more if you have a hardware RAID card, etc., but for this
> discussion, what's important is the queuing which is happening inside
> the kernel, and that which is happening below the kernel.
>
> The transition in 2010 is referring to how we handle barriers at the
> block device layer, and was inspired by the fact that at that time,
> the vast majority of the storage devices only supported "cache flush"
> at the storage layer, and a few devices would support FUA (Force Unit
> Attention) requests. But it can support devices which have a true
> cache barrier function.
>
> So when we say REQ_FLUSH, what we mean is that the writes are flushed
> from the block layer command queues to the storage device, and that
> subsequent writes will not be reordered before the flush. Since most
> devices don't support a cache barrier command, this is implemented in
> practice as a FLUSH CACHE, but if the device supports cache barrier
> command, that would be sufficient.
>
> The FUA write command is the command that actually has temporal
> meaning; the device is not supported to signal completion until that
> particular write has been committed to stable store. And if you
> combine that with a flush command, as in WRITE_FLUSH_FUA, then that
> implies a cache barrier, followed by a write that should not return
> until write (FUA), and all preceeding writes, have been committed to
> stable store (implied by the cache barrier).
>
> For devices that support a cache barrier, a REQ_FLUSH can be
> implemented using a cache barrier. If the storage device does not
> support a cache barrier, the much stronger FLUSH CACHE command will
> also work, and in practice, that's what gets used in for most storage
> devices today.
>
> For devices that don't support a FUA write, this can be simulated
> using the (overly strong) combination of a write followed by a FLUSH
> CACHE command. (Note, due to regressions caused by buggy hardware,
> the libata driver does not enable FUA by default. Interestingly,
> apparently Windows 2012 and newer no longer tries to use FUA either;
> maybe Microsoft has run into consumer-grade storage devices with
> crappy firmware? That being said, if you are using SATA drives which
> in a JBOD which is has a SAS expander, you *are* using FUA --- but
> presumably people who are doing this are at bigger shops who can do
> proper HDD validation and can lean on their storage vendors to make
> sure any firmware bugs they find get fixed.)
>
> So for ext4, when we do a journal commit, first we write the journal
> blocks, then a REQ_FLUSH, and then we FUA write the commit block ---
> which for commodity SATA drives, gets translated to write the journal
> blocks, FLUSH CACHE, write the commit block, FLUSH CACHE.
>
> If your storage device has support for a barrier command and FUA, then
> this could also be translated to write the journal blocks, CACHE
> BARRIER, FUA WRITE the commit block.
>
> And of course if you don't have FUA support, but you do have the
> barrier command, then this could also get translated to write the
> journal blocks, CACHE BARRIER, write the commit block, FLUSH CACHE.
>
> All of these scenarios should work just fine.
>
> Hope this helps,
>
> - Ted
Thanks so much !!
This was really helpful!
--
Thanks
Nikhilesh Reddy
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum,
a Linux Foundation Collaborative Project.
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists