Message-ID: <alpine.LRH.2.11.1912052358380.11561@mx.ewheeler.net>
Date: Fri, 6 Dec 2019 00:04:02 +0000 (UTC)
From: Eric Wheeler <bcache@...ts.ewheeler.net>
To: Coly Li <colyli@...e.de>
cc: kungf <wings.wyang@...il.com>, kent.overstreet@...il.com,
linux-bcache@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] bcache: add REQ_FUA to avoid data lost in writeback mode

On Tue, 3 Dec 2019, Coly Li wrote:
> On 2019/12/3 3:34 AM, Eric Wheeler wrote:
> > On Mon, 2 Dec 2019, Coly Li wrote:
> >> On 2019/12/2 6:24 PM, kungf wrote:
> >>> Data may be lost in the following writeback-mode scenario:
> >>> 1. client writes data1 to bcache
> >>> 2. client calls fdatasync
> >>> 3. bcache flushes the cache set and the backing device;
> >>>    if data1 has not yet been written back to the backing device, it is only guaranteed safe in the cache
> >>> 4. the cache then writes data1 back to the backing device with only REQ_OP_WRITE
> >>> So data1 is not guaranteed to be on non-volatile storage; it may be lost on a power interruption.
> >>>
> >>
> >> Hi,
> >>
> >> Have you encountered this problem in a real workload? With the bcache
> >> journal, I don't see how data could be lost in the scenario you describe.
> >>
> >> Correct me if I am wrong.
> >>
> >> Coly Li
> >
> > If this does become necessary, then we should have a sysfs or superblock
> > flag to disable FUA for those with RAID BBUs.
>
> Hi Eric,
>
> I doubt it is necessary to add the FUA flag to all writeback bios. If a
> power failure happens after dirty data has been written to the backing
> device and the bkey has turned clean, a following read request will
> still go to the cache device, because the LBA is indexed whether the
> bkey is dirty or clean. Unless the bkey is invalidated from the B+tree,
> reads always go to the cache device first in writeback mode. If a power
> failure happens before the cached bkey turns from dirty to clean, the
> only cost is an extra writeback bio flushed from the cache device to the
> backing device with identical data. Compared with setting FUA on all
> writeback bios (which would be really slow), the extra writeback I/O
> after a power failure is more acceptable to me.

I agree. FWIW, I just learned about /sys/block/sdX/queue/write_cache from
Nikos Tsironis <ntsironis@...ikto.com>. Thus, my request for a FUA-bypass
flag isn't necessary anyway, even if you did want a FUA there, because
FUAs are stripped when a blockdev is set to "write through" (that is, when
QUEUE_FLAG_WC is not set).
----------------------------------------------------------------------
This happens in generic_make_request_checks():
	/*
	 * Filter flush bio's early so that make_request based
	 * drivers without flush support don't have to worry
	 * about them.
	 */
	if (op_is_flush(bio->bi_opf) &&
	    !test_bit(QUEUE_FLAG_WC, &q->queue_flags)) {
		bio->bi_opf &= ~(REQ_PREFLUSH | REQ_FUA);
		if (!nr_sectors) {
			status = BLK_STS_OK;
			goto end_io;
		}
	}
----------------------------------------------------------------------
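
To make the effect of that check concrete, here is a minimal userspace
sketch of the same logic. It is not kernel code: the MODEL_* flag values,
struct model_queue, struct model_bio and model_filter_flush() are all
hypothetical names made up for illustration, and only the branch structure
is taken from the excerpt above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for REQ_FUA / REQ_PREFLUSH; values are arbitrary. */
#define MODEL_REQ_FUA      (1u << 0)
#define MODEL_REQ_PREFLUSH (1u << 1)

/* Assumption: one bool models QUEUE_FLAG_WC (true == "write back"). */
struct model_queue {
	bool write_cache;
};

/* Assumption: only the flush-related flags and the payload size are modelled. */
struct model_bio {
	unsigned int opf;
	unsigned int nr_sectors;	/* 0 models an empty flush */
};

/*
 * Mirrors the quoted filter: if the queue has no volatile write cache,
 * PREFLUSH and FUA are cleared. Returns true when the bio can be completed
 * immediately (an empty flush), false when it still has to be submitted.
 */
static bool model_filter_flush(const struct model_queue *q,
			       struct model_bio *bio)
{
	assert(q != NULL && bio != NULL);

	bool is_flush =
		(bio->opf & (MODEL_REQ_FUA | MODEL_REQ_PREFLUSH)) != 0;

	if (is_flush && !q->write_cache) {
		bio->opf &= ~(MODEL_REQ_PREFLUSH | MODEL_REQ_FUA);
		if (bio->nr_sectors == 0)
			return true;
	}
	return false;
}
```

In this model, a FUA write sent to a queue with write_cache == false comes
out with the FUA bit cleared, while the same write sent to a queue with
write_cache == true keeps it.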
-Eric
>
> Coly Li
>
> >
> >>> Signed-off-by: kungf <wings.wyang@...il.com>
> >>> ---
> >>> drivers/md/bcache/writeback.c | 2 +-
> >>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
> >>> index 4a40f9eadeaf..e5cecb60569e 100644
> >>> --- a/drivers/md/bcache/writeback.c
> >>> +++ b/drivers/md/bcache/writeback.c
> >>> @@ -357,7 +357,7 @@ static void write_dirty(struct closure *cl)
> >>>  	 */
> >>>  	if (KEY_DIRTY(&w->key)) {
> >>>  		dirty_init(w);
> >>> -		bio_set_op_attrs(&io->bio, REQ_OP_WRITE, 0);
> >>> +		bio_set_op_attrs(&io->bio, REQ_OP_WRITE | REQ_FUA, 0);
> >>>  		io->bio.bi_iter.bi_sector = KEY_START(&w->key);
> >>>  		bio_set_dev(&io->bio, io->dc->bdev);
> >>>  		io->bio.bi_end_io = dirty_endio;
> >>>
> >>
>