Message-Id: <20191203193956.fe0ab4c2ec7eba4e55a3de89@gmail.com>
Date:   Tue, 3 Dec 2019 19:39:56 +0800
From:   kungf <wings.wyang@...il.com>
To:     Coly Li <colyli@...e.de>
Cc:     kent.overstreet@...il.com, linux-bcache@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] bcache: add REQ_FUA to avoid data lost in writeback
 mode

On Mon, 2 Dec 2019 19:08:59 +0800
Coly Li <colyli@...e.de> wrote:

> On 2019/12/2 6:24 pm, kungf wrote:
> > Data may be lost in the following scenario in writeback mode:
> > 1. The client writes data1 to bcache.
> > 2. The client calls fdatasync.
> > 3. bcache flushes the cache set and the backing device.
> >    If data1 has not yet been written back to the backing device, it is only guaranteed safe in the cache.
> > 4. The cache then writes data1 back to the backing device with only REQ_OP_WRITE.
> > So data1 is not guaranteed to be in non-volatile storage on the backing device, and it may be lost on power interruption.
> > 
> 
> Hi,
> 
> Do you encounter such a problem in a real workload? With the bcache journal, I
> don't see the possibility of data loss with your description.
> 
> Correct me if I am wrong.
> 
> Coly Li

Hi Coly,

Sorry for the confusion. As far as I know, the write_dirty function writes dirty data to the backing device without FUA, and write_dirty_finish then marks the dirty key clean,
which means the data indexed by that key will not be written back again. Am I wrong?
I can only find that the backing device is flushed when bcache gets a PREFLUSH bio. Is there any other place, such as the journal, where it is flushed?
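
A toy model of the concern above, as a minimal sketch only. It assumes the backing device has a volatile write cache that is drained only by a flush or bypassed by FUA; the ToyDisk class and writeback_then_crash function are hypothetical names for illustration and are not bcache code.

```python
class ToyDisk:
    """Toy block device with a volatile write cache (illustrative model only)."""

    def __init__(self):
        self.volatile = {}  # sector -> data still in the device write cache
        self.durable = {}   # sector -> data on non-volatile media

    def write(self, sector, data, fua=False):
        if fua:
            # FUA: completion implies the data is on stable media.
            self.volatile.pop(sector, None)
            self.durable[sector] = data
        else:
            self.volatile[sector] = data

    def flush(self):
        # A flush drains the volatile cache to stable media.
        self.durable.update(self.volatile)
        self.volatile.clear()

    def power_loss(self):
        # Anything still in the volatile cache is lost.
        self.volatile.clear()

    def read(self, sector):
        return self.volatile.get(sector, self.durable.get(sector))


def writeback_then_crash(fua):
    """Write one dirty sector back, mark its key clean, then lose power."""
    backing = ToyDisk()
    dirty_keys = {16: b"data1"}  # dirty data indexed by the cache
    for sector, data in list(dirty_keys.items()):
        backing.write(sector, data, fua=fua)
        del dirty_keys[sector]   # key is clean now: no second writeback
    backing.power_loss()
    return backing.read(16)


print(writeback_then_crash(fua=False))  # None
print(writeback_then_crash(fua=True))   # b'data1'
```

In this model the plain write is lost because nothing flushes the backing device between the writeback and the power loss, while the FUA write survives.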

I ran a test: write to bcache with dd, then detach it, while running blktrace on the cache and backing devices at the same time.
1. Stop writeback
# echo 0 > /sys/block/bcache0/bcache/writeback_running
2. Write data with an fdatasync
# dd if=/dev/zero of=/dev/bcache0 bs=16k count=1 oflag=direct
3. Detach and trigger writeback
# echo b1f40ca5-37a3-4852-9abf-6abed96d71db > /sys/block/bcache0/bcache/detach
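
For reference, the client side of step 2 (a write followed by fdatasync) could be sketched as below. This is an assumption-laden illustration: it does a buffered write plus fdatasync rather than the O_DIRECT write that dd performs above, write_and_sync is a hypothetical helper, and the demo targets a temporary file instead of /dev/bcache0.

```python
import os
import tempfile


def write_and_sync(path, payload):
    """Write payload to path, then ask the kernel to make the data durable."""
    # os.fdatasync is missing on some platforms; fall back to os.fsync there.
    sync = getattr(os, "fdatasync", os.fsync)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        written = 0
        while written < len(payload):
            written += os.write(fd, payload[written:])
        # Returns only after the data has been reported as stable.
        sync(fd)
    finally:
        os.close(fd)
    return written


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "data1.bin")  # stand-in for the bcache device
    print(write_and_sync(target, bytes(16 * 1024)))  # 16384
```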

The text below is the blkparse output.
From the cache blktrace below, we can see 16k of data written to the cache set, followed by a flush with op FWFSM (PREFLUSH|WRITE|FUA|SYNC|META).
```
  8,160 33        1     0.000000000 222844  A   W 630609920 + 32 <- (8,167) 1464320
  8,167 33        2     0.000000478 222844  Q   W 630609920 + 32 [dd]
  8,167 33        3     0.000006167 222844  G   W 630609920 + 32 [dd]
  8,167 33        5     0.000011385 222844  I   W 630609920 + 32 [dd]
  8,167 33        6     0.000023890   948  D   W 630609920 + 32 [kworker/33:1H]
  8,167 33        7     0.000111203     0  C   W 630609920 + 32 [0]
  8,160 34        1     0.000167029 215616  A FWFSM 629153808 + 8 <- (8,167) 8208
  8,167 34        2     0.000167490 215616  Q FWFSM 629153808 + 8 [kworker/34:2]
  8,167 34        3     0.000169061 215616  G FWFSM 629153808 + 8 [kworker/34:2]
  8,167 34        4     0.000301308   949  D WFSM 629153808 + 8 [kworker/34:1H]
  8,167 34        5     0.000348832     0  C WFSM 629153808 + 8 [0]
  8,167 34        6     0.000349612     0  C WFSM 629153808 [0]
```
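
The RWBS strings in the trace above can be expanded mechanically. The sketch below is a small decoder, assuming the usual blktrace convention that a leading F before the operation letter means PREFLUSH and an F after it means FUA; decode_rwbs is a hypothetical helper and only covers the common letters.

```python
OPS = {"W": "WRITE", "R": "READ", "D": "DISCARD", "N": "NONE", "F": "FLUSH"}
MODIFIERS = {"F": "FUA", "A": "RAHEAD", "S": "SYNC", "M": "META"}


def decode_rwbs(rwbs):
    """Expand a blkparse RWBS field such as 'FWFSM' into flag names."""
    if not rwbs:
        raise ValueError("empty RWBS field")
    names = []
    i = 0
    # An 'F' directly in front of the operation letter is a preflush.
    if rwbs[0] == "F" and len(rwbs) > 1 and rwbs[1] in "WRDN":
        names.append("PREFLUSH")
        i = 1
    if rwbs[i] not in OPS:
        raise ValueError(f"unknown operation letter {rwbs[i]!r}")
    names.append(OPS[rwbs[i]])
    # Everything after the operation letter is a modifier flag.
    for ch in rwbs[i + 1:]:
        if ch not in MODIFIERS:
            raise ValueError(f"unknown modifier letter {ch!r}")
        names.append(MODIFIERS[ch])
    return "|".join(names)


print(decode_rwbs("FWFSM"))  # PREFLUSH|WRITE|FUA|SYNC|META
print(decode_rwbs("WFSM"))   # WRITE|FUA|SYNC|META
print(decode_rwbs("W"))      # WRITE
```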

From the backing blktrace below, the backing device first gets a flush op FWS (PREFLUSH|WRITE|SYNC) because we stopped writeback, then gets a W op after the detach:
the 16k of data is written back to the backing device, and after this the backing device never gets a flush op. This means the 16k of data we wrote is not safe
on the backing device, even though we wrote it with fdatasync.
```
  8,144 33        1     0.000000000 222844  Q WSM 8 + 8 [dd]
  8,144 33        2     0.000016609 222844  G WSM 8 + 8 [dd]
  8,144 33        5     0.000020710 222844  I WSM 8 + 8 [dd]
  8,144 33        6     0.000031967   948  D WSM 8 + 8 [kworker/33:1H]
  8,144 33        7     0.000152945 88631  C  WS 16 + 32 [0]
  8,144 34        1     0.000186127 215616  Q FWS [kworker/34:2]
  8,144 34        2     0.000187006 215616  G FWS [kworker/34:2]
  8,144 33        8     0.000326761     0  C WSM 8 + 8 [0]
  8,144 34        3     0.020195027     0  C  WS 16 [0]
  8,144 34        4     0.020195904     0  C FWS 16 [0]
  8,144 23        1    19.415130395 215884  Q   W 16 + 32 [kworker/23:2]
  8,144 23        2    19.415132072 215884  G   W 16 + 32 [kworker/23:2]
  8,144 23        3    19.415133134 215884  I   W 16 + 32 [kworker/23:2]
  8,144 23        4    19.415137776  1215  D   W 16 + 32 [kworker/23:1H]
  8,144 23        5    19.416607260     0  C   W 16 + 32 [0]
  8,144 24        1    19.416640754 222593  Q WSM 8 + 8 [bcache_writebac]
  8,144 24        2    19.416642698 222593  G WSM 8 + 8 [bcache_writebac]
  8,144 24        3    19.416643505 222593  I WSM 8 + 8 [bcache_writebac]
  8,144 24        4    19.416650589  1107  D WSM 8 + 8 [kworker/24:1H]
  8,144 24        5    19.416865258     0  C WSM 8 + 8 [0]
  8,144 24        6    19.416871350 221889  Q WSM 8 + 8 [kworker/24:1]
  8,144 24        7    19.416872201 221889  G WSM 8 + 8 [kworker/24:1]
  8,144 24        8    19.416872542 221889  I WSM 8 + 8 [kworker/24:1]
  8,144 24        9    19.416875458  1107  D WSM 8 + 8 [kworker/24:1H]
  8,144 24       10    19.417076935     0  C WSM 8 + 8 [0]
```
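
The same check can be done by script rather than by eye. The sketch below scans blkparse lines and reports completed non-FUA writes that no later flush completion covers. It assumes the default blkparse column layout shown above and makes a simplification: any completed preflush or flush is treated as covering every earlier completed write. The helper names are hypothetical, and the sample lines are copied from the backing trace above.

```python
def split_rwbs(rwbs):
    """Return (has_preflush, op_letter, modifier_letters) for an RWBS field."""
    preflush = len(rwbs) > 1 and rwbs[0] == "F" and rwbs[1] in "WRDN"
    body = rwbs[1:] if preflush else rwbs
    return preflush, body[0], body[1:]


def writes_without_later_flush(lines):
    """List (timestamp, sector, nsectors) of completed writes never flushed."""
    pending = []
    for line in lines:
        fields = line.split()
        # Only completion events ('C' in the action column) matter here.
        if len(fields) < 7 or fields[5] != "C":
            continue
        preflush, op, modifiers = split_rwbs(fields[6])
        if preflush or op == "F":
            pending.clear()  # simplification: a flush covers all earlier writes
            continue
        is_data_write = op == "W" and len(fields) >= 10 and fields[8] == "+"
        if is_data_write and "F" not in modifiers:  # FUA writes are durable
            pending.append((fields[3], int(fields[7]), int(fields[9])))
    return pending


sample = [
    "8,144 34        4     0.020195904     0  C FWS 16 [0]",
    "8,144 23        1    19.415130395 215884  Q   W 16 + 32 [kworker/23:2]",
    "8,144 23        5    19.416607260     0  C   W 16 + 32 [0]",
    "8,144 24        5    19.416865258     0  C WSM 8 + 8 [0]",
]
for entry in writes_without_later_flush(sample):
    print(entry)
```

On these sample lines the 32-sector write at sector 16 is reported as never followed by a flush, which matches the observation above.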
> 
> > Signed-off-by: kungf <wings.wyang@...il.com>
> > ---
> >  drivers/md/bcache/writeback.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
> > index 4a40f9eadeaf..e5cecb60569e 100644
> > --- a/drivers/md/bcache/writeback.c
> > +++ b/drivers/md/bcache/writeback.c
> > @@ -357,7 +357,7 @@ static void write_dirty(struct closure *cl)
> >  	 */
> >  	if (KEY_DIRTY(&w->key)) {
> >  		dirty_init(w);
> > -		bio_set_op_attrs(&io->bio, REQ_OP_WRITE, 0);
> > +		bio_set_op_attrs(&io->bio, REQ_OP_WRITE | REQ_FUA, 0);
> >  		io->bio.bi_iter.bi_sector = KEY_START(&w->key);
> >  		bio_set_dev(&io->bio, io->dc->bdev);
> >  		io->bio.bi_end_io	= dirty_endio;
> > 
> 
-- 
kungf <wings.wyang@...il.com>
