Date:   Thu, 26 Sep 2019 14:46:53 -0700
From:   Jaegeuk Kim <jaegeuk@...nel.org>
To:     Chao Yu <yuchao0@...wei.com>, linux-kernel@...r.kernel.org,
        linux-f2fs-devel@...ts.sourceforge.net
Subject: Re: [f2fs-dev] [PATCH] f2fs: fix to avoid data corruption by
 forbidding SSR overwrite

On 09/26, Eric Biggers wrote:
> On Fri, Aug 16, 2019 at 11:03:34AM +0800, Chao Yu wrote:
> > There is one case that can cause data corruption.
> > 
> > - write 4k to fileA
> > - fsync fileA; the 4k of data is written back to lbaA
> > - write 4k to fileA
> > - kworker flushes 4k to lbaB; the dnode containing lbaB has not been
> > persisted yet
> > - write 4k to fileB
> > - kworker flushes 4k to lbaA due to SSR
> > - SPOR -> the dnode with lbaA will be recovered, but lbaA now contains
> > fileB's data
> > 
> > One solution is to track the block history of every fsynced file and
> > disallow SSR overwrite of blocks newly invalidated in those files.
> > 
> > However, during recovery, regardless of whether a dnode was flushed or
> > fsynced, all previous dnodes up to the last fsynced one in the node chain
> > can be recovered. That means we would need to record every block change
> > in flushed dnodes, which would be costly, so let's just use a simple fix:
> > forbid SSR overwrite directly.
> > 
> > Signed-off-by: Chao Yu <yuchao0@...wei.com>
> > ---
> >  fs/f2fs/segment.c | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> > index 9d9d9a050d59..69b3b553ee6b 100644
> > --- a/fs/f2fs/segment.c
> > +++ b/fs/f2fs/segment.c
> > @@ -2205,9 +2205,11 @@ static void update_sit_entry(struct f2fs_sb_info *sbi, block_t blkaddr, int del)
> >  		if (!f2fs_test_and_set_bit(offset, se->discard_map))
> >  			sbi->discard_blks--;
> >  
> > -		/* don't overwrite by SSR to keep node chain */
> > -		if (IS_NODESEG(se->type) &&
> > -				!is_sbi_flag_set(sbi, SBI_CP_DISABLED)) {
> > +		/*
> > +		 * SSR should never reuse block which is checkpointed
> > +		 * or newly invalidated.
> > +		 */
> > +		if (!is_sbi_flag_set(sbi, SBI_CP_DISABLED)) {
> >  			if (!f2fs_test_and_set_bit(offset, se->ckpt_valid_map))
> >  				se->ckpt_valid_blocks++;
> >  		}
> > -- 
> 
> FYI, this commit caused xfstests generic/064 to start failing:

Yup, I was looking at this.

> 
> $ kvm-xfstests -c f2fs generic/064
> ...
> generic/064 3s ... 	[13:36:37][    5.946293] run fstests generic/064 at 2019-09-26 13:36:37
>  [13:36:41]- output mismatch (see /results/f2fs/results-default/generic/064.out.bad)
>     --- tests/generic/064.out	2019-09-18 04:53:46.000000000 -0700
>     +++ /results/f2fs/results-default/generic/064.out.bad	2019-09-26 13:36:41.533018683 -0700
>     @@ -1,2 +1,3 @@
>      QA output created by 064
>      Extent count after inserts is in range
>     +extents mismatched before = 1 after = 50
>     ...
>     (Run 'diff -u /root/xfstests/tests/generic/064.out /results/f2fs/results-default/generic/064.out.bad'  to see the entire diff)
> Ran: generic/064
> Failures: generic/064
> Failed 1 of 1 tests
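
For readers following the thread, the corruption sequence in the quoted commit message can be modeled with a toy simulation. This is only a sketch under stated assumptions, not f2fs code: `toy_disk`, `ssr_alloc`, `write_block`, and `simulate_spor` are hypothetical names, each block holds a single byte, and the `ckpt_valid` array loosely stands in for the role `ckpt_valid_map` plays in the patch (a block marked there is not reused in place).

```c
#include <assert.h>
#include <stdbool.h>

#define NBLOCKS 4

/* Toy block device: one byte of payload per block (hypothetical model). */
struct toy_disk {
	char data[NBLOCKS];
	bool valid[NBLOCKS];      /* block currently holds live data */
	bool ckpt_valid[NBLOCKS]; /* block must not be reused in place */
};

/* SSR-style allocation: reuse the first block that is neither live nor pinned. */
static int ssr_alloc(const struct toy_disk *d)
{
	for (int i = 0; i < NBLOCKS; i++) {
		if (!d->valid[i] && !d->ckpt_valid[i])
			return i;
	}
	return -1;
}

/* Write one block; when pin is set, the new block is also marked ckpt_valid. */
static void write_block(struct toy_disk *d, int lba, char payload, bool pin)
{
	d->data[lba] = payload;
	d->valid[lba] = true;
	if (pin)
		d->ckpt_valid[lba] = true;
}

/* Replays the scenario and returns fileA's content as seen after SPOR. */
static char simulate_spor(bool forbid_ssr_overwrite)
{
	struct toy_disk d = {0};
	const int lbaA = 0, lbaB = 1;
	int persisted_dnode_a;

	/* write 4k to fileA, then fsync: the dnode pointing at lbaA is durable */
	write_block(&d, lbaA, 'A', forbid_ssr_overwrite);
	persisted_dnode_a = lbaA;

	/* second write to fileA lands in lbaB; the new dnode is NOT persisted */
	write_block(&d, lbaB, 'a', forbid_ssr_overwrite);
	d.valid[lbaA] = false; /* lbaA is now "newly invalidated" */

	/* write to fileB is placed by SSR */
	int lba = ssr_alloc(&d);
	assert(lba >= 0);
	write_block(&d, lba, 'B', forbid_ssr_overwrite);

	/* sudden power-off: recovery follows the last persisted dnode of fileA */
	return d.data[persisted_dnode_a];
}
```

In this model, `simulate_spor(false)` returns `'B'` (fileA recovers fileB's data because SSR reused lbaA), while `simulate_spor(true)` returns `'A'` because the pinned block is skipped and fileB lands in a fresh block instead. The model says nothing about the extent-count regression in generic/064 reported above.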
