Date:	Mon, 23 Mar 2015 13:55:08 -0400
From:	Eric Whitney <enwlinux@...il.com>
To:	Lukáš Czerner <lczerner@...hat.com>
Cc:	Eric Whitney <enwlinux@...il.com>, linux-ext4@...r.kernel.org,
	tytso@....edu
Subject: Re: [PATCH] ext4: fix loss of delalloc extent info in
 ext4_zero_range()

* Lukáš Czerner <lczerner@...hat.com>:
> On Fri, 20 Mar 2015, Eric Whitney wrote:
> 
> > Date: Fri, 20 Mar 2015 19:53:50 -0400
> > From: Eric Whitney <enwlinux@...il.com>
> > To: linux-ext4@...r.kernel.org
> > Cc: tytso@....edu
> > Subject: [PATCH] ext4: fix loss of delalloc extent info in ext4_zero_range()
> > 
> > In ext4_zero_range(), removing a file's entire block range from the
> > extent status tree removes all records of that file's delalloc extents.
> > The delalloc accounting code uses this information, and its loss can
> > then lead to accounting errors and kernel warnings at writeback time and
> > subsequent file system damage.  This is most noticeable on bigalloc
> > file systems where code in ext4_ext_map_blocks() handles cases where
> > delalloc extents share clusters with a newly allocated extent.
> > 
> > Because we're not deleting a block range and are correctly updating the
> > status of its associated extent, there is no need to remove anything
> > from the extent status tree.
> > 
> > When this patch is combined with an unrelated bug fix for
> > ext4_zero_range(), kernel warnings and e2fsck errors reported during
> > xfstests runs on bigalloc filesystems are greatly reduced without
> > introducing regressions on other xfstests-bld test scenarios.
> 
> Ah, this is my bad sorry. I didn't realize that we're actually
> relying on the delayed extent information in the extent status tree
> now.
> 
> However I remember that I've seen some problems when this extent
> removal was not there (see the comment you removed). I am not
> entirely sure anymore what it was all about, but I need to retest
> with your patch.
> 

Hi Lukas:

For a little more context, the unrelated bug fix I referenced is in fact
your pending zero_range fix.  Without it, xfstests-bld runs on a kernel
with this patch applied will report test failures for generic/091 pretty
consistently and for generic/127 occasionally on various test scenarios
(including 4k, ext4, bigalloc, etc.).  With your patch, things look pretty
good to me.  (It's still possible to encounter occasional kernel warnings
related to delalloc reserved space accounting during bigalloc runs, but I
think those are related to another area I'm working on.)

We don't appear to have a record of the problems mentioned in the comment
I removed, and Ted doesn't recall what he might have seen.  Since that
code was introduced a couple of releases ago, my guess is that he was
seeing the xfstests failures described above before you posted your
patch.

Any test results you can share would be very welcome - taken together, these
two patches address a bunch of test failures and kernel warning noise on
bigalloc.  There are developers who want to do some ext4 SMR work leveraging
bigalloc right now, so if we can clear that up it will make things easier for
them.

Thanks,
Eric




> Thanks!
> -Lukas
> 
> > 
> > Signed-off-by: Eric Whitney <enwlinux@...il.com>
> > ---
> >  fs/ext4/extents.c | 13 -------------
> >  1 file changed, 13 deletions(-)
> > 
> > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> > index bed4308..c187cc3 100644
> > --- a/fs/ext4/extents.c
> > +++ b/fs/ext4/extents.c
> > @@ -4847,19 +4847,6 @@ static long ext4_zero_range(struct file *file, loff_t offset,
> >  					     flags, mode);
> >  		if (ret)
> >  			goto out_dio;
> > -		/*
> > -		 * Remove entire range from the extent status tree.
> > -		 *
> > -		 * ext4_es_remove_extent(inode, lblk, max_blocks) is
> > -		 * NOT sufficient.  I'm not sure why this is the case,
> > -		 * but let's be conservative and remove the extent
> > -		 * status tree for the entire inode.  There should be
> > -		 * no outstanding delalloc extents thanks to the
> > -		 * filemap_write_and_wait_range() call above.
> > -		 */
> > -		ret = ext4_es_remove_extent(inode, 0, EXT_MAX_BLOCKS);
> > -		if (ret)
> > -			goto out_dio;
> >  	}
> >  	if (!partial_begin && !partial_end)
> >  		goto out_dio;
> > 
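
As a rough illustration of the failure mode described in the patch
description, here is a minimal user-space sketch.  It is not kernel code:
every toy_* name, the flat array standing in for the extent status tree,
and the block numbers are assumptions made up for this example.  It also
assumes the delalloc extent lies outside the range being zeroed, so it has
not been written back yet.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-ins only; these are NOT the kernel's types or functions. */
enum toy_status { TOY_WRITTEN, TOY_UNWRITTEN, TOY_DELAYED };

struct toy_extent {
	uint32_t lblk;			/* first logical block */
	uint32_t len;			/* length in blocks */
	enum toy_status status;
};

#define TOY_MAX_EXTENTS	16
#define TOY_MAX_BLOCKS	0xffffffffu	/* hypothetical "whole file" length */

struct toy_es_tree {
	struct toy_extent ext[TOY_MAX_EXTENTS];
	int count;
};

static void toy_es_insert(struct toy_es_tree *t, uint32_t lblk,
			  uint32_t len, enum toy_status status)
{
	assert(t->count < TOY_MAX_EXTENTS);
	t->ext[t->count++] = (struct toy_extent){ lblk, len, status };
}

/*
 * Drop every cached extent lying entirely inside [lblk, lblk + len).
 * Partial overlaps are left alone to keep the sketch short.
 */
static void toy_es_remove(struct toy_es_tree *t, uint32_t lblk, uint32_t len)
{
	uint64_t end = (uint64_t)lblk + len;
	int kept = 0;

	for (int i = 0; i < t->count; i++) {
		uint64_t e_end = (uint64_t)t->ext[i].lblk + t->ext[i].len;

		if (!(t->ext[i].lblk >= lblk && e_end <= end))
			t->ext[kept++] = t->ext[i];
	}
	t->count = kept;
}

/* What the (toy) delalloc accounting would see in the cache. */
static uint32_t toy_delayed_blocks(const struct toy_es_tree *t)
{
	uint32_t n = 0;

	for (int i = 0; i < t->count; i++)
		if (t->ext[i].status == TOY_DELAYED)
			n += t->ext[i].len;
	return n;
}

/*
 * Zero blocks 0..9 of a file that also has a delalloc extent at
 * blocks 100..109, and report how many delayed blocks remain cached.
 */
static uint32_t toy_zero_range_demo(int remove_whole_inode)
{
	struct toy_es_tree t = { .count = 0 };

	toy_es_insert(&t, 0, 10, TOY_WRITTEN);		/* range to zero */
	toy_es_insert(&t, 100, 10, TOY_DELAYED);	/* dirty, unwritten */

	if (remove_whole_inode) {
		/* Old behaviour: wipe the cache for the entire inode. */
		toy_es_remove(&t, 0, TOY_MAX_BLOCKS);
		toy_es_insert(&t, 0, 10, TOY_UNWRITTEN);
	} else {
		/* Patched behaviour: only update the zeroed extent. */
		t.ext[0].status = TOY_UNWRITTEN;
	}
	return toy_delayed_blocks(&t);
}
```

Calling toy_zero_range_demo(1) models the whole-inode removal and leaves
no delayed blocks in the toy cache, while toy_zero_range_demo(0) models
the patched path and still reports the 10 delayed blocks.  The real
extent status tree is considerably more involved; the sketch only shows
why a whole-file removal discards information about extents outside the
zeroed range.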