Message-ID: <20150323231539.GA4135@wallace>
Date:	Mon, 23 Mar 2015 19:15:39 -0400
From:	Eric Whitney <enwlinux@...il.com>
To:	Lukáš Czerner <lczerner@...hat.com>
Cc:	Eric Whitney <enwlinux@...il.com>, linux-ext4@...r.kernel.org,
	tytso@....edu
Subject: Re: [PATCH] ext4: don't consume reserved space when allocating
 partial cluster

* Lukáš Czerner <lczerner@...hat.com>:
> On Mon, 16 Mar 2015, Eric Whitney wrote:
> 
> > Date: Mon, 16 Mar 2015 21:20:09 -0400
> > From: Eric Whitney <enwlinux@...il.com>
> > To: linux-ext4@...r.kernel.org
> > Cc: tytso@....edu
> > Subject: [PATCH] ext4: don't consume reserved space when allocating partial
> >     cluster
> > 
> > When xfstests' auto group is run on a bigalloc filesystem with a
> > 4.0-rc3 kernel, e2fsck failures and kernel warnings occur for some
> > tests. e2fsck reports incorrect iblocks values, and the warnings
> > indicate that the space reserved by delayed allocation is being
> > overdrawn at allocation time.
> > 
> > Some of these errors occur because the reserved space is incorrectly
> > decreased by one cluster when ext4_ext_map_blocks satisfies an
> > allocation request by using an unused portion of a previously allocated
> > cluster.  Because a cluster's worth of reserved space was already
> > removed when it was first allocated, it should not be removed again.
> 
> Hi Eric,
> 
> I am not sure I understand. What do you mean by saying that the
> space was already removed when it was first allocated ?

Hi Lukas:

I'm sorry that was confusing - I didn't get the terminology quite right,
given the usage in the code.  What I'm discussing in that sentence is
the space reserved for delayed allocation.  Instead of "removed", I should
have said "released".  If we're mapping from an existing cluster, at some
point in the past that cluster was allocated, and at that time the space
reservation for that cluster would have been released.  So, we ought not
to be releasing its space again.

> 
> From my point of view the ext4_da_update_reserve_space() call is ok,
> because we're going to use blocks from already allocated cluster, so
> we do not want to account for quota in this case, because that has
> already been done when the cluster was allocated. The rest is just
> updating reservations and the dirty clusters counter which needs to
> be done in any case. But I might be actually missing something, am I
> ?

I agree that we don't want to account for quota, as that should have been
done in the past when the cluster was first allocated.  I think we don't
want to update the reservations or the dirty clusters counter because that
should also have been taken into account at the same time in the past.  If
we update them again, decreasing them once more for the cluster we're currently
processing, we'll be double accounting for the space.
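
To make the double accounting concrete, here is a toy model in C. It is not
kernel code: the toy_* names are hypothetical, and the single counter is a
simplification of the per-inode and per-filesystem counters that
ext4_da_update_reserve_space() actually updates. It only shows that one
reservation followed by two releases overdraws the reservation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code: clusters reserved for delayed allocation. */
struct toy_reservation {
	int reserved_clusters;	/* reserved but not yet allocated */
	int overdraw_warnings;	/* releases that exceeded the reservation */
};

/* Reserve space at write begin time for a cluster not counted before. */
static void toy_reserve(struct toy_reservation *r, int clusters)
{
	r->reserved_clusters += clusters;
}

/* Release reserved space; record an overdraw instead of going negative. */
static void toy_release(struct toy_reservation *r, int clusters)
{
	if (clusters > r->reserved_clusters) {
		r->overdraw_warnings++;
		clusters = r->reserved_clusters;
	}
	r->reserved_clusters -= clusters;
}

/*
 * Two delayed writes land in different blocks of the same cluster, so
 * only one reservation is taken.  The first mapping allocates the
 * cluster and releases the reservation; the second mapping uses the
 * existing cluster.  Returns the number of overdraws observed.
 */
static int toy_two_writes_one_cluster(bool release_on_map_from_cluster)
{
	struct toy_reservation r = { 0, 0 };

	toy_reserve(&r, 1);		/* write begin, first block */
	/* write begin, second block: cluster already counted, no reserve */
	toy_release(&r, 1);		/* cluster is allocated */
	if (release_on_map_from_cluster)
		toy_release(&r, 1);	/* mapping from the existing cluster */
	return r.overdraw_warnings;
}
```

Calling toy_two_writes_one_cluster(true) models the behavior before the
patch, and toy_two_writes_one_cluster(false) models the behavior after it.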

The code in ext4_da_map_blocks() that runs at write begin time and increases
the amount of reserved space only does so when a cluster has not been
previously allocated or already accounted for as part of a delalloc extent
recorded in the status tree.  I think it should be accurately reflecting the
number of clusters we'll eventually need to allocate for data, so there's no
room for double counting when mapping from an existing cluster in
ext4_ext_map_blocks().
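
As a rough sketch of that write begin rule (again a toy model with
hypothetical toy_* names, not the logic of ext4_da_map_blocks() itself, and
with an assumed cluster size of 16 blocks), a reservation is taken only for
a cluster that is neither allocated nor already covered by a delalloc
extent:

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_CLUSTER_BITS 4	/* assumption: 16 blocks per cluster */
#define TOY_NCLUSTERS    8	/* assumption: tiny file, 8 clusters */

/* Toy model, not kernel code: per-cluster state for one file. */
struct toy_inode {
	bool allocated[TOY_NCLUSTERS];	/* cluster already allocated */
	bool delalloc[TOY_NCLUSTERS];	/* cluster already counted as delalloc */
	int reserved_clusters;		/* clusters still to be allocated */
};

/* Map a logical block number to the cluster that contains it. */
static unsigned int toy_block_to_cluster(unsigned int lblk)
{
	return lblk >> TOY_CLUSTER_BITS;
}

/*
 * Write begin for one block.  Reserve a cluster only if it has not
 * been allocated and has not already been accounted for by an earlier
 * delayed write.  Returns true if a new reservation was taken.
 */
static bool toy_write_begin(struct toy_inode *ino, unsigned int lblk)
{
	unsigned int c = toy_block_to_cluster(lblk);

	assert(c < TOY_NCLUSTERS);
	if (ino->allocated[c] || ino->delalloc[c])
		return false;
	ino->delalloc[c] = true;
	ino->reserved_clusters++;
	return true;
}
```

Under this rule reserved_clusters equals the number of clusters that still
need to be allocated, so a later mapping from an already allocated cluster
has no reservation of its own to release.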

If I'm reading the delalloc accounting code correctly, a few more patches
will likely be required to remove some of the code immediately following
"if (!map_from_cluster)" and a chunk in ext4_ext_handle_unwritten_extents().

Thanks,
Eric


> 
> Thanks!
> -Lukas
> 
> > 
> > This patch appears to correct the e2fsck failure reported for
> > generic/232 and the kernel warnings produced by ext4/001, generic/009,
> > and generic/033.  Failures and warnings for some other tests remain to
> > be addressed.
> > 
> > Signed-off-by: Eric Whitney <enwlinux@...il.com>
> > ---
> >  fs/ext4/extents.c | 14 +-------------
> >  1 file changed, 1 insertion(+), 13 deletions(-)
> > 
> > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> > index bed4308..554190e 100644
> > --- a/fs/ext4/extents.c
> > +++ b/fs/ext4/extents.c
> > @@ -4535,19 +4535,7 @@ got_allocated_blocks:
> >  		 */
> >  		reserved_clusters = get_reserved_cluster_alloc(inode,
> >  						map->m_lblk, allocated);
> > -		if (map_from_cluster) {
> > -			if (reserved_clusters) {
> > -				/*
> > -				 * We have clusters reserved for this range.
> > -				 * But since we are not doing actual allocation
> > -				 * and are simply using blocks from previously
> > -				 * allocated cluster, we should release the
> > -				 * reservation and not claim quota.
> > -				 */
> > -				ext4_da_update_reserve_space(inode,
> > -						reserved_clusters, 0);
> > -			}
> > -		} else {
> > +		if (!map_from_cluster) {
> >  			BUG_ON(allocated_clusters < reserved_clusters);
> >  			if (reserved_clusters < allocated_clusters) {
> >  				struct ext4_inode_info *ei = EXT4_I(inode);
> > 