linux-ext4 - Re: ext4 xfstest regression due to ext4_es_lookup

Message-ID: <20130224032156.GA5840@gmail.com>
Date:	Sun, 24 Feb 2013 11:21:56 +0800
From:	Zheng Liu <gnehzuil.liu@...il.com>
To:	Theodore Ts'o <tytso@....edu>
Cc:	Dmitry Monakhov <dmonakhov@...nvz.org>, linux-ext4@...r.kernel.org
Subject: Re: ext4 xfstest regression due to ext4_es_lookup_extent

On Sat, Feb 23, 2013 at 07:14:47PM -0500, Theodore Ts'o wrote:
> On Sat, Feb 23, 2013 at 06:00:35PM +0800, Zheng Liu wrote:
> > > Actually I think that the regression in test 269 that you found recently
> > > is caused by a similar issue, and the commit which you found by bisecting
> > > (the one which allows migration between indirect<->extent based inodes)
> > > simply helps to spot the real issue in the es_caching code.
> > 
> > I will revise this patch.  IIRC, we forgot to update the status tree after
> > an inode is migrated from extent-based to indirect-based.  Thanks for
> > pointing this out.
> 
> Can you do this as a new commit?  I've already bumped the master
> pointer up since I finished running xfstests and I'm seeing no
> regressions (at least with my set of xfstests).  So given that
> everything has been tested and things look pretty stable, I pushed up
> the master branch.

Yes, I will prepare it as a new commit.  But I am not entirely sure that
the root cause is es_caching.

> 
> I did remember that you were still working on this regression, but
> since we're already half-way through the merge window, I really want
> to make sure things are ready for a merge request to Linus.  (Which I
> probably will be sending to Linus by Monday or Tuesday.)
> 
> I do plan to collect bug fixes and any remaining regression fixes to
> push to Linus by -rc2 or -rc3, so don't rush fixing up defrag
> functionality.

For the defrag regression, I have two choices to fix it.  One is a quick
but sub-optimal fix: invalidate all written/unwritten/hole extents in the
status tree.  But it will decrease performance because we need to load
the extents into the status tree again.  Further, one thing we need to
keep in mind is that some extents are both unwritten and delayed, which
makes things complicated.  But for now we don't need to worry about that
because a bigalloc file system doesn't support defrag.  So we are safe.
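
To illustrate the invalidation approach, here is a minimal user-space
sketch.  It is a toy model, not ext4 code: the toy_* names, the flat
array, and the whole-entry drop (no splitting of partially covered
entries) are all assumptions made for the example.  It only shows why
invalidation is simple but costs a later reload: after the range is
dropped, a lookup misses and the caller has to go back to the on-disk
extent tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names are real ext4 symbols. */
enum toy_status { TOY_WRITTEN, TOY_UNWRITTEN, TOY_HOLE };

struct toy_extent {
    uint32_t lblk;           /* first logical block */
    uint32_t len;            /* number of blocks */
    uint64_t pblk;           /* first physical block (unused for holes) */
    enum toy_status status;
};

#define TOY_MAX 16

struct toy_cache {
    struct toy_extent ext[TOY_MAX];
    size_t count;
};

/* Add an entry; returns 0 on success, -1 if the cache is full. */
int toy_cache_insert(struct toy_cache *c, struct toy_extent e)
{
    assert(e.len > 0);
    if (c->count == TOY_MAX)
        return -1;
    c->ext[c->count++] = e;
    return 0;
}

/* Return the cached entry covering lblk, or NULL on a miss. */
const struct toy_extent *toy_cache_lookup(const struct toy_cache *c,
                                          uint32_t lblk)
{
    for (size_t i = 0; i < c->count; i++) {
        const struct toy_extent *e = &c->ext[i];
        if (lblk >= e->lblk && lblk - e->lblk < e->len)
            return e;
    }
    return NULL;
}

/*
 * Drop every entry that overlaps [start, start + len).
 * Returns the number of entries dropped.
 */
size_t toy_cache_invalidate(struct toy_cache *c, uint32_t start,
                            uint32_t len)
{
    size_t kept = 0, dropped = 0;
    uint64_t end = (uint64_t)start + len;

    for (size_t i = 0; i < c->count; i++) {
        struct toy_extent *e = &c->ext[i];
        uint64_t e_end = (uint64_t)e->lblk + e->len;

        if (e->lblk < end && start < e_end)
            dropped++;                /* overlaps: forget it */
        else
            c->ext[kept++] = *e;      /* untouched: compact in place */
    }
    c->count = kept;
    return dropped;
}
```

In this model a caller would invalidate the defragmented range and treat
a NULL from toy_cache_lookup() as "reload from disk", which is the
performance cost mentioned above.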

The other is to update all extents in the status tree.  I think this is
the better choice and I think Dmitry is working on it.  Dmitry, I haven't
got your response.  Could you confirm it?
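
As a rough illustration of the update approach, the sketch below rewrites
cached entries in place instead of dropping them.  Again this is a toy
user-space model under stated assumptions, not ext4 code:
toy_cache_remap() and struct toy_extent are hypothetical, the cache is a
flat array, and entries that only partially overlap the range are
rejected rather than split.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model; these are not real ext4 symbols. */
struct toy_extent {
    uint32_t lblk;   /* first logical block */
    uint32_t len;    /* number of blocks */
    uint64_t pblk;   /* first physical block */
};

/*
 * Point the cached entries inside [start, start + len) at a new physical
 * run beginning at new_pblk.  Entries that straddle the range boundary
 * would need splitting, which this sketch does not attempt: it returns -1
 * and leaves the cache untouched.  Otherwise it returns the number of
 * entries updated.
 */
int toy_cache_remap(struct toy_extent *ext, size_t count,
                    uint32_t start, uint32_t len, uint64_t new_pblk)
{
    uint64_t end = (uint64_t)start + len;

    /* First pass: refuse partial overlaps so the update is all-or-nothing. */
    for (size_t i = 0; i < count; i++) {
        uint64_t e_end = (uint64_t)ext[i].lblk + ext[i].len;
        int overlaps = ext[i].lblk < end && start < e_end;
        int inside = ext[i].lblk >= start && e_end <= end;

        if (overlaps && !inside)
            return -1;
    }

    /* Second pass: rewrite the physical start of each contained entry. */
    int updated = 0;
    for (size_t i = 0; i < count; i++) {
        uint64_t e_end = (uint64_t)ext[i].lblk + ext[i].len;

        if (ext[i].lblk >= start && e_end <= end) {
            assert(ext[i].len > 0);
            ext[i].pblk = new_pblk + (ext[i].lblk - start);
            updated++;
        }
    }
    return updated;
}
```

Compared with invalidation, later lookups in this model still hit the
cache, at the price of more bookkeeping at update time.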

TBH, we never use the migration and defrag features in our production
systems.  I admit that I have paid almost no attention to them.  It's my
fault.

Here is my plan for the next steps.

1. Try to prepare a patch that invalidates all cached extents in the
status tree to fix the defrag regression, and wait for Dmitry's patch.

2. Revise the migration patch.

3. Submit the remaining patches for the extent status tree that try to
convert unwritten extents in the end_io callback function and remove a
bogus wait in ext4_ind_direct_IO.  The patches are already done but still
need to be tested.

4. get_block_t and *map_blocks cleanup.

5. Extent-level locking.

Any comments?

Thanks,
                                                - Zheng