Message-ID: <4C0358B1.1050605@uni-konstanz.de>
Date:	Mon, 31 May 2010 08:35:29 +0200
From:	Kay Diederichs <kay.diederichs@...-konstanz.de>
To:	tytso@....edu, "Jayson R. King" <dev@...sonking.com>,
	Stable team <stable@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Greg Kroah-Hartman <gregkh@...e.de>,
	"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
	Dave Chinner <david@...morbit.com>,
	Ext4 Developers List <linux-ext4@...r.kernel.org>
Subject: Re: [PATCH 2.6.27.y 1/3] ext4: Use our own write_cache_pages()

On 30.05.2010 23:25, tytso@....edu wrote:
> On Fri, May 28, 2010 at 08:41:44PM -0500, Jayson R. King wrote:
>>
>> The difference is that, 2.6.27's write_cache_pages() in
>> page-writeback.c still updates wbc->nr_to_write, since the patch
>> which changed that behavior was dropped from .27-rc2 due to the XFS
>> regression it causes on mainline. ext4 appears to want the behavior
>> of write_cache_pages which does not update wbc->nr_to_write. This
>> write_cache_pages_da() does what ext4 wants, without introducing the
>> XFS regression. So I believe it is needed.
>
> Ah, OK.  So I understand the motivation now, and that's a valid
> concern.  The question now is: how much is the goal of the 2.6.27 stable
> branch to fix bugs, and how much is it to get the best possible
> performance, at least with respect to ext4?  It's going to be harder
> and harder to backport fixes to 2.6.27, and I can speak from
> experience that it's very easy to introduce regressions while trying
> to do backports, since sometimes an individual upstream commit can end
> up introducing a regression, and while we do try to document
> regression fixes in later commits, sometimes the documentation isn't
> complete.
>
> I just spent the better part of a day trying to fix up a backport
> series for 2.6.32.  When I was engaged in this particular exercise, it
> turns out a particular commit to fix a quota deadlock introduced a
> regression, and the fix to that introduced yet another, and there were
> three or four patches that all needed to be pulled in at once.  Except
> initially I missed one, and that caused an i_blocks corruption issue
> when using fallocate() that took me several hours and a reverse
> git-bisection to find.  (And this is one set of fixes that will
> probably never be able to go into 2.6.27.y, since these changes also
> interlock with probably a dozen or so quota changes that have also
> gone in over the last couple of kernel releases.)
>
> I'll also add that simply testing using dbench, as you said you used
> in another e-mail message, really isn't good enough to find all
> possible regressions (it wouldn't have found the i_blocks corruption
> problem in my initial set of 2.6.32 ext4 backports patches, for
> example, since dbench only tests a very limited set of fs operations,
> which doesn't include fallocate, or quotas, or mmap for that matter.)
>
> What I would recommend is using the XFSQA (also sometimes known as
> xfstests) test suite to make sure that your changes are sound.  Dbench
> will sometimes find issues, yes, but in my experience it's not the
> best tool.  The fsstress program, which is called in a number of
> different configurations by xfstests, has found all sorts of problems
> that other things haven't been able to find.  Run it on at least a
> 2-core system, or preferably a 4-core or 8-core system if you have it.
> I generally run tests using both 4k and 1k blocksize file systems to
> make sure there aren't problems where the fs blocksize is less than
> the pagesize.
>
> If you are willing to take on the support burden of ext4 for 2.6.27,
> and do a lot of testing, I at least wouldn't have any objection to
> these patches.  It's really a question of risk vs. reward for the
> users of the 2.6.27 stable tree, plus a question of someone willing to
> take on the support/debugging burden, and how much testing is done to
> appropriately tilt the risk/reward balance.
>
> Regards,
>
> 						- Ted

For what it's worth: my 2.6.27.45 fileservers deadlock reproducibly 
after 1 to 2 minutes of heavy NFS load, when using ext4 (never had a 
problem with ext3). Jayson King's patch series (posted Feb 27) fixed 
this, and I've been running it since May 1 without problems.

From my experience, I'd say that the ext4 deadlock needs to be fixed; 
otherwise ext4 in 2.6.27 should not be called stable.

best wishes,
Kay

