lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sun, 16 Dec 2007 14:46:36 +0100 (CET)
From:	Krzysztof Oledzki <olel@....pl>
To:	Andrew Morton <akpm@...ux-foundation.org>
cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Nick Piggin <nickpiggin@...oo.com.au>,
	Peter Zijlstra <peterz@...radead.org>,
	Thomas Osterried <osterried@...se.de>, protasnb@...il.com,
	bugme-daemon@...zilla.kernel.org,
	Thomas Osterried <osterried@...se.de>
Subject: Re: [Bug 9182] Critical memory leak (dirty pages)



On Sun, 16 Dec 2007, Andrew Morton wrote:

> On Sun, 16 Dec 2007 10:33:20 +0100 (CET) Krzysztof Oledzki <olel@....pl> wrote:
>
>>
>>
>> On Sat, 15 Dec 2007, Andrew Morton wrote:
>>
>>> On Sun, 16 Dec 2007 00:08:52 +0100 (CET) Krzysztof Oledzki <olel@....pl> wrote:
>>>
>>>>
>>>>
>>>> On Sat, 15 Dec 2007, bugme-daemon@...zilla.kernel.org wrote:
>>>>
>>>>> http://bugzilla.kernel.org/show_bug.cgi?id=9182
>>>>>
>>>>>
>>>>> ------- Comment #33 from protasnb@...il.com  2007-12-15 14:19 -------
>>>>> Krzysztof, I'd hate point you to a hard path (at least time consuming), but
>>>>> you've done a lot of digging by now anyway. How about git bisecting between
>>>>> 2.6.20-rc2 and rc1? Here is great info on bisecting:
>>>>> http://www.kernel.org/doc/local/git-quick.html
>>>>
>>>> As I'm smarter than git-bistect I can tell that 2.6.20-rc1-git8 is as bad
>>>> as 2.6.20-rc2 but 2.6.20-rc1-git8 with one patch reverted seems to be OK.
>>>> So it took me only 2 reboots. ;)
>>>>
>>>> The guilty patch is the one I proposed just an hour ago:
>>>>   http://git.kernel.org/?p=linux%2Fkernel%2Fgit%2Fstable%2Flinux-2.6.20.y.git;a=commitdiff_plain;h=fba2591bf4e418b6c3f9f8794c9dd8fe40ae7bd9
>>>>
>>>> So:
>>>>   - 2.6.20-rc1: OK
>>>>   - 2.6.20-rc1-git8 with fba2591bf4e418b6c3f9f8794c9dd8fe40ae7bd9 reverted: OK
>>>>   - 2.6.20-rc1-git8: very BAD
>>>>   - 2.6.20-rc2: very BAD
>>>>   - 2.6.20-rc4: very BAD
>>>>   - >= 2.6.20: BAD (but not *very* BAD!)
>>>>
>>>
>>> well..  We have code which has been used by *everyone* for a year and it's
>>> misbehaving for you alone.
>>
>> No, not for me alone. Probably only I and Thomas Osterried have systems
>> where it is so easy to reproduce. Please note that the problem exists on
>> my all systems, but only on one it is critical. It is enough to run
>> "sync; sleep 1; sunc; sleep 1; sync; grep Drirty /proc/meminfo" to be sure.
>> With =>2.6.20-rc1-git8 it *never* falls to 0 an *all* my hosts but only
>> on one it goes to ~200MB in about 2 weeks and then everything dies:
>> http://bugzilla.kernel.org/attachment.cgi?id=13824
>> http://bugzilla.kernel.org/attachment.cgi?id=13825
>> http://bugzilla.kernel.org/attachment.cgi?id=13826
>> http://bugzilla.kernel.org/attachment.cgi?id=13827
>>
>>>  I wonder what you're doing that is different/special.
>> Me to. :|
>>
>>> Which filesystem, which mount options
>>
>>   - ext3 on RAID1 (MD): / - rootflags=data=journal
>
> It wouldn't surprise me if this is specific to data=journal: that
> journalling mode is pretty complex wrt dairty-data handling and isn't well
> tested.
>
> Does switching that to data=writeback change things?

I'll confirm this tomorrow but it seems that even switching to 
data=ordered (AFAIK default o ext3) is indeed enough to cure this problem.

Two questions remain then: why system dies when dirty reaches ~200MB 
and what is wrong with ext3+data=journal with >=2.6.20-rc2?

Best regards,

 				Krzysztof Olędzki

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ