linux-kernel - Re: [Bug 9182] Critical memory leak (dirty pages)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <Pine.LNX.4.64.0712152223150.14491@bizon.gios.gov.pl>
Date:	Sat, 15 Dec 2007 22:53:33 +0100 (CET)
From:	Krzysztof Oledzki <olel@....pl>
To:	Peter Zijlstra <peterz@...radead.org>
cc:	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Nick Piggin <nickpiggin@...oo.com.au>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Osterried <osterried@...se.de>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	bugme-daemon@...zilla.kernel.org
Subject: Re: [Bug 9182] Critical memory leak (dirty pages)


http://bugzilla.kernel.org/show_bug.cgi?id=9182


On Sat, 15 Dec 2007, Krzysztof Oledzki wrote:

>
>
> On Thu, 13 Dec 2007, Krzysztof Oledzki wrote:
>
>> 
>> 
>> On Thu, 13 Dec 2007, Peter Zijlstra wrote:
>> 
>>> 
>>> On Thu, 2007-12-13 at 16:17 +0100, Krzysztof Oledzki wrote:
>>>> 
>>> 
>>>> BTW: Could someone please look at this problem? I feel little ignored and
>>>> in my situation this is a critical regression.
>>> 
>>> I was hoping to get around to it today, but I guess tomorrow will have
>>> to do :-/
>> 
>> Thanks.
>> 
>>> So, its ext3, dirty some pages, sync, and dirty doesn't fall to 0,
>>> right?
>> 
>> Not only doesn't fall but continuously grows.
>> 
>>> Does it happen with other filesystems as well?
>> 
>> Don't know. I generally only use ext3 and I'm afraid I'm not able to switch 
>> this system to other filesystem.
>> 
>>> What are you ext3 mount options?
>> /dev/root / ext3 rw,data=journal 0 0
>> /dev/VolGrp0/usr /usr ext3 rw,nodev,data=journal 0 0
>> /dev/VolGrp0/var /var ext3 rw,nodev,data=journal 0 0
>> /dev/VolGrp0/squid_spool /var/cache/squid/cd0 ext3 
>> rw,nosuid,nodev,noatime,data=writeback 0 0
>> /dev/VolGrp0/squid_spool2 /var/cache/squid/cd1 ext3 
>> rw,nosuid,nodev,noatime,data=writeback 0 0
>> /dev/VolGrp0/news_spool /var/spool/news ext3 
>> rw,nosuid,nodev,noatime,data=ordered 0 0
>
> BTW: this regression also exists in 2.6.24-rc5. I'll try to find when it was 
> introduced but it is hard to do it on a highly critical production system, 
> especially since it takes ~2h after a reboot, to be sure.
>
> However, 2h is quite good time, on other systems I have to wait ~2 months to 
> get 20MB of leaked memory:
>
> # uptime
> 13:29:34 up 58 days, 13:04,  9 users,  load average: 0.38, 0.27, 0.31
>
> # sync;sync;sleep 1;sync;grep Dirt /proc/meminfo
> Dirty:           23820 kB

More news, I hope this time my problem get more attention from developers 
since now I have much more information.

So far I found that:
  - 2.6.20-rc4 - bad: http://bugzilla.kernel.org/attachment.cgi?id=14057
  - 2.6.20-rc2 - bad: http://bugzilla.kernel.org/attachment.cgi?id=14058
  - 2.6.20-rc1 - OK (probably, I need to wait little more to be 100% sure).

2.6.20-rc1 with 33m uptime:
~$ grep Dirt /proc/meminfo ;sync ; sleep 1 ; sync ; grep Dirt /proc/meminfo
Dirty:           10504 kB
Dirty:               0 kB

2.6.20-rc2 was released Dec 23/24 2006 (BAD)
2.6.20-rc1 was released Dec 13/14 2006 (GOOD?)

It seems that this bug was introduced exactly one year ago. Surprisingly, 
dirty memory in 2.6.20-rc2/2.6.20-rc4 leaks _much_ more faster than in 
2.6.20-final and later kernels as it took only about 6h to reach 172MB. 
So, this bug might be cured afterward, but only a little.

There are three commits that may be somehow related:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.20.y.git;a=commitdiff;h=fba2591bf4e418b6c3f9f8794c9dd8fe40ae7bd9
http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.20.y.git;a=commitdiff;h=3e67c0987d7567ad666641164a153dca9a43b11d
http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.20.y.git;a=commitdiff;h=5f2a105d5e33a038a717995d2738434f9c25aed2

I'm going to check 2.6.20-rc1-git... releases but it would be *very* nice 
if someone could finally give ma a hand and point some hints helping 
debugging this problem.

Please note that none of my systems with kernels >= 2.6.20-rc1 is able to 
reach 0 kb of dirty memory, even after many synces, even when idle.

Best regards,

 				Krzysztof Olędzki