linux-kernel - Re: [Bug 9182] Critical memory leak (dirty pages)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20071216015112.d0ab08a1.akpm@linux-foundation.org>
Date:	Sun, 16 Dec 2007 01:51:12 -0800
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Krzysztof Oledzki <olel@....pl>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Nick Piggin <nickpiggin@...oo.com.au>,
	Peter Zijlstra <peterz@...radead.org>,
	Thomas Osterried <osterried@...se.de>, protasnb@...il.com,
	bugme-daemon@...zilla.kernel.org,
	Thomas Osterried <osterried@...se.de>
Subject: Re: [Bug 9182] Critical memory leak (dirty pages)

On Sun, 16 Dec 2007 10:33:20 +0100 (CET) Krzysztof Oledzki <olel@....pl> wrote:

> 
> 
> On Sat, 15 Dec 2007, Andrew Morton wrote:
> 
> > On Sun, 16 Dec 2007 00:08:52 +0100 (CET) Krzysztof Oledzki <olel@....pl> wrote:
> >
> >>
> >>
> >> On Sat, 15 Dec 2007, bugme-daemon@...zilla.kernel.org wrote:
> >>
> >>> http://bugzilla.kernel.org/show_bug.cgi?id=9182
> >>>
> >>>
> >>> ------- Comment #33 from protasnb@...il.com  2007-12-15 14:19 -------
> >>> Krzysztof, I'd hate point you to a hard path (at least time consuming), but
> >>> you've done a lot of digging by now anyway. How about git bisecting between
> >>> 2.6.20-rc2 and rc1? Here is great info on bisecting:
> >>> http://www.kernel.org/doc/local/git-quick.html
> >>
> >> As I'm smarter than git-bistect I can tell that 2.6.20-rc1-git8 is as bad
> >> as 2.6.20-rc2 but 2.6.20-rc1-git8 with one patch reverted seems to be OK.
> >> So it took me only 2 reboots. ;)
> >>
> >> The guilty patch is the one I proposed just an hour ago:
> >>   http://git.kernel.org/?p=linux%2Fkernel%2Fgit%2Fstable%2Flinux-2.6.20.y.git;a=commitdiff_plain;h=fba2591bf4e418b6c3f9f8794c9dd8fe40ae7bd9
> >>
> >> So:
> >>   - 2.6.20-rc1: OK
> >>   - 2.6.20-rc1-git8 with fba2591bf4e418b6c3f9f8794c9dd8fe40ae7bd9 reverted: OK
> >>   - 2.6.20-rc1-git8: very BAD
> >>   - 2.6.20-rc2: very BAD
> >>   - 2.6.20-rc4: very BAD
> >>   - >= 2.6.20: BAD (but not *very* BAD!)
> >>
> >
> > well..  We have code which has been used by *everyone* for a year and it's
> > misbehaving for you alone.
> 
> No, not for me alone. Probably only I and Thomas Osterried have systems 
> where it is so easy to reproduce. Please note that the problem exists on 
> my all systems, but only on one it is critical. It is enough to run
> "sync; sleep 1; sunc; sleep 1; sync; grep Drirty /proc/meminfo" to be sure. 
> With =>2.6.20-rc1-git8 it *never* falls to 0 an *all* my hosts but only 
> on one it goes to ~200MB in about 2 weeks and then everything dies:
> http://bugzilla.kernel.org/attachment.cgi?id=13824
> http://bugzilla.kernel.org/attachment.cgi?id=13825
> http://bugzilla.kernel.org/attachment.cgi?id=13826
> http://bugzilla.kernel.org/attachment.cgi?id=13827
> 
> >  I wonder what you're doing that is different/special.
> Me to. :|
> 
> > Which filesystem, which mount options
> 
>   - ext3 on RAID1 (MD): / - rootflags=data=journal

It wouldn't surprise me if this is specific to data=journal: that
journalling mode is pretty complex wrt dairty-data handling and isn't well
tested.

Does switching that to data=writeback change things?

THomas, do you have ext3 data=journal on any filesytems?

>   - ext3 on LVM on RAID5 (MD)
>   - nfs
> 
> /dev/md0 on / type ext3 (rw)
> proc on /proc type proc (rw)
> sysfs on /sys type sysfs (rw,nosuid,nodev,noexec)
> devpts on /dev/pts type devpts (rw,nosuid,noexec)
> /dev/mapper/VolGrp0-usr on /usr type ext3 (rw,nodev,data=journal)
> /dev/mapper/VolGrp0-var on /var type ext3 (rw,nodev,data=journal)
> /dev/mapper/VolGrp0-squid_spool on /var/cache/squid/cd0 type ext3 (rw,nosuid,nodev,noatime,data=writeback)
> /dev/mapper/VolGrp0-squid_spool2 on /var/cache/squid/cd1 type ext3 (rw,nosuid,nodev,noatime,data=writeback)
> /dev/mapper/VolGrp0-news_spool on /var/spool/news type ext3 (rw,nosuid,nodev,noatime)
> shm on /dev/shm type tmpfs (rw,noexec,nosuid,nodev)
> usbfs on /proc/bus/usb type usbfs (rw,noexec,nosuid,devmode=0664,devgid=85)
> owl:/usr/gentoo-nfs on /usr/gentoo-nfs type nfs (ro,nosuid,nodev,noatime,bg,intr,tcp,addr=192.168.129.26)
> 
> 
> > what sort of workload?
> Different, depending on a host: mail (postfix + amavisd + spamassasin + 
> clamav + sqlgray), squid, mysql, apache, nfs, rsync, .... But it seems 
> that the biggest problem is on the host running mentioned mail service.
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/