Date:	Fri, 9 Oct 2009 18:58:59 -0700
From:	Sukadev Bhattiprolu <sukadev@...ux.vnet.ibm.com>
To:	"Eric W. Biederman" <ebiederm@...ssion.com>
Cc:	Daniel Lezcano <dlezcano@...ibm.com>, andrea@...share.com,
	Pavel Emelianov <xemul@...nvz.org>,
	Sukadev Bhattiprolu <sukadev@...ibm.com>,
	Linux Containers <containers@...ts.osdl.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: pidns memory leak

Eric W. Biederman [ebiederm@...ssion.com] wrote:
| Sukadev Bhattiprolu <sukadev@...ux.vnet.ibm.com> writes:
| 
| > Andrea,
| >
| > We have been running into a leak in child pid namespaces and some early debugging
| > points to the following commit:
| >
| >>> 	commit 7766755a2f249e7e0dabc5255a0a3d151ff79821
| >>> 	Author: Andrea Arcangeli <andrea@...e.de>
| >>> 	Date:   Mon Feb 4 22:29:21 2008 -0800
| >>>
| >
| > Reverting the commit seems to fix the leak but we need to do some more
| > analysis (like the lstat() question Daniel has).
| 
| Yes.
| 
| That entire path is an optimization.  It should not be needed for correct
| operation.  Although it may be responsible for some false positives.
| 
| > However I have a basic question regarding the commit - the log mentions:
| >
| > 	> do_exit->release_task->mark_inode_dirty_sync->schedule() (will never
| > 	> come back to run journal_stop)
| >
| > But release_task() calls shrink_dcache_parent() for a _procfs_ dentry. Does
| > journal_stop() apply to procfs also ?
| 
| The problem when that PF_EXITING check was introduced was that
| shrink_dcache_parent could shrink dcache entries for other
| filesystems.  Last I looked that is no longer the case and we can
| remove that code.

Ok.

| As I recall proc_flush_task_mnt has a few other minor bugs as well that
| could cause problems.

Can you give me some more details on those bugs ? Reverting the commit
seems to fix the problem.

| 
| Ultimately what problems are you seeing?

We are leaking 'struct pid', proc_inode, and 'struct pid_namespace' when
container-init exits before its descendant processes. That is, when the
container-init zaps its descendants and waits for them, it calls
proc_flush_task_mnt(), but then misses the shrink_dcache_parent() call due
to the PF_EXITING check added by the above commit.

So the proc_inode is never deleted and the references to struct pid and
pid_namespace never go away. Details of the leak are buried in the
previous mail...
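
To make the reference chain concrete, here is a toy userspace sketch of
the leak as described above. It is not kernel code: toy_obj, toy_get(),
toy_put() and toy_exit_leaks() are hypothetical names, and the ownership
chain (cached dentry pins proc_inode, proc_inode pins struct pid, struct
pid pins struct pid_namespace) is a simplifying assumption made for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a refcounted kernel object (hypothetical, not a kernel type). */
struct toy_obj {
    int refs;     /* outstanding references */
    bool freed;   /* set once the last reference is dropped */
};

static void toy_get(struct toy_obj *o)
{
    o->refs++;
}

static void toy_put(struct toy_obj *o)
{
    assert(o->refs > 0);
    if (--o->refs == 0)
        o->freed = true;
}

/*
 * Model one task exiting inside a child pid namespace.
 *
 * skip_shrink_when_exiting mimics the PF_EXITING check discussed in the
 * thread: when true, the exiting task does not prune its cached /proc entry.
 * Returns how many of the three objects are still allocated afterwards.
 */
int toy_exit_leaks(bool skip_shrink_when_exiting)
{
    struct toy_obj ns = {0, false};     /* models struct pid_namespace */
    struct toy_obj pid = {0, false};    /* models struct pid */
    struct toy_obj inode = {0, false};  /* models proc_inode */

    /* Assumed ownership chain before the exit. */
    toy_get(&ns);     /* held by the pid */
    toy_get(&pid);    /* held by the task itself */
    toy_get(&pid);    /* held by the proc inode */
    toy_get(&inode);  /* held by the cached dentry */

    /* The task exits and drops its own reference to the pid. */
    toy_put(&pid);

    /* Flush step: pruning the dentry is what releases the inode. */
    if (!skip_shrink_when_exiting)
        toy_put(&inode);

    /* Releases cascade only if the inode actually went away. */
    if (inode.freed)
        toy_put(&pid);
    if (pid.freed)
        toy_put(&ns);

    return !ns.freed + !pid.freed + !inode.freed;
}
```

In this model, calling toy_exit_leaks(true) leaves all three objects
allocated, while toy_exit_leaks(false) releases all of them, which matches
the observation that reverting the commit makes the leak go away.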