linux-kernel - Re: oomkillers gone wild.

Message-ID: <alpine.LFD.2.02.1206082232570.3086@ionos>
Date:	Fri, 8 Jun 2012 22:37:15 +0200 (CEST)
From:	Thomas Gleixner <tglx@...utronix.de>
To:	David Rientjes <rientjes@...gle.com>
cc:	Dave Jones <davej@...hat.com>,
	Linux Kernel <linux-kernel@...r.kernel.org>, linux-mm@...ck.org
Subject: Re: oomkillers gone wild.

On Fri, 8 Jun 2012, David Rientjes wrote:
> On Tue, 5 Jun 2012, Dave Jones wrote:
> 
> >   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
> > 142524 142420  99%    9.67K  47510        3   1520320K task_struct
> > 142560 142417  99%    1.75K   7920       18    253440K signal_cache
> > 142428 142302  99%    1.19K   5478       26    175296K task_xstate
> > 306064 289292  94%    0.36K   6956       44    111296K debug_objects_cache
> > 143488 143306  99%    0.50K   4484       32     71744K cred_jar
> > 142560 142421  99%    0.50K   4455       32     71280K task_delay_info
> > 150753 145021  96%    0.45K   4308       35     68928K kmalloc-128
> > 
> > Why so many task_structs ? There's only 128 processes running, and most of them
> > are kernel threads.
> > 
> 
> Do you have CONFIG_OPROFILE enabled?
> 
> > /sys/kernel/slab/task_struct/alloc_calls shows..
> > 
> >  142421 copy_process.part.21+0xbb/0x1790 age=8/19929576/48173720 pid=0-16867 cpus=0-7
> > 
> > I get the impression that the oom-killer hasn't cleaned up properly after killing some of
> > those forked processes.
> > 
> > any thoughts ?
> > 
> 
> If we're leaking task_struct's, meaning that put_task_struct() isn't 
> actually freeing them when the refcount goes to 0, then it's certainly not 
> because of the oom killer which only sends a SIGKILL to the selected 
> process.

I rather suspect that this is an asymmetry between get_task_struct()
and put_task_struct(), so the refcount just doesn't go to zero.
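
To illustrate the kind of imbalance meant here, below is a toy userspace
model, not kernel code. The class TaskStruct, its "usage" and "live"
fields, and the helpers balanced_user(), leaky_user() and run() are all
hypothetical names made up for this sketch; only the get/put naming
mirrors the kernel helpers. It assumes each task starts with one
reference that is dropped on exit, and shows that a single get without a
matching put leaves every object allocated.

```python
class TaskStruct:
    """Toy stand-in for a refcounted object (not the real task_struct)."""

    live = 0  # objects allocated and not yet freed

    def __init__(self):
        self.usage = 1  # assumption: the creator holds the initial reference
        TaskStruct.live += 1


def get_task_struct(task):
    """Take an extra reference."""
    task.usage += 1


def put_task_struct(task):
    """Drop a reference; the object is freed only when the count hits zero."""
    task.usage -= 1
    if task.usage == 0:
        TaskStruct.live -= 1


def balanced_user(task):
    """A code path that pairs every get with a put."""
    get_task_struct(task)
    put_task_struct(task)


def leaky_user(task):
    """A code path that takes a reference and never drops it."""
    get_task_struct(task)


def run(n, user):
    """Create n tasks, let `user` touch each one, then let each task exit."""
    TaskStruct.live = 0
    for _ in range(n):
        task = TaskStruct()
        user(task)
        put_task_struct(task)  # exit drops the initial reference
    return TaskStruct.live


if __name__ == "__main__":
    print("balanced:", run(1000, balanced_user))
    print("leaky:   ", run(1000, leaky_user))
```

In the leaky case the count of live objects grows with the number of
tasks ever created, regardless of how many are still running, which is
the pattern one would expect behind a large task_struct slab with few
live processes.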

Thanks,

	tglx

