Message-ID: <alpine.DEB.0.9999.0712012321150.7852@chino.kir.corp.google.com>
Date:	Sun, 2 Dec 2007 07:52:54 -0800 (PST)
From:	David Rientjes <rientjes@...gle.com>
To:	Ingo Oeser <ioe-lkml@...eria.de>
cc:	Ingo Molnar <mingo@...e.hu>, linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	Arjan van de Ven <arjan@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [feature] automatically detect hung TASK_UNINTERRUPTIBLE tasks

On Sun, 2 Dec 2007, Ingo Oeser wrote:

> > maybe, but we'd have to see how often this gets triggered. An OOM is 
> > something that could happen in any overloaded system - while a hung task 
> > is likely due to a kernel bug.
> 
> What about a client using hard mounted NFS shares here? That shouldn't be
> killed by the OOM killer in that situation, should it?
> 

That's orthogonal to the point I was making; the problem with the OOM 
killer right now is that it can easily enter an infinite loop in 
out-of-memory conditions if the task that it has selected to be killed 
fails to exit.  This happens only when the task hangs in 
TASK_UNINTERRUPTIBLE state and doesn't respond to the SIGKILL that the 
OOM killer has sent it.

That behavior is a consequence of trying to avoid needlessly killing tasks 
by giving already-killed tasks time to exit in subsequent OOM conditions.  
During the tasklist scan of eligible tasks to kill, if any task is found 
to have access to memory reserves that only the OOM killer can provide 
(signified by the TIF_MEMDIE thread flag) and it has not yet died, the OOM 
killer becomes a complete no-op.
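
The scan described above can be sketched in user space as follows.  This 
is only an illustrative model, not the kernel's code: struct sim_task, its 
fields, and sim_select_victim() are hypothetical names, and the "badness" 
score is a stand-in for whatever heuristic ranks the candidates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a task; not the kernel's task_struct. */
struct sim_task {
    int pid;
    unsigned long badness; /* higher means a better candidate to kill */
    bool memdie;           /* models the TIF_MEMDIE thread flag */
    bool exited;           /* true once the task has actually died */
};

/*
 * Scan the task list and return the task to kill, or NULL when the scan
 * decides to do nothing.  If any live task already carries the memdie
 * flag, the scan bails out so that the earlier victim gets time to exit.
 */
static struct sim_task *sim_select_victim(struct sim_task *tasks, size_t n)
{
    struct sim_task *chosen = NULL;

    assert(tasks != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        struct sim_task *t = &tasks[i];

        if (t->exited)
            continue;

        /* An earlier victim is still around: wait for it, kill nobody. */
        if (t->memdie)
            return NULL;

        if (chosen == NULL || t->badness > chosen->badness)
            chosen = t;
    }

    return chosen;
}
```

In this model, a victim that is flagged but never sets "exited" makes 
every later call return NULL, which is the no-op behavior described above.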

This happens on occasion and completely deadlocks the system because the 
out of memory condition will never be alleviated.  With the hang detection 
addition to lockdep, it would be easy to correct this situation.  I 
understand the primary purpose of the patch is to identify potential 
kernel bugs that aren't hardware induced, but I think it has relevance to 
the OOM killer problem until such time as tasks hanging in 
TASK_UNINTERRUPTIBLE state become passé.
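
One possible way hang detection could feed back into victim selection is 
sketched below.  This message does not spell out the mechanism, so this 
is an assumption for illustration only: struct hang_task, hang_detected(), 
hang_select_victim(), and the timeout parameter are all hypothetical, and 
the code is a user-space model rather than kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a task; not the kernel's task_struct. */
struct hang_task {
    int pid;
    unsigned long badness;      /* higher means a better candidate to kill */
    bool memdie;                /* models the TIF_MEMDIE thread flag */
    bool uninterruptible;       /* models TASK_UNINTERRUPTIBLE state */
    unsigned long blocked_secs; /* assumed: time since the task last ran */
};

/* Assumed hang test: blocked uninterruptibly for at least timeout_secs. */
static bool hang_detected(const struct hang_task *t, unsigned long timeout_secs)
{
    return t->uninterruptible && t->blocked_secs >= timeout_secs;
}

/*
 * Like the plain scan, but a flagged task that is known to be hung no
 * longer turns the whole scan into a no-op; it is skipped and another
 * victim may be chosen instead.
 */
static struct hang_task *hang_select_victim(struct hang_task *tasks, size_t n,
                                            unsigned long timeout_secs)
{
    struct hang_task *chosen = NULL;

    assert(tasks != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        struct hang_task *t = &tasks[i];

        if (t->memdie) {
            if (!hang_detected(t, timeout_secs))
                return NULL; /* still give it time to exit */
            continue;        /* hung: ignore it and keep scanning */
        }

        if (chosen == NULL || t->badness > chosen->badness)
            chosen = t;
    }

    return chosen;
}
```

With a flagged task that has been blocked for less than the timeout, the 
scan still returns NULL; once the task counts as hung, the scan moves on 
to the next-best candidate, so the out-of-memory condition can be 
relieved.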

		David