Message-ID: <alpine.DEB.2.02.1306131330170.8686@chino.kir.corp.google.com>
Date:	Thu, 13 Jun 2013 13:34:46 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	Michal Hocko <mhocko@...e.cz>
cc:	Johannes Weiner <hannes@...xchg.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	linux-mm@...ck.org, cgroups@...r.kernel.org,
	linux-arch@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [patch 2/2] memcg: do not sleep on OOM waitqueue with full charge
 context

On Thu, 13 Jun 2013, Michal Hocko wrote:

> > Right now it appears that that number of users is 0 and we're talking 
> > about a problem that was reported in 3.2 that was released a year and a 
> > half ago.  The rules of inclusion in stable also prohibit such a change 
> > from being backported, specifically "It must fix a real bug that bothers 
> > people (not a, "This could be a problem..." type thing)".
> 
> As you can see there is a user seeing this in 3.2. The bug is _real_ and
> I do not see what you are objecting to. Do you really think that
> sitting on a time bomb is preferable?
> 

Nobody has reported the problem in seven months.  You're patching a kernel 
that's 18 months old.  Your "user" hasn't even bothered to respond to your 
backport.  This isn't a timebomb.

> > We have deployed memcg on a very large number of machines and I can run a 
> > query over all software watchdog timeouts that have occurred by 
> > deadlocking on i_mutex during memcg oom.  It returns 0 results.
> 
> Do you capture /proc/<pid>/stack for each of them to find that your
> deadlock (and you have reported that they happen) was in fact caused by
> a locking issue? These kinds of deadlocks might go unnoticed, especially
> when the oom is handled by userspace by increasing the limit (my memcg
> is stuck and increasing the limit a bit always helped).
> 

We dump stack traces for every thread on the system to the kernel log for 
a software watchdog timeout and capture it over the network for searching 
later.  We have not experienced any deadlock that even remotely resembles 
the stack traces in the changelog.  We do not reproduce this issue.