linux-kernel - Re: [PATCH for 3.2] memcg: do not trap chargers with full callstack on OOM


Message-ID: <20130709131029.GH20281@dhcp22.suse.cz>
Date:	Tue, 9 Jul 2013 15:10:29 +0200
From:	Michal Hocko <mhocko@...e.cz>
To:	azurIt <azurit@...ox.sk>
Cc:	Johannes Weiner <hannes@...xchg.org>, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org, cgroups mailinglist <cgroups@...r.kernel.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Subject: Re: [PATCH for 3.2] memcg: do not trap chargers with full callstack
 on OOM

On Mon 08-07-13 01:42:24, azurIt wrote:
> > CC: "Michal Hocko" <mhocko@...e.cz>, linux-kernel@...r.kernel.org, linux-mm@...ck.org, "cgroups mailinglist" <cgroups@...r.kernel.org>, "KAMEZAWA Hiroyuki" <kamezawa.hiroyu@...fujitsu.com>
> >On Fri, Jul 05, 2013 at 09:02:46PM +0200, azurIt wrote:
> >> >I looked at your debug messages but could not find anything that would
> >> >hint at a deadlock.  All tasks are stuck in the refrigerator, so I
> >> >assume you use the freezer cgroup and enabled it somehow?
> >> 
> >> 
> >> Yes, i'm really using freezer cgroup BUT i was checking if it's not
> >> doing problems - unfortunately, several days passed from that day
> >> and now i don't fully remember if i was checking it for both cases
> >> (unremoveabled cgroups and these freezed processes holding web
> >> server port). I'm 100% sure i was checking it for unremoveable
> >> cgroups but not so sure for the other problem (i had to act quickly
> >> in that case). Are you sure (from stacks) that freezer cgroup was
> >> enabled there?
> >
> >Yeah, all the traces without exception look like this:
> >
> >1372089762/23433/stack:[<ffffffff81080925>] refrigerator+0x95/0x160
> >1372089762/23433/stack:[<ffffffff8106ab7b>] get_signal_to_deliver+0x1cb/0x540
> >1372089762/23433/stack:[<ffffffff8100188b>] do_signal+0x6b/0x750
> >1372089762/23433/stack:[<ffffffff81001fc5>] do_notify_resume+0x55/0x80
> >1372089762/23433/stack:[<ffffffff815cac77>] int_signal+0x12/0x17
> >1372089762/23433/stack:[<ffffffffffffffff>] 0xffffffffffffffff
> >
> >so the freezer was already enabled when you took the backtraces.
> >
> >> Btw, what about that other stacks? I mean this file:
> >> http://watchdog.sk/lkml/memcg-bug-7.tar.gz
> >> 
> >> It was taken while running the kernel with your patch and from
> >> cgroup which was under unresolveable OOM (just like my very original
> >> problem).
> >
> >I looked at these traces too, but none of the tasks are stuck in rmdir
> >or the OOM path.  Some /are/ in the page fault path, but they are
> >happily doing reclaim and don't appear to be stuck.  So I'm having a
> >hard time matching this data to what you otherwise observed.

Agreed.

> >However, based on what you reported the most likely explanation for
> >the continued hangs is the unfinished OOM handling for which I sent
> >the followup patch for arch/x86/mm/fault.c.
> 
> Johannes,
> 
> today I tested both of your patches but problem with unremovable
> cgroups, unfortunately, persists.

Is the group empty again, and is it still marked under_oom?
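
Something along these lines should answer both halves of that question in
one go. It is only a rough sketch: it assumes the cgroup v1 memory
controller, where the group directory has a "tasks" file and a
"memory.oom_control" file containing an "under_oom" line. The helper names
are made up for illustration, and the group path has to be passed in because
the mount point differs between setups.

```python
import os
import sys


def parse_oom_control(text):
    """Parse the 'key value' lines of memory.oom_control into a dict of ints."""
    fields = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            fields[parts[0]] = int(parts[1])
    return fields


def group_state(cgroup_dir):
    """Return (number of tasks, under_oom value) for one memory cgroup.

    cgroup_dir is the group's directory under the memory controller mount
    (an assumption: the caller knows where that is on the machine).
    under_oom is None if the kernel does not report that field.
    """
    with open(os.path.join(cgroup_dir, "tasks")) as f:
        pids = f.read().split()
    with open(os.path.join(cgroup_dir, "memory.oom_control")) as f:
        ctl = parse_oom_control(f.read())
    return len(pids), ctl.get("under_oom")


if __name__ == "__main__" and len(sys.argv) > 1:
    ntasks, under_oom = group_state(sys.argv[1])
    print("tasks: %d, under_oom: %s" % (ntasks, under_oom))
```

If it reports zero tasks and under_oom set to 1 for a group that cannot be
removed, that matches the state asked about above.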
-- 
Michal Hocko
SUSE Labs