linux-kernel - Re: [PATCH 1/2] oom: don't assume that a coredumping thread will exit soon

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20141202183130.GM27014@dhcp22.suse.cz>
Date:	Tue, 2 Dec 2014 19:31:30 +0100
From:	Michal Hocko <mhocko@...e.cz>
To:	Oleg Nesterov <oleg@...hat.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Cong Wang <xiyou.wangcong@...il.com>,
	David Rientjes <rientjes@...gle.com>,
	"Rafael J. Wysocki" <rjw@...ysocki.net>, Tejun Heo <tj@...nel.org>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] oom: don't assume that a coredumping thread will
 exit soon

On Tue 02-12-14 18:50:41, Oleg Nesterov wrote:
> On 12/02, Michal Hocko wrote:
> >
> > On Fri 28-11-14 00:04:05, Oleg Nesterov wrote:
> >
> > > Note: this is only the first step, this patch doesn't try to solve other
> > > problems. For example it doesn't try to clear the wrongly set TIF_MEMDIE
> > > (SIGNAL_GROUP_COREDUMP check is obviously racy),
> >
> > I am not sure I understand this. What do you mean by wrongly set
> > TIF_MEMDIE? That we give a process access to reserves even though it is
> > already done with the coredumping?
> 
> I meant that (say) oom_kill_process() can set TIF_MEMDIE because
> PF_EXITING && !SIGNAL_GROUP_COREDUMP, and after that this task can
> participate the coredumping. For example, this thread can exit on its
> own, but before it calls exit_mm() another thread can start the coredump.
> 
> In this case TIF_MEMDIE can fool oom-killer the same way,
> oom_scan_process_thread() returns OOM_SCAN_ABORT if TIF_MEMDIE is set.
> 
> > > fatal_signal_pending() can be false positive, etc.
> >
> > When can this happen?
> 
> I meant "if (fatal_signal_pending(current) || task_will_free_mem(current))"
> in out_of_memory(). Yes, sorry, "false positive" looks confusing. I meant
> that fatal_signal_pending() can be true because of SIGNAL_GROUP_COREDUMP.

Ahh, OK I guess I see what you meant.
 
> > > Signed-off-by: Oleg Nesterov <oleg@...hat.com>
> >
> > I guess the patch as is makes sense and it is an improvement. We need
> > to call the helper in mem_cgroup_out_of_memory as well, though.
> 
> Yes, but can't we do this in a separate patch?

I would prefer if it was in the same patch because we might be facing
the same problem in memcg as with the global case. And worse, smaller
limit tend to trigger corner cases more often than the global case.

> try_charge() plays with TIF_MEMDIE/PF_EXITING too, but probably this
> is fine.

try_charge is OK because this is from the time when the allocation has
been already done and we just decide to bypass the charge.

> > With that feel free to add
> > Acked-by: Michal Hocko <mhocko@...e.cz>
> 
> Thanks.
> 
> > Also the original fix for the coredumping (edd45544c6f0 "oom: avoid
> > deferring oom killer if exiting task is being traced") doesn't work
> > really as per http://marc.info/?l=linux-kernel&m=141711049013620 then
> > this and the follow up patch should be marked for stable I guess.
> 
> Perhaps this makes sense. It looks simple enough.
> 
> Oleg.
> 

-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/