Message-ID: <20100604100416.GB8569@redhat.com>
Date:	Fri, 4 Jun 2010 12:04:16 +0200
From:	Oleg Nesterov <oleg@...hat.com>
To:	David Rientjes <rientjes@...gle.com>
Cc:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	"Luis Claudio R. Goncalves" <lclaudio@...g.org>,
	LKML <linux-kernel@...r.kernel.org>,
	linux-mm <linux-mm@...ck.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Nick Piggin <npiggin@...e.de>
Subject: Re: [PATCH 09/12] oom: remove PF_EXITING check completely

On 06/03, David Rientjes wrote:
>
> On Fri, 4 Jun 2010, Oleg Nesterov wrote:
>
> > > > > > > Currently, the PF_EXITING check is completely broken, because 1) it
> > > > > > > only considers the main thread and ignores sub-threads
> > > > >
> > > > > Then check the subthreads.
> > > > >
> > >
> > > Did you want to respond to this?
> >
> > Please explain what you mean. There were already a lot of discussions
> > about mt issues, I do not know what you have in mind.
>
> Can you check the subthreads to see if they are not PF_EXITING?

To detect the process with the dead group leader?

Yes, we can. We already discussed this. Probably it is better to check
PF_EXITING and signal_group_exit().
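
A minimal userspace sketch of one possible reading of that combined check. It is not the code from the patch or from the kernel: the `mock_*` structures and helpers are hypothetical stand-ins for task_struct, signal_struct and signal_group_exit(), and the flag values are illustrative only. It treats a process as exiting if a group exit has started, or if every thread in the group (not just the leader) has PF_EXITING set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Names mirror the kernel's; the values here are illustrative only. */
#define PF_EXITING        0x4u  /* per-thread flag: this thread is exiting */
#define SIGNAL_GROUP_EXIT 0x8u  /* per-process flag: group exit in progress */

/* Hypothetical, heavily simplified stand-in for signal_struct. */
struct mock_signal {
    unsigned int flags;            /* shared by all threads in the group */
};

/* Hypothetical, heavily simplified stand-in for task_struct. */
struct mock_task {
    unsigned int flags;            /* per-thread PF_* flags */
    struct mock_signal *signal;    /* signal state shared by the group */
    struct mock_task *next_thread; /* circular list of threads in the group */
};

/* Simplified stand-in for signal_group_exit(): has a group exit started? */
static bool mock_signal_group_exit(const struct mock_signal *sig)
{
    return (sig->flags & SIGNAL_GROUP_EXIT) != 0;
}

/*
 * Returns true if the whole process is on its way out: either a group
 * exit is in progress, or every thread in the group has PF_EXITING set.
 * A dead group leader with a live sub-thread does not count as exiting.
 */
static bool mock_process_exiting(const struct mock_task *leader)
{
    const struct mock_task *t = leader;

    if (mock_signal_group_exit(leader->signal))
        return true;

    do {
        if (!(t->flags & PF_EXITING))
            return false;          /* at least one thread is still alive */
        t = t->next_thread;
    } while (t != leader);

    return true;
}
```

With a two-thread group where only the leader has PF_EXITING, mock_process_exiting() returns false; it returns true once SIGNAL_GROUP_EXIT is set or the sub-thread is exiting too.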

> > > I'm guessing at the relevancy here because the changelog is extremely
> > > poorly worded (if I were Andrew I would have no idea how important this
> > > patch is based on the description other than the alarmist words of "... is
> > > completely broken"), but if we're concerned about the coredumper not being
> > > able to find adequate resources to allocate memory from, we can give it
> > > access to reserves specifically,
> >
> > I don't think so. If oom-kill wants to kill the task which is dumping
> > core, that task should stop the coredump and exit.
>
> That's a coredump change, not an oom killer change.

Yes. do_coredump() should be fixed. This is not trivial (it needs subtle
changes outside of fs/exec.c), so we are looking for a simple fix for
now.

> If the coredumper
> needs memory and runs into the oom killer, this PF_EXITING check, which
> you want to remove, gives it access to memory reserves by setting
> TIF_MEMDIE so it can quickly finish and die.  This allows it to exit
> without oom killing anything else because the tasklist scan in the oom
> killer is not preempted by finding a TIF_MEMDIE task.

David, sorry. I have already tried to explain (at least twice) that TIF_MEMDIE
(or SIGKILL, even if do_coredump() were interruptible) cannot help unless
you find the right thread. This is not trivial even if we forget about
CLONE_VM tasks.

And personally I disagree that it should use memory reserves, but this
doesn't matter.


Let's stop this. You don't need to convince me. I am not the author of this
patch, and I have said many times that I do not pretend to understand what
oom-kill needs. I jumped into this discussion because your initial objection
(fatal_signal_pending() should fix the problems) was technically wrong.

Oleg.
