[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20100614163304.GA21313@redhat.com>
Date: Mon, 14 Jun 2010 18:33:05 +0200
From: Oleg Nesterov <oleg@...hat.com>
To: Roland McGrath <roland@...hat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
LKML <linux-kernel@...r.kernel.org>,
linux-mm <linux-mm@...ck.org>,
David Rientjes <rientjes@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
Nick Piggin <npiggin@...e.de>
Subject: Re: uninterruptible CLONE_VFORK (Was: oom: Make coredump
interruptible)
On 06/13, Roland McGrath wrote:
>
> > Oh. And another problem, vfork() is not interruptible too. This means
> > that the user can hide the memory hog from oom-killer.
>
> I'm not sure there is really any danger like that, because of the
> oom_kill_process "Try to kill a child first" logic.
But note that oom_kill_process() doesn't kill the children with the
same ->mm. I never understood this code.
Anyway I agree. Even if I am right, this is not very serious problem
from oom-kill pov. To me, the uninterruptible CLONE_VFORK is bad by
itself.
> > But let's forget about oom.
>
> Sure, but it reminds me to mention that vfork mm sharing is another reason
> that having oom_kill set some persistent state in the mm seems wrong.
Yes, yes, this was already discussed a bit. Only if the core dump is in
progress we can touch ->mm or (probably better but needs a bit more locking)
mm->core_state to signal the coredumping thread and (perhaps) for something
else.
> > Roland, any reason it should be uninterruptible? This doesn't look good
> > in any case. Perhaps the pseudo-patch below makes sense?
>
> I've long thought that we should make a vfork parent SIGKILL-able.
Good ;)
> (Of
> course the vfork wait can't be made interruptible by other signals, since
> it must never do anything userish
Yes sure. That is why wait_for_completion_killable(), not _interrutpible.
But I assume you didn't mean that only SIGKILL should interrupt the
parent, any sig_fatal() signal should.
> I don't know off hand of any problem with your
> straightforward change. But I don't have much confidence that there isn't
> any strange gotcha waiting there due to some other kind of implicit
> assumption about vfork parent blocks that we are overlooking at the moment.
> So I wouldn't change this without more thorough auditing and thinking about
> everything related to vfork.
Agreed. This needs auditing. And CLONE_VFORK can be used with/without all
other CLONE_ flags... Probably we should mostly worry about vfork ==
CLONE_VM | CLONE_VFORK case.
Anyway. ->vfork_done is per-thread. This means that without any changes
do_fork(CLONE_VFORK) can return (to user-mode) before the child's thread
group exits/execs. Perhaps this means we shouldn't worry too much.
> Personally, what I've really been interested in is changing the vfork wait
> to use some different kind of blocking entirely. My real motivation for
> that is to let a vfork wait be morphed into and out of TASK_TRACED,
I see. I never thought about this, but I think you are right.
Hmm. Even without debugger, the parent doesn't react to SIGSTOP. Say,
int main(voif)
{
if (!vfork())
pause();
}
and ^Z won't work obviously. Not good.
This is not trivail I guess. Needs thinking...
Oleg.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists