Message-ID: <20150930134746.GB32263@redhat.com>
Date:	Wed, 30 Sep 2015 15:47:46 +0200
From:	Oleg Nesterov <oleg@...hat.com>
To:	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Cc:	rientjes@...gle.com, akpm@...ux-foundation.org, kwalker@...hat.com,
	mhocko@...nel.org, skozina@...hat.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH -mm 1/3] mm/oom_kill: remove the wrong
	fatal_signal_pending()

On 09/30, Tetsuo Handa wrote:
>
> David Rientjes wrote:
> > On Tue, 29 Sep 2015, Oleg Nesterov wrote:
> >
> > > The fatal_signal_pending() was added to suppress unnecessary "sharing
> > > same memory" message, but it can't 100% help anyway because it can be
> > > false-negative; SIGKILL can be already dequeued.
> > >
> > > And worse, it can be false-positive due to exec or coredump. exec is
> > > mostly fine, but coredump is not. It is possible that the group leader
> > > has the pending SIGKILL because its sub-thread originated the coredump,
> > > in this case we must not skip this process.
> > >
> > > We could probably add the additional ->group_exit_task check but this
> > > patch just removes fatal_signal_pending(), the extra "Kill process" is
> > > unlikely and doesn't really hurt.
>
> This fatal_signal_pending() check is about to be added by me because the OOM
> killer spams the kernel log when the mm struct which the OOM victim is using
> is shared by many threads. ( http://marc.info/?l=linux-mm&m=143256441501204 )

OK, I see, but it is wrong.

But I don't really understand "shared by many threads"; "threads" is a
confusing word here. I guess you mean CLONE_VM processes, otherwise we
shouldn't see the additional spam.

And 1000 CLONE_VM processes + "and the lock dependency prevents all threads
except the OOM victim thread from terminating until they get TIF_MEMDIE flag"
look like a really pathological case...
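
A minimal userspace sketch of the distinction above, assuming a much
simplified model: each process is one thread group, CLONE_VM processes are
separate thread groups pointing at the same mm, and a boolean stands in for
what a fatal_signal_pending()-style check would see. The names mm_model,
proc_model, count_extra_kills and run_example are hypothetical and are not
kernel identifiers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's structures. */
struct mm_model { int id; };

struct proc_model {
    int tgid;                /* thread group id, one per process */
    struct mm_model *mm;     /* address space; shared under CLONE_VM */
    bool sigkill_pending;    /* models a pending fatal signal */
};

/*
 * Count processes that would get an extra "Kill process" line: same mm as
 * the victim but a different thread group. With skip_pending set, processes
 * that already have SIGKILL pending are skipped, which models the check
 * being discussed.
 */
static int count_extra_kills(const struct proc_model *procs, size_t n,
                             const struct proc_model *victim,
                             bool skip_pending)
{
    int count = 0;

    for (size_t i = 0; i < n; i++) {
        const struct proc_model *p = &procs[i];

        if (p->mm != victim->mm || p->tgid == victim->tgid)
            continue;        /* unrelated mm, or the victim's own group */
        if (skip_pending && p->sigkill_pending)
            continue;        /* skipped by the pending-signal check */
        count++;
    }
    return count;
}

/* Build a small example table and report both counts. */
static void run_example(int *without_check, int *with_check)
{
    struct mm_model shared = { 1 }, other = { 2 };
    struct proc_model procs[] = {
        { 100, &shared, false },  /* the OOM victim */
        { 101, &shared, false },  /* CLONE_VM process */
        { 102, &shared, true  },  /* CLONE_VM process, SIGKILL pending */
        { 103, &other,  false },  /* unrelated process */
    };
    size_t n = sizeof(procs) / sizeof(procs[0]);

    *without_check = count_extra_kills(procs, n, &procs[0], false);
    *with_check = count_extra_kills(procs, n, &procs[0], true);
}
```

In this model process 102 is skipped when the check is enabled. If its
pending SIGKILL came from a coredump started by a sub-thread, that is the
false-positive case described in the patch changelog quoted above.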

> > In addition, I'm really debating whether we need the "sharing same memory"
> > line or not.  In the past, it has been helpful because there is no other
> > way to determine what the kernel has killed other than to leave an
> > artifact behind in the kernel log.  I can imagine that this could easily
> > spam the kernel log, though, accompanied by oom killer messages that are
> > already very verbose.  I wouldn't mind if the printk were removed
> > entirely.
> >
>
> I was waiting for your comment about whether you depend on
> the "sharing same memory" message with KERN_ERR level.
> ( http://marc.info/?l=linux-mm&m=144120389203133 )
>
> If nobody else objects, I think we can remove the "sharing same memory"
> message. ( http://marc.info/?l=linux-mm&m=144119325831959 )

OK, will you agree with v2 which also removes pr_warn?

Oleg.
