Date:	Wed, 13 Jul 2016 10:08:58 +0200
From:	Michal Hocko <mhocko@...nel.org>
To:	Shayan Pooya <shayan@...eve.org>
Cc:	Konstantin Khlebnikov <khlebnikov@...dex-team.ru>,
	koct9i@...il.com, cgroups mailinglist <cgroups@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>, linux-mm@...ck.org
Subject: Re: bug in memcg oom-killer results in a hung syscall in another
 process in the same cgroup

On Tue 12-07-16 08:35:06, Shayan Pooya wrote:
> >> With strace, when running 500 concurrent mem-hog tasks on the same
> >> kernel, 33 of them failed with:
> >>
> >> strace: ../sysdeps/nptl/fork.c:136: __libc_fork: Assertion
> >> `THREAD_GETMEM (self, tid) != ppid' failed.
> >>
> >> Which is: https://sourceware.org/bugzilla/show_bug.cgi?id=15392
> >> And discussed before at: https://lkml.org/lkml/2015/2/6/470 but that
> >> patch was not accepted.
> >
> > OK, so the problem is that the oom killed task doesn't report the futex
> > release properly? If yes then I fail to see how that is memcg specific.
> > Could you try to clarify what you consider a bug again, please? I am not
> > really sure I understand this report.
> 
> It looks like it is just a very easy way to reproduce the problem that
> Konstantin described in that lkml thread. That patch was not accepted
> and I see no other fixes for that issue upstream. Here is a copy of
> his root-cause analysis from said thread:
> 
> The whole sequence looks like this: the task calls fork, glibc calls the
> clone syscall with CLONE_CHILD_SETTID and passes a pointer to TLS
> THREAD_SELF->tid as the argument. The child task gets a read-only copy of
> the VM, including TLS. The child calls put_user() from schedule_tail() to
> handle CLONE_CHILD_SETTID. put_user() triggers a page fault, which fails
> because do_wp_page() hits the memcg limit without invoking the OOM killer,
> since this is a page fault from kernel space. put_user() returns -EFAULT,
> which is ignored. The child returns to user space and trips over
> assert (THREAD_GETMEM (self, tid) != ppid). glibc tries to print something
> but deadlocks on its internal locks. Halt and catch fire.

OK, I see! Thanks for the clarification. So the bug is that the put_user
return value is ignored. Let's see whether Konstantin's patch will be
accepted or Oleg comes up with something else.
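
To make the sequence quoted above concrete, here is a minimal userspace
sketch of it. It is not kernel code: sim_put_user(), the two
sim_child_settid_*() helpers and glibc_tid_check_passes() are hypothetical
names made up for illustration, and the fault_fails flag simply stands in
for do_wp_page() hitting the memcg limit. It only shows why a dropped
-EFAULT leaves the parent's TID in the child's TLS slot; what the kernel
should actually do with the error is the open question in this thread.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for put_user(): store val at *dst, or return
 * -EFAULT when the simulated write fault cannot be resolved (here,
 * fault_fails models the copy-on-write fault hitting the memcg limit).
 */
static int sim_put_user(int val, int *dst, bool fault_fails)
{
	if (fault_fails)
		return -EFAULT;	/* *dst is left untouched */
	*dst = val;
	return 0;
}

/*
 * Simulated CLONE_CHILD_SETTID handling that drops the result, as in
 * the quoted analysis. *tls_tid starts out holding the parent's TID
 * because the child inherited a copy of the parent's TLS.
 */
static int sim_child_settid_ignoring(int child_tid, int *tls_tid,
				     bool fault_fails)
{
	(void)sim_put_user(child_tid, tls_tid, fault_fails);
	return 0;	/* child continues to user space regardless */
}

/* Same step, but the failure is propagated so a caller could react. */
static int sim_child_settid_checked(int child_tid, int *tls_tid,
				    bool fault_fails)
{
	return sim_put_user(child_tid, tls_tid, fault_fails);
}

/* Models glibc's assert (THREAD_GETMEM (self, tid) != ppid) in fork. */
static bool glibc_tid_check_passes(int tls_tid, int ppid)
{
	return tls_tid != ppid;
}
```

With ppid = 100 and the TLS slot initialised to 100, calling
sim_child_settid_ignoring(101, &tid, true) returns 0 but leaves tid at 100,
so glibc_tid_check_passes(tid, ppid) is false, which corresponds to the
assertion failure seen under strace. sim_child_settid_checked() returns
-EFAULT in the same situation.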
-- 
Michal Hocko
SUSE Labs
