Date:	Tue, 7 Oct 2014 14:23:36 +0200
From:	Michal Hocko <mhocko@...e.cz>
To:	Cong Wang <xiyou.wangcong@...il.com>
Cc:	Johannes Weiner <hannes@...xchg.org>,
	Greg KH <gregkh@...uxfoundation.org>,
	LKML <linux-kernel@...r.kernel.org>, stable@...r.kernel.org
Subject: Re: Please backport commit 3812c8c8f39 to stable

On Fri 03-10-14 11:03:30, Cong Wang wrote:
> On Fri, Oct 3, 2014 at 8:13 AM, Michal Hocko <mhocko@...e.cz> wrote:
> >
> > That commit fixes an OOM deadlock, not a soft lockup. Do you have the
> > OOM killer report from the log? This would tell us that the killed task
> > was indeed sleeping on the lock which is held by the charger which
> > triggered the OOM. I am a little bit surprised that I do not see any OOM
> > related functions on the stacks (maybe the code is inlined...).
> 
> 
> Oh, did you see that __mem_cgroup_try_charge() calls
> schedule_timeout_uninterruptible() in the stack trace? Yes, they are
> inlined and I don't see any other possibilities for calling it.

Yes, the only place we call schedule_timeout_uninterruptible from is
mem_cgroup_handle_oom. And it happens only for a task which hasn't been
killed by the OOM killer.

> > It would be better to know what exactly is going on before backporting
> > this change because it is quite large.
> >
> 
> I thought the stack trace I showed was obvious. :) I am very happy
> to investigate if you see any other path calling
> schedule_timeout_uninterruptible() in __mem_cgroup_try_charge().

I was expecting an OOM report which kills a task which is sleeping on a
lock which is held on the way up to the charge function. Your report
mentioned a task waiting for i_mutex for too long. It is true that the
charging path is holding an i_mutex as well, so it might be the same
situation handled by the said patch. But it is not 100% clear this is
the case without an OOM report which would point to the waiting task.
The memcg might be thrashing on the hard limit and reclaim might take a
long time.
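
To make the distinction concrete, here is a toy wait-for model of the
two situations. It is only a sketch and not kernel code: the task names
("charger", "victim", "waiter") and the "victim_exit" and "reclaim"
resources are made up for illustration, and the function is a
hypothetical helper. It shows why the OOM report matters: only when the
killed task is the one sleeping on the lock do we get a cycle.

```python
# Toy wait-for model of the suspected deadlock; not kernel code.
# Task names and resource names are illustrative assumptions.

def find_wait_cycle(waits_on, held_by, start):
    """Follow task -> awaited resource -> holder until a task repeats.

    Returns the list of tasks forming the cycle, or None if the chain ends.
    """
    path = []
    task = start
    while task is not None and task not in path:
        path.append(task)
        resource = waits_on.get(task)   # what is this task blocked on?
        task = held_by.get(resource)    # who has to act to release it?
    if task is None:
        return None
    return path[path.index(task):]


# The charger holds i_mutex and sleeps in the OOM path until the victim
# exits. The victim cannot exit because it sleeps on that same i_mutex.
held_by = {"i_mutex": "charger", "victim_exit": "victim"}
waits_on = {"charger": "victim_exit", "victim": "i_mutex"}
deadlock = find_wait_cycle(waits_on, held_by, "charger")

# Thrashing on the hard limit: the charger waits on reclaim, which is
# slow but depends on nobody in this model, so there is no cycle.
slow_held_by = {"i_mutex": "charger"}
slow_waits_on = {"charger": "reclaim", "waiter": "i_mutex"}
no_deadlock = find_wait_cycle(slow_waits_on, slow_held_by, "waiter")
```

In the first case the chain closes on itself (charger, victim), which is
the situation the patch is meant to handle. In the second case the
i_mutex waiter is merely stuck behind a slow charger, and a long wait on
i_mutex alone cannot tell the two apart.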

-- 
Michal Hocko
SUSE Labs