

Message-ID: <CAKWKT+ZRMHzgCLJ1quGnw-_T1b9OboYKnQdRc2_Z=rdU_PFVtw@mail.gmail.com>
Date:	Tue, 23 Oct 2012 11:35:52 +0800
From:	Qiang Gao <gaoqiangscut@...il.com>
To:	Michal Hocko <mhocko@...e.cz>
Cc:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mmc@...r.kernel.org" <linux-mmc@...r.kernel.org>,
	"cgroups@...r.kernel.org" <cgroups@...r.kernel.org>,
	linux-mm@...ck.org, bsingharora@...il.com
Subject: Re: process hangs on do_exit when oom happens

Information about the system is in the attached file "information.txt".

I cannot reproduce it on the upstream 3.6.0 kernel.

On Sat, Oct 20, 2012 at 12:04 AM, Michal Hocko <mhocko@...e.cz> wrote:
> On Wed 17-10-12 18:23:34, gaoqiang wrote:
>> I found nothing useful with Google, so I'm here for help.
>>
>> When this happens: I use memcg to limit the memory use of a
>> process, and when the memcg cgroup ran out of memory,
>> the process was oom-killed. However, it cannot really complete
>> exiting. Here is some information:
>
> How many tasks are in the group and what kind of memory do they use?
> Is it possible that you were hit by the same issue as described in
> 79dfdacc ("memcg: make oom_lock 0 and 1 based rather than counter")?
>
>> OS version: CentOS 6.2, kernel 2.6.32.220.7.1
>
> Your kernel is quite old and you should probably be asking your
> distribution to help you out. There have been many fixes since 2.6.32.
> Are you able to reproduce the same issue with the current vanilla kernel?
>
>> /proc/pid/stack
>> ---------------------------------------------------------------
>>
>> [<ffffffff810597ca>] __cond_resched+0x2a/0x40
>> [<ffffffff81121569>] unmap_vmas+0xb49/0xb70
>> [<ffffffff8112822e>] exit_mmap+0x7e/0x140
>> [<ffffffff8105b078>] mmput+0x58/0x110
>> [<ffffffff81061aad>] exit_mm+0x11d/0x160
>> [<ffffffff81061c9d>] do_exit+0x1ad/0x860
>> [<ffffffff81062391>] do_group_exit+0x41/0xb0
>> [<ffffffff81077cd8>] get_signal_to_deliver+0x1e8/0x430
>> [<ffffffff8100a4c4>] do_notify_resume+0xf4/0x8b0
>> [<ffffffff8100b281>] int_signal+0x12/0x17
>> [<ffffffffffffffff>] 0xffffffffffffffff
>
> This looks strange because this is just the exit path, which shouldn't
> deadlock or anything. Is this stack stable? Have you tried checking
> it several times?
>
> --
> Michal Hocko
> SUSE Labs
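
The question above about how many tasks are in the group and what kind of memory they use could be answered with something like the following. This is only a rough sketch, assuming a cgroup v1 memory controller, where each group directory contains a `tasks` file and a `memory.stat` file. The function names and the mount path in the usage note are illustrative and not taken from this thread.

```python
from pathlib import Path


def read_tasks(cgroup_dir):
    """Return the task IDs listed in the group's cgroup v1 `tasks` file."""
    text = (Path(cgroup_dir) / "tasks").read_text()
    return [int(tid) for tid in text.split()]


def read_memory_stat(cgroup_dir):
    """Parse `memory.stat` ("name value" per line) into a dict of ints."""
    stats = {}
    for line in (Path(cgroup_dir) / "memory.stat").read_text().splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def summarize(cgroup_dir):
    """Task count plus anonymous (rss) and page-cache (cache) bytes."""
    stats = read_memory_stat(cgroup_dir)
    return {
        "tasks": len(read_tasks(cgroup_dir)),
        "rss": stats.get("rss", 0),
        "cache": stats.get("cache", 0),
    }
```

For example, `summarize("/cgroup/memory/mygroup")`, where the path is a placeholder: the memory controller mount point differs between systems and the group name here is made up.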

View attachment "information.txt" of type "text/plain" (118561 bytes)
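
To check whether the stack is stable, as asked in the quoted reply, /proc/<pid>/stack can be sampled several times and the samples compared. The sketch below is one way to do that and is not something posted in this thread. The helper names, sample count and interval are arbitrary choices, and reading /proc/<pid>/stack normally requires root.

```python
import re
import sys
import time
from pathlib import Path

# Matches lines such as "[<ffffffff810597ca>] __cond_resched+0x2a/0x40".
FRAME_RE = re.compile(r"^\[<[0-9a-fA-F]+>\]\s+(\S+)")


def parse_frames(text):
    """Return the symbol names, with offsets stripped, from stack text."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(match.group(1).split("+")[0])
    return frames


def sample_stack(path, samples=5, interval=0.5):
    """Read the stack file `samples` times, sleeping `interval` seconds between reads."""
    seen = []
    for i in range(samples):
        seen.append(tuple(parse_frames(Path(path).read_text())))
        if i + 1 < samples:
            time.sleep(interval)
    return seen


def is_stable(samples):
    """True if every sample shows the same sequence of symbols."""
    return len(set(samples)) == 1


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage: python stack_check.py <pid>   (script name is illustrative)
    result = sample_stack(f"/proc/{sys.argv[1]}/stack")
    print("stable" if is_stable(result) else "changing")
    for frame in result[-1]:
        print("  ", frame)
```

Offsets are stripped before comparing, so a task looping inside the same function still counts as stable. A trace that differs between samples suggests the task is making progress rather than being stuck at one point.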