Message-ID: <CAKWKT+ZRTUwer8qhjWGjkra63e10R67UQzezdaCaStz+rvGjxw@mail.gmail.com>
Date:	Fri, 26 Oct 2012 10:42:37 +0800
From:	Qiang Gao <gaoqiangscut@...il.com>
To:	Michal Hocko <mhocko@...e.cz>
Cc:	Balbir Singh <bsingharora@...il.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mmc@...r.kernel.org" <linux-mmc@...r.kernel.org>,
	"cgroups@...r.kernel.org" <cgroups@...r.kernel.org>,
	linux-mm@...ck.org
Subject: Re: process hangs on do_exit when oom happens

On Thu, Oct 25, 2012 at 5:57 PM, Michal Hocko <mhocko@...e.cz> wrote:
> On Wed 24-10-12 11:44:17, Qiang Gao wrote:
>> On Wed, Oct 24, 2012 at 1:43 AM, Balbir Singh <bsingharora@...il.com> wrote:
>> > On Tue, Oct 23, 2012 at 3:45 PM, Michal Hocko <mhocko@...e.cz> wrote:
>> >> On Tue 23-10-12 18:10:33, Qiang Gao wrote:
>> >>> On Tue, Oct 23, 2012 at 5:50 PM, Michal Hocko <mhocko@...e.cz> wrote:
>> >>> > On Tue 23-10-12 15:18:48, Qiang Gao wrote:
>> >>> >> This process was moved to the RT-priority queue when the global
>> >>> >> oom-killer fired, to speed up the recovery of the system,
>> >>> >
>> >>> > Who did that? oom killer doesn't boost the priority (scheduling class)
>> >>> > AFAIK.
>> >>> >
>> >>> >> but it wasn't properly dealt with. I still have no idea where
>> >>> >> the problem is.
>> >>> >
>> >>> > Well your configuration says that there is no runtime reserved for the
>> >>> > group.
>> >>> > Please refer to Documentation/scheduler/sched-rt-group.txt for more
>> >>> > information.
>> >>> >
>> >> [...]
>> >>> Maybe this is not an upstream-kernel bug. The CentOS/Red Hat kernel
>> >>> boosts the process to RT prio when the process is selected
>> >>> by the oom-killer.
>> >>
>> >> This still looks like your cpu controller is misconfigured. Even if the
>> >> task is promoted to be realtime.
>> >
>> >
>> > Precisely! You need to have rt bandwidth enabled for RT tasks to run,
>> > as a workaround please give the groups some RT bandwidth and then work
>> > out the migration to RT and what should be the defaults on the distro.
>> >
>> > Balbir
>>
>>
>> see https://patchwork.kernel.org/patch/719411/
>
> The patch surely "fixes" your problem but the primary fault here is the
> mis-configured cpu cgroup. If the value for the bandwidth is zero by
> default then all realtime processes in the group are screwed. The value
> should be set to something more reasonable.
> I am not familiar with the cpu controller but it seems that
> alloc_rt_sched_group needs some attention. Care to look into it and send a
> patch to the cpu controller and cgroup maintainers, please?
>
> --
> Michal Hocko
> SUSE Labs

I'm trying to fix the problem, but have made no substantive progress yet.
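
For reference, the zero-bandwidth condition discussed above can be checked
with a rough sketch like the one below. It assumes a cgroup v1 cpu controller
built with CONFIG_RT_GROUP_SCHED, which exposes cpu.rt_runtime_us and
cpu.rt_period_us in each group's directory. The function names and the
example path in the usage note are hypothetical, not taken from this thread.

```python
import os


def rt_bandwidth(cgroup_dir):
    """Read (runtime_us, period_us) from a cpu cgroup directory.

    Assumes the cgroup v1 files cpu.rt_runtime_us and cpu.rt_period_us
    are present in cgroup_dir.
    """
    def read_int(name):
        with open(os.path.join(cgroup_dir, name)) as f:
            return int(f.read().strip())

    return read_int("cpu.rt_runtime_us"), read_int("cpu.rt_period_us")


def rt_tasks_can_run(cgroup_dir):
    """Return False when the group has no RT runtime reserved.

    A runtime of 0 leaves realtime tasks in the group with no CPU time;
    a negative runtime is treated here as "no limit" (an assumption
    based on the kernel's handling of -1).
    """
    runtime_us, _period_us = rt_bandwidth(cgroup_dir)
    return runtime_us != 0
```

Calling rt_tasks_can_run("/cgroup/cpu/mygroup") (hypothetical mount point and
group name) on a group left at the default would be expected to return False,
which matches the hang described in this thread: a task promoted to RT inside
such a group never gets scheduled.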
