Message-Id: <20151217131356.83d920b7c250a785aa132139@linux-foundation.org>
Date:	Thu, 17 Dec 2015 13:13:56 -0800
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Michal Hocko <mhocko@...nel.org>
Cc:	Mel Gorman <mgorman@...e.de>,
	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
	David Rientjes <rientjes@...gle.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Oleg Nesterov <oleg@...hat.com>,
	Hugh Dickins <hughd@...gle.com>,
	Andrea Argangeli <andrea@...nel.org>,
	Rik van Riel <riel@...hat.com>, linux-mm@...ck.org,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/2] mm, oom: introduce oom reaper

On Thu, 17 Dec 2015 14:02:24 +0100 Michal Hocko <mhocko@...nel.org> wrote:

> > I guess it means that the __oom_reap_vmas() success rate is nice and
> > high ;)
> 
> I had debugging trace_printks around this and there were no retries
> during my testing so I was probably lucky to not trigger the mmap_sem
> contention.
> ---
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 48025a21f8c4..f53f87cfd899 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -469,7 +469,7 @@ static void oom_reap_vmas(struct mm_struct *mm)
>  	int attempts = 0;
>  
>  	while (attempts++ < 10 && !__oom_reap_vmas(mm))
> -		schedule_timeout(HZ/10);
> +		msleep_interruptible(100);
>  
>  	/* Drop a reference taken by wake_oom_reaper */
>  	mmdrop(mm);

Timeliness matters here.  Over on the other CPU, direct reclaim is
pounding along, on its way to declaring oom.  Sometimes the oom_reaper
thread will end up scavenging memory on behalf of a caller who gave up
a long time ago.  But we shouldn't attempt to "fix" that unless we can
demonstrate that it's a problem.


Also, re-reading your description:

: It has been shown (e.g.  by Tetsuo Handa) that it is not that hard to
: construct workloads which break the core assumption mentioned above and
: the OOM victim might take an unbounded amount of time to exit because it
: might be blocked in the uninterruptible state waiting on an event
: (e.g.  lock) which is blocked by another task looping in the page
: allocator.

So the allocating task has done an oom-kill and is waiting for memory
to become available.  The killed task is stuck on some lock, unable to
free memory.

But the problematic lock will sometimes be the killed task's mmap_sem,
so the reaper won't reap anything.  This scenario requires that the
mmap_sem is held for writing, which sounds like it will be uncommon. 
hm.  sigh.  I hate the oom-killer.  Just buy some more memory already!
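
To make that scenario concrete, here is a rough userspace sketch of the
bounded retry loop from the diff above.  It is an analogy only, not the
kernel code: struct fake_mm, try_reap() and reap_with_retries() are
made-up names, a pthread rwlock stands in for mmap_sem, and nanosleep()
stands in for the 100ms sleep.  It assumes the reaper only ever tries
to take the lock for reading, so a writer that never lets go makes
every attempt fail.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

#define MAX_REAP_ATTEMPTS 10	/* same bound as the loop in the diff */

/* Hypothetical stand-in for struct mm_struct. */
struct fake_mm {
	pthread_rwlock_t mmap_lock;	/* plays the role of mmap_sem */
	int reaped;			/* set once "memory" was reclaimed */
};

/* One reap attempt: give up at once if a writer holds the lock. */
static bool try_reap(struct fake_mm *mm)
{
	if (pthread_rwlock_tryrdlock(&mm->mmap_lock) != 0)
		return false;

	mm->reaped = 1;		/* pretend the address space was torn down */
	pthread_rwlock_unlock(&mm->mmap_lock);
	return true;
}

/*
 * Retry up to MAX_REAP_ATTEMPTS times, sleeping delay_ms between tries.
 * Returns the number of attempts used on success, or -1 if every
 * attempt found the lock held for writing.
 */
static int reap_with_retries(struct fake_mm *mm, long delay_ms)
{
	struct timespec delay = {
		.tv_sec = delay_ms / 1000,
		.tv_nsec = (delay_ms % 1000) * 1000000L,
	};
	int attempts = 0;

	while (attempts++ < MAX_REAP_ATTEMPTS && !try_reap(mm))
		nanosleep(&delay, NULL);

	return mm->reaped ? attempts : -1;
}
```

With the lock uncontended the first attempt succeeds.  With a writer
that holds the lock throughout, all ten attempts fail and nothing is
reaped, which is the case described above.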


