Message-ID: <20161129162515.GD9796@dhcp22.suse.cz>
Date:   Tue, 29 Nov 2016 17:25:15 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Stable tree <stable@...r.kernel.org>
Cc:     Vlastimil Babka <vbabka@...e.cz>, Marc MERLIN <marc@...lins.org>,
        linux-mm@...ck.org, Linus Torvalds <torvalds@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Joonsoo Kim <iamjoonsoo.kim@....com>, Tejun Heo <tj@...nel.org>
Subject: Re: 4.8.8 kernel trigger OOM killer repeatedly when I have lots of
 RAM that should be free

On Tue 22-11-16 17:38:01, Greg KH wrote:
> On Tue, Nov 22, 2016 at 05:14:02PM +0100, Vlastimil Babka wrote:
> > On 11/22/2016 05:06 PM, Marc MERLIN wrote:
> > > On Mon, Nov 21, 2016 at 01:56:39PM -0800, Marc MERLIN wrote:
> > >> On Mon, Nov 21, 2016 at 10:50:20PM +0100, Vlastimil Babka wrote:
> > >>>> 4.9rc5 however seems to be doing better, and is still running after 18
> > >>>> hours. However, I got a few page allocation failures as per below, but the
> > >>>> system seems to recover.
> > >>>> Vlastimil, do you want me to continue the copy on 4.9 (may take 3-5 days) 
> > >>>> or is that good enough, and i should go back to 4.8.8 with that patch applied?
> > >>>> https://marc.info/?l=linux-mm&m=147423605024993
> > >>>
> > >>> Hi, I think it's enough for 4.9 for now and I would appreciate trying
> > >>> 4.8 with that patch, yeah.
> > >>
> > >> So the good news is that it's been running for almost 5H and so far so good.
> > > 
> > > And the better news is that the copy is still going strong, 4.4TB and
> > > going. So 4.8.8 is fixed with that one single patch as far as I'm
> > > concerned.
> > > 
> > > So thanks for that, looks good to me to merge.
> > 
> > Thanks a lot for the testing. So what do we do now about 4.8? (4.7 is
> > already EOL AFAICS).
> > 
> > - send the patch [1] as 4.8-only stable. Greg won't like that, I expect.
> >   - alternatively a simpler (again, 4.8-only) patch that just outright
> > prevents OOM for 0 < order < costly, as Michal already suggested.
> > - backport 10+ compaction patches to 4.8 stable
> > - something else?
> 
> Just wait for 4.8-stable to go end-of-life in a few weeks after 4.9 is
> released?  :)

OK, so can we push this through to 4.8 before EOL and make sure there
won't be any additional premature high-order OOM reports? The patch
should be simple enough and safe for the stable tree. There is no
upstream commit because 4.9 is fixed in a different way, which would be
far too intrusive for a stable backport.
--- 
From 02306e8d593fa8a48d620e0c9d63a934ca8366d8 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@...e.com>
Date: Wed, 23 Nov 2016 07:26:30 +0100
Subject: [PATCH] mm, oom: stop pre-mature high-order OOM killer invocations

31e49bfda184 ("mm, oom: protect !costly allocations some more for
!CONFIG_COMPACTION") was an attempt to reduce the chances of premature
OOM killer invocation for high-order requests. It seemed to work just
fine for most users, but it is far from bulletproof and obviously not
sufficient for Marc, who has reported premature OOM killer invocations
with 4.8-based kernels. 4.9 with all the compaction improvements seems
to behave much better, but those would be too intrusive to backport
to 4.8 stable kernels. Instead, this patch simply never declares OOM for
!costly high-order requests. We rely on order-0 requests to do that in
case we are really out of memory. Order-0 requests are much more common,
so a livelock without any way forward is highly unlikely.

Reported-by: Marc MERLIN <marc@...lins.org>
Tested-by: Marc MERLIN <marc@...lins.org>
Signed-off-by: Michal Hocko <mhocko@...e.com>
---
 mm/page_alloc.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a2214c64ed3c..7401e996009a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3161,6 +3161,16 @@ should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_fla
 	if (!order || order > PAGE_ALLOC_COSTLY_ORDER)
 		return false;
 
+#ifdef CONFIG_COMPACTION
+	/*
+	 * This is a gross workaround to compensate for the lack of reliable compaction
+	 * operation. We cannot simply go OOM with the current state of the compaction
+	 * code because this can lead to premature OOM declaration.
+	 */
+	if (order <= PAGE_ALLOC_COSTLY_ORDER)
+		return true;
+#endif
+
 	/*
 	 * There are setups with compaction disabled which would prefer to loop
 	 * inside the allocator rather than hit the oom killer prematurely.
-- 
2.10.2
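
For readers who want to see the resulting decision in isolation, here is a
minimal userspace sketch of what the retry check amounts to once the patch
above is applied. It is not the kernel code: sketch_should_retry(),
SKETCH_COSTLY_ORDER, compaction_built_in and zones_look_ok are hypothetical
names made up for illustration. compaction_built_in stands in for
CONFIG_COMPACTION, zones_look_ok stands in for the existing watermark loop
that the patch leaves untouched, and the costly order is assumed to be 3.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value of PAGE_ALLOC_COSTLY_ORDER for this sketch. */
#define SKETCH_COSTLY_ORDER 3

/*
 * Hypothetical model of the retry decision after the patch.
 * Returning true means "keep retrying, do not declare OOM here".
 */
bool sketch_should_retry(unsigned int order, bool compaction_built_in,
			 bool zones_look_ok)
{
	/* Order-0 and costly orders are never retried on this path. */
	if (!order || order > SKETCH_COSTLY_ORDER)
		return false;

	/* The workaround: with compaction built in, always retry. */
	if (compaction_built_in)
		return true;

	/* Without compaction, defer to the pre-existing zone check. */
	return zones_look_ok;
}

/* Small usage example covering each branch. */
void sketch_self_test(void)
{
	assert(!sketch_should_retry(0, true, true));  /* order-0 */
	assert(!sketch_should_retry(4, true, true));  /* costly order */
	assert(sketch_should_retry(2, true, false));  /* workaround path */
	assert(!sketch_should_retry(2, false, false));
	assert(sketch_should_retry(2, false, true));
}
```

In this model, an order 1..3 request with compaction built in never reaches
the OOM decision through this path, which matches the intent stated in the
changelog: only order-0 requests are left to declare OOM.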

-- 
Michal Hocko
SUSE Labs
