Message-ID: <20161122162544.GG6831@dhcp22.suse.cz>
Date:   Tue, 22 Nov 2016 17:25:44 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Vlastimil Babka <vbabka@...e.cz>
Cc:     Marc MERLIN <marc@...lins.org>, linux-mm@...ck.org,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        Tejun Heo <tj@...nel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: 4.8.8 kernel trigger OOM killer repeatedly when I have lots of
 RAM that should be free

On Tue 22-11-16 17:14:02, Vlastimil Babka wrote:
> On 11/22/2016 05:06 PM, Marc MERLIN wrote:
> > On Mon, Nov 21, 2016 at 01:56:39PM -0800, Marc MERLIN wrote:
> >> On Mon, Nov 21, 2016 at 10:50:20PM +0100, Vlastimil Babka wrote:
> >>>> 4.9rc5 however seems to be doing better, and is still running after 18
> >>>> hours. However, I got a few page allocation failures as per below, but the
> >>>> system seems to recover.
> >>>> Vlastimil, do you want me to continue the copy on 4.9 (may take 3-5 days) 
> >>>> or is that good enough, and I should go back to 4.8.8 with that patch applied?
> >>>> https://marc.info/?l=linux-mm&m=147423605024993
> >>>
> >>> Hi, I think it's enough for 4.9 for now and I would appreciate trying
> >>> 4.8 with that patch, yeah.
> >>
> >> So the good news is that it's been running for almost 5H and so far so good.
> > 
> > And the better news is that the copy is still going strong, 4.4TB and
> > going. So 4.8.8 is fixed with that one single patch as far as I'm
> > concerned.
> > 
> > So thanks for that, looks good to me to merge.
> 
> Thanks a lot for the testing. So what do we do now about 4.8? (4.7 is
> already EOL AFAICS).
> 
> - send the patch [1] as 4.8-only stable. Greg won't like that, I expect.
>   - alternatively a simpler (again, 4.8-only) patch that just outright
> prevents OOM for 0 < order < costly, as Michal already suggested.
> - backport 10+ compaction patches to 4.8 stable
> - something else?
> 
> Michal? Linus?

Dunno. To be honest, I do not like [1] because it seriously tweaks the
retry logic. Backporting 10+ compaction patches to 4.8 seems too much
for a stable tree and quite risky as well. Considering that 4.9 works so
much better, is there any strong reason to do a 4.8-specific fix at all?
Most users reporting OOM regressions seemed to be OK with what 4.8 does
currently, AFAIR. I hate that Marc does not fall into that category, but
is it really a problem for you to run with 4.9? If we have more users
seeing this regression, then I would rather go with a simpler 4.8-only
"never trigger OOM for order > 0 && order < costly" patch, because that
would at least have deterministic behavior.
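
To make the rule concrete, here is a rough userspace sketch of the
predicate only, not a patch and not tested against any tree. It assumes
"costly" means order > PAGE_ALLOC_COSTLY_ORDER (3 in mainline), so the
affected range is orders 1 through 3. The names COSTLY_ORDER and
rule_suppresses_oom() are made up for illustration and do not exist in
the kernel.

```c
#include <stdbool.h>

/* Assumption: mirrors PAGE_ALLOC_COSTLY_ORDER, which is 3 in mainline. */
#define COSTLY_ORDER 3

/*
 * Hypothetical helper, not a kernel function.
 * Returns true when the proposed 4.8-only rule would refuse to invoke
 * the OOM killer for a failed allocation of the given order, i.e. for
 * any non-zero order that is not costly.
 * Orders outside that range are simply not covered by the rule.
 */
static bool rule_suppresses_oom(unsigned int order)
{
	return order > 0 && order <= COSTLY_ORDER;
}
```

With this, an order-0 failure is untouched by the rule, while order 1-3
failures would never reach the OOM killer regardless of how the retry
logic went, which is the deterministic part.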

> 
> [1] https://marc.info/?l=linux-mm&m=147423605024993

-- 
Michal Hocko
SUSE Labs
