Message-ID: <20210903015210.GF545073@minyard.net>
Date:   Thu, 2 Sep 2021 20:52:10 -0500
From:   Corey Minyard <minyard@....org>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     cminyard@...sta.com, Michal Hocko <mhocko@...e.com>,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] oom_kill: oom_score_adj broken for processes with small
 memory usage

On Thu, Sep 02, 2021 at 12:55:01PM -0700, Andrew Morton wrote:
> On Fri, 16 Jul 2021 07:25:47 -0500 Corey Minyard <cminyard@...sta.com> wrote:
> 
> > On Fri, Jul 16, 2021 at 07:19:24AM +0200, Michal Hocko wrote:
> > > On Thu 01-07-21 07:54:30, minyard@....org wrote:
> > > > From: Corey Minyard <cminyard@...sta.com>
> > > > 
> > > > If you have a process with less than 1000 totalpages, the calculation:
> > > > 
> > > >   adj = (long)p->signal->oom_score_adj;
> > > >   ...
> > > >   adj *= totalpages / 1000;
> > > > 
> > > > will always result in adj being zero no matter what oom_score_adj is,
> > > > which could result in the wrong process being picked for killing.
> > > > 
> > > > Fix by adding 1000 to totalpages before dividing.
> > > 
> > > Yes, this is a known limitation of the oom_score_adj and its scale.
> > > Is this a practical problem to be solved though? I mean 0-1000 pages is
> > > not really that much different from imprecision at a larger scale where
> > > tasks are effectively considered equal.
> > 
> > Known limitation?  Is this documented?  I couldn't find anything that
> > said "oom_score_adj doesn't work at all with programs with <1000 pages
> > besides setting the value to -1000".
> > 
> > > 
> > > I have to say I do not really like the proposed workaround. It doesn't
> > > really solve the problem yet it adds another special case.
> > 
> > The problem is that if you have a small program, there is no way to
> > set its priority besides completely disabling the OOM killer for
> > it.
> > 
> > I don't understand the special case comment.  How is this adding a
> > special case?  This patch removes a special case.  Small programs
> > working differently from big programs is a special case.  Making them all
> > work the same is removing an element of surprise from someone expecting
> > things to work as documented.
> > 
> 
> Can we please get this resolved one way or the other?

My goal in submitting this is to avoid someone having to go through what
I went through.  I know it now, so it's not going to affect me again.

We could document this, but to me it seems silly when something can just
be made consistent to avoid having to document it.  I got no response to
my questions above, so I don't know what to make of it.

Thanks Andrew,

-corey
