linux-kernel - Re: [PATCH] oom_kill: oom_score_adj broken for processes with small memory usage

Message-ID: <YTHTnvSh7WRNHG/n@dhcp22.suse.cz>
Date:   Fri, 3 Sep 2021 09:49:50 +0200
From:   Michal Hocko <mhocko@...e.com>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     cminyard@...sta.com, minyard@....org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] oom_kill: oom_score_adj broken for processes with small
 memory usage

On Thu 02-09-21 12:55:01, Andrew Morton wrote:
> On Fri, 16 Jul 2021 07:25:47 -0500 Corey Minyard <cminyard@...sta.com> wrote:
> 
> > On Fri, Jul 16, 2021 at 07:19:24AM +0200, Michal Hocko wrote:
> > > On Thu 01-07-21 07:54:30, minyard@....org wrote:
> > > > From: Corey Minyard <cminyard@...sta.com>
> > > > 
> > > > If you have a process with less than 1000 totalpages, the calculation:
> > > > 
> > > >   adj = (long)p->signal->oom_score_adj;
> > > >   ...
> > > >   adj *= totalpages / 1000;
> > > > 
> > > > will always result in adj being zero no matter what oom_score_adj is,
> > > > which could result in the wrong process being picked for killing.
> > > > 
> > > > Fix by adding 1000 to totalpages before dividing.
> > > 
> > > Yes, this is a known limitation of the oom_score_adj and its scale.
> > > Is this a practical problem to be solved though? I mean 0-1000 pages is
> > > not really that much different from imprecision at a larger scale where
> > > tasks are effectively considered equal.
> > 
> > Known limitation?  Is this documented?  I couldn't find anything that
> > said "oom_score_adj doesn't work at all with programs with <1000 pages
> > besides setting the value to -1000".
> > 
> > > 
> > > I have to say I do not really like the proposed workaround. It doesn't
> > > really solve the problem yet it adds another special case.
> > 
> > The problem is that if you have a small program, there is no way to
> > set its priority besides completely disabling the OOM killer for
> > it.
> > 
> > I don't understand the special case comment.  How is this adding a
> > special case?  This patch removes a special case.  Small programs
> > working differently than big programs is a special case.  Making them all
> > work the same is removing an element of surprise from someone expecting
> > things to work as documented.
> > 
> 
> Can we please get this resolved one way or the other?

As I've already said, I do not see this as a practical enough problem to
warrant special treatment. Do we really care about controlling the oom
behavior for tasks with <4MB of memory?
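
For readers following the arithmetic, here is a minimal sketch of the
scaling step quoted above and of where the "<4MB" figure comes from. It is
not the kernel source: the helper names (scaled_adj, scaled_adj_plus_1000,
pages_to_bytes) are hypothetical, the second helper is only one reading of
"adding 1000 to totalpages before dividing", and a 4 KiB page size is
assumed.

```c
#include <stddef.h>

/*
 * Scaling step as quoted in the patch description:
 *   adj *= totalpages / 1000;
 * The division is an integer division, so any totalpages below 1000
 * gives a factor of 0 and the adjustment vanishes.
 */
static long scaled_adj(long oom_score_adj, unsigned long totalpages)
{
    long adj = oom_score_adj;

    adj *= (long)(totalpages / 1000);
    return adj;
}

/*
 * One possible reading of the proposed fix (an assumption, not the
 * actual patch): add 1000 before dividing so the factor is at least 1.
 */
static long scaled_adj_plus_1000(long oom_score_adj, unsigned long totalpages)
{
    long adj = oom_score_adj;

    adj *= (long)((totalpages + 1000) / 1000);
    return adj;
}

/* Convert a page count to bytes for a given page size (hypothetical helper). */
static unsigned long pages_to_bytes(unsigned long pages, unsigned long page_size)
{
    return pages * page_size;
}
```

With these helpers, scaled_adj(500, 999) is 0 while scaled_adj(500, 1000)
is 500, and scaled_adj_plus_1000(500, 999) is 500. With the assumed 4 KiB
pages, pages_to_bytes(1000, 4096) is 4096000 bytes, which is just under
4 MiB.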

I fully agree that the current situation is not ideal. The whole
oom_score* API sucks but here we are with a user API that is
effectively impossible to fix properly.

-- 
Michal Hocko
SUSE Labs