[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200709095840.GE19160@dhcp22.suse.cz>
Date: Thu, 9 Jul 2020 11:58:40 +0200
From: Michal Hocko <mhocko@...nel.org>
To: Yafang Shao <laoar.shao@...il.com>
Cc: Jonathan Corbet <corbet@....net>,
Andrew Morton <akpm@...ux-foundation.org>,
David Rientjes <rientjes@...gle.com>,
Linux MM <linux-mm@...ck.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/2] doc, mm: clarify /proc/<pid>/oom_score value range
On Thu 09-07-20 17:01:06, Yafang Shao wrote:
> On Thu, Jul 9, 2020 at 4:18 PM Michal Hocko <mhocko@...nel.org> wrote:
> >
> > On Thu 09-07-20 15:41:11, Yafang Shao wrote:
> > > On Thu, Jul 9, 2020 at 2:26 PM Michal Hocko <mhocko@...nel.org> wrote:
> > > >
> > > > From: Michal Hocko <mhocko@...e.com>
> > > >
> > > > The exported value includes oom_score_adj so the range is no [0, 1000]
> > > > as described in the previous section but rather [0, 2000]. Mention that
> > > > fact explicitly.
> > > >
> > > > Signed-off-by: Michal Hocko <mhocko@...e.com>
> > > > ---
> > > > Documentation/filesystems/proc.rst | 3 +++
> > > > 1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> > > > index 8e3b5dffcfa8..78a0dec323a3 100644
> > > > --- a/Documentation/filesystems/proc.rst
> > > > +++ b/Documentation/filesystems/proc.rst
> > > > @@ -1673,6 +1673,9 @@ requires CAP_SYS_RESOURCE.
> > > > 3.2 /proc/<pid>/oom_score - Display current oom-killer score
> > > > -------------------------------------------------------------
> > > >
> > > > +Please note that the exported value includes oom_score_adj so it is effectively
> > > > +in range [0,2000].
> > > > +
> > >
> > > [0, 2000] may be not a proper range, see my reply in another thread.[1]
> > > As this value hasn't been documented before and nobody notices that, I
> > > think there might be no user really care about it before.
> > > So we should discuss the proper range if we really think the user will
> > > care about this value.
> >
> > Even if we decide the range should change, I do not really assume this
> > will happen, it is good to have the existing behavior clarified.
> >
>
> But the existing behavior is not defined in the kernel documentation
> before, so I don't think that the user has a clear understanding of
> the existing behavior.
Well, documentation is by no means authoritative, especially when it is
outdated or incomplete. What really matters is the observed behavior and
a lot of userspace depends on that or based on the specific
implementation.
> The way to use the result of proc_oom_score is to compare which
> processes will be killed first by the OOM killer, IOW, the user should
> always use it to compare different processes. For example,
>
> if proc_oom_score(process_a) > proc_oom_score(process_b)
> then
> process_a will be killed before process_b
> fi
>
> And then the user will "Use it together with
> /proc/<pid>/oom_score_adj to tune which
> process should be killed in an out-of-memory situation."
>
> That means what the user really cares about is the relative value, and
> they will not care about the range or the absolute value.
In an ideal world yes. But the real life tells a different story. Many
times userspace (ab)uses certain undocumented/unintended (mis)features
and the hard rule is that we never break userspace. We've learned that
through many painful historical experiences. Especially vaguely defined
functionality suffers from the problem.
--
Michal Hocko
SUSE Labs
Powered by blists - more mailing lists