Message-ID: <YHkw8Ou2VAgHYTjl@dhcp22.suse.cz>
Date: Fri, 16 Apr 2021 08:38:40 +0200
From: Michal Hocko <mhocko@...e.com>
To: Tim Chen <tim.c.chen@...ux.intel.com>
Cc: Shakeel Butt <shakeelb@...gle.com>, Yang Shi <shy828301@...il.com>,
Johannes Weiner <hannes@...xchg.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Dave Hansen <dave.hansen@...el.com>,
Ying Huang <ying.huang@...el.com>,
Dan Williams <dan.j.williams@...el.com>,
David Rientjes <rientjes@...gle.com>,
Linux MM <linux-mm@...ck.org>,
Cgroups <cgroups@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH v1 00/11] Manage the top tier memory in a tiered
memory
On Thu 15-04-21 15:31:46, Tim Chen wrote:
>
>
> On 4/9/21 12:24 AM, Michal Hocko wrote:
> > On Thu 08-04-21 13:29:08, Shakeel Butt wrote:
> >> On Thu, Apr 8, 2021 at 11:01 AM Yang Shi <shy828301@...il.com> wrote:
> > [...]
> >>> The low priority jobs should be able to be restricted by cpuset, for
> >>> example, just keep them on second tier memory nodes. Then all the
> >>> above problems are gone.
> >
> > Yes, if the aim is to isolate some users from certain numa node then
> > cpuset is a good fit but as Shakeel says this is very likely not what
> > this work is aiming for.
> >
> >> Yes that's an extreme way to overcome the issue but we can do less
> >> extreme by just (hard) limiting the top tier usage of low priority
> >> jobs.
> >
> > Per numa node high/hard limit would help with a more fine grained control.
> > The configuration would be tricky though. All low priority memcgs would
> > have to be carefully configured to leave enough for your important
> > processes. That includes also memory which is not accounted to any
> > memcg.
> > The behavior of those limits would be quite tricky for OOM situations
> > as well due to a lack of NUMA aware oom killer.
> >
>
> Another downside of putting limits on individual NUMA
> node is it would limit flexibility.
Let me just clarify one thing. I haven't been proposing per-NUMA-node
limits. As I've said above, they would be quite tricky to configure and
their behavior would be tricky as well. All I am saying is that we do not
want an interface that is tightly bound to one specific HW setup (fast RAM
as a top tier and PMEM as a fallback), which is what you have proposed
here. We want a generic NUMA-based abstraction. What that abstraction is
going to look like is an open question, and it really depends on the use
cases that we expect to see.
--
Michal Hocko
SUSE Labs