Message-ID: <YGwlGrHtDJPQF7UG@dhcp22.suse.cz>
Date: Tue, 6 Apr 2021 11:08:42 +0200
From: Michal Hocko <mhocko@...e.com>
To: Tim Chen <tim.c.chen@...ux.intel.com>
Cc: Johannes Weiner <hannes@...xchg.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Dave Hansen <dave.hansen@...el.com>,
Ying Huang <ying.huang@...el.com>,
Dan Williams <dan.j.williams@...el.com>,
David Rientjes <rientjes@...gle.com>,
Shakeel Butt <shakeelb@...gle.com>, linux-mm@...ck.org,
cgroups@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v1 00/11] Manage the top tier memory in a tiered
memory
On Mon 05-04-21 10:08:24, Tim Chen wrote:
[...]
> To make fine grain cgroup based management of the precious top tier
> DRAM memory possible, this patchset adds a few new features:
> 1. Provides memory monitors on the amount of top tier memory used per cgroup
> and by the system as a whole.
> 2. Applies soft limits on the top tier memory each cgroup uses
> 3. Enables kswapd to demote top tier pages from cgroup with excess top
> tier memory usages.

Could you be more specific about how this interface is supposed to be used?

> This allows us to provision different amount of top tier memory to each
> cgroup according to the cgroup's latency need.
>
> The patchset is based on cgroup v1 interface. One shortcoming of the v1
> interface is the limit on the cgroup is a soft limit, so a cgroup can
> exceed the limit quite a bit before reclaim before page demotion reins
> it in.

I have to say that I dislike abusing soft limit reclaim for this. In the
past we have learned that the existing implementation is unfixable and
that changing the existing semantics is impossible due to backward
compatibility. So I would really prefer that the soft limit be laid to
rest rather than see new potential usecases for it.

I haven't really looked into the details of this patchset, but from a
cursory look it seems like you are actually introducing NUMA aware limits
into memcg that would control consumption from some nodes differently than
from other nodes. This would be a rather alien concept to the existing
memcg infrastructure IMO. It looks like it is blurring the borders between
the memcg and cpuset controllers.

You also seem to be basing the interface on a very specific usecase.
Can we expect that there will be many different tiers requiring their
own balancing?
--
Michal Hocko
SUSE Labs