Message-ID: <87bko7h13p.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Tue, 13 Dec 2022 21:42:02 +0800
From: "Huang, Ying" <ying.huang@...el.com>
To: Michal Hocko <mhocko@...e.com>,
Mina Almasry <almasrymina@...gle.com>, weixugc@...gle.com
Cc: Tejun Heo <tj@...nel.org>, Zefan Li <lizefan.x@...edance.com>,
Johannes Weiner <hannes@...xchg.org>,
Jonathan Corbet <corbet@....net>,
Roman Gushchin <roman.gushchin@...ux.dev>,
Shakeel Butt <shakeelb@...gle.com>,
Muchun Song <songmuchun@...edance.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Yang Shi <yang.shi@...ux.alibaba.com>,
Yosry Ahmed <yosryahmed@...gle.com>, fvdl@...gle.com,
bagasdotme@...il.com, cgroups@...r.kernel.org,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: [PATCH v3] mm: Add nodes= arg to memory.reclaim
Michal Hocko <mhocko@...e.com> writes:
> On Tue 13-12-22 14:30:57, Huang, Ying wrote:
>> Mina Almasry <almasrymina@...gle.com> writes:
> [...]
>> After this discussion, I think the solution may be to use different
>> interfaces for "proactive demote" and "proactive reclaim". That is,
>> reconsider "memory.demote". In this way, we will always uncharge the
>> cgroup for "memory.reclaim". This avoids the possible confusion there.
>> And, because demotion is considered aging, we don't need to disable
>> demotion for "memory.reclaim", just don't count it.
>
> As already pointed out in my previous email, we should really think more
> about future requirements. Do we add a memory.promote interface when there
> is a request to implement NUMA balancing in userspace? Maybe yes,
> but maybe the node balancing should be more generic than bound to memory
> tiering and apply to a finer-grained nodemask control.
>
> Fundamentally we already have APIs to age (MADV_COLD, MADV_FREE),
> reclaim (MADV_PAGEOUT, MADV_DONTNEED) and MADV_WILLNEED to prioritize
> (swap in, or read ahead) which are per mm/file. Their primary usability
> issue is that they are process centric and that requires a very deep
> understanding of the process mm layout so it is not really usable for a
> larger scale orchestration.
> The important part of those interfaces is that they do not talk about
> demotion because that is an implementation detail. I think we want to
> follow that model at least. From a higher level POV I believe we really
> need an interface to age&reclaim and balance memory among nodes. Are
> there other higher-level use cases?
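
For reference, the per-process hints listed above can be exercised on a
process's own mapping roughly as follows. This is only a minimal sketch:
map_and_touch() and advise() are hypothetical helper names, not part of any
patch in this thread, and the fallback MADV_COLD/MADV_PAGEOUT values are the
Linux UAPI constants, needed only when the libc headers predate them. A
kernel without these hints returns EINVAL.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Fallbacks for older libc headers; values taken from the Linux UAPI. */
#ifndef MADV_COLD
#define MADV_COLD 20
#endif
#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21
#endif

/*
 * Hypothetical helper: map len bytes of private anonymous memory and
 * write to every byte so the pages are actually populated.
 * Returns NULL if the mapping fails.
 */
static unsigned char *map_and_touch(size_t len)
{
	unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return NULL;
	memset(p, 0xAB, len);
	return p;
}

/*
 * Hypothetical helper: apply one madvise() hint to [addr, addr + len).
 * Returns 0 on success or -errno on failure.
 *
 *   MADV_COLD     - age: mark the pages as reclaim candidates
 *   MADV_PAGEOUT  - reclaim: try to reclaim the pages now
 *   MADV_DONTNEED - drop: private anonymous pages read back as zero
 *   MADV_WILLNEED - prioritize: swap in / read ahead
 */
static int advise(void *addr, size_t len, int advice)
{
	return madvise(addr, len, advice) == 0 ? 0 : -errno;
}
```

A caller would do something like advise(p, len, MADV_COLD) followed later by
advise(p, len, MADV_PAGEOUT). Note that the caller has to know which address
range to pass, which is the "process centric" limitation described above.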
Yes. If the high-level interfaces can satisfy the requirements, we
should use them or define them. But I guess Mina and Xu have some
requirements at the level of memory tiers (demotion/promotion)?
Best Regards,
Huang, Ying