lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Y5g9D12uEr7HpP5K@dhcp22.suse.cz>
Date:   Tue, 13 Dec 2022 09:51:27 +0100
From:   Michal Hocko <mhocko@...e.com>
To:     "Huang, Ying" <ying.huang@...el.com>
Cc:     Mina Almasry <almasrymina@...gle.com>, Tejun Heo <tj@...nel.org>,
        Zefan Li <lizefan.x@...edance.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Jonathan Corbet <corbet@....net>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        Shakeel Butt <shakeelb@...gle.com>,
        Muchun Song <songmuchun@...edance.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Yang Shi <yang.shi@...ux.alibaba.com>,
        Yosry Ahmed <yosryahmed@...gle.com>, weixugc@...gle.com,
        fvdl@...gle.com, bagasdotme@...il.com, cgroups@...r.kernel.org,
        linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org
Subject: Re: [PATCH v3] mm: Add nodes= arg to memory.reclaim

On Tue 13-12-22 14:30:57, Huang, Ying wrote:
> Mina Almasry <almasrymina@...gle.com> writes:
[...]
> After these discussion, I think the solution maybe use different
> interfaces for "proactive demote" and "proactive reclaim".  That is,
> reconsider "memory.demote".  In this way, we will always uncharge the
> cgroup for "memory.reclaim".  This avoid the possible confusion there.
> And, because demotion is considered aging, we don't need to disable
> demotion for "memory.reclaim", just don't count it.

As already pointed out in my previous email, we should really think more
about future requirements. Do we add memory.promote interface when there
is a request to implement numa balancing into the userspace? Maybe yes
but maybe the node balancing should be more generic than bound to memory
tiering and apply to a more fine grained nodemask control.

Fundamentally we already have APIs to age (MADV_COLD, MADV_FREE),
reclaim (MADV_PAGEOUT, MADV_DONTNEED) and MADV_WILLNEED to prioritize
(swap in, or read ahead) which are per mm/file. Their primary usability
issue is that they are process centric and that requires a very deep
understanding of the process mm layout so it is not really usable for a
larger scale orchestration.
The important part of those interfaces is that they do not talk about
demotion because that is an implementation detail. I think we want to
follow that model at least. From a higher level POV I believe we really
need an interface to age&reclaim and balance memory among nodes. Are
there more higher level usecases?
-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ