linux-kernel - Re: [RFC 0/4] Introduce unbalance proactive reclaim

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ZUziy-6QPdTIDJlm@tiehlicka>
Date:   Thu, 9 Nov 2023 14:46:51 +0100
From:   Michal Hocko <mhocko@...e.com>
To:     Huan Yang <link@...o.com>
Cc:     "Huang, Ying" <ying.huang@...el.com>, Tejun Heo <tj@...nel.org>,
        Zefan Li <lizefan.x@...edance.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Jonathan Corbet <corbet@....net>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        Shakeel Butt <shakeelb@...gle.com>,
        Muchun Song <muchun.song@...ux.dev>,
        Andrew Morton <akpm@...ux-foundation.org>,
        David Hildenbrand <david@...hat.com>,
        Matthew Wilcox <willy@...radead.org>,
        Kefeng Wang <wangkefeng.wang@...wei.com>,
        Peter Xu <peterx@...hat.com>,
        "Vishal Moola (Oracle)" <vishal.moola@...il.com>,
        Yosry Ahmed <yosryahmed@...gle.com>,
        Liu Shixin <liushixin2@...wei.com>,
        Hugh Dickins <hughd@...gle.com>, cgroups@...r.kernel.org,
        linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, opensource.kernel@...o.com
Subject: Re: [RFC 0/4] Introduce unbalance proactive reclaim

On Thu 09-11-23 21:07:29, Huan Yang wrote:
[...]
> > > Of course, as you suggested, madvise can also achieve this, but
> > > implementing it in the agent may be more complex.(In terms of
> > > achieving the same goal, using memcg to group all the processes of an
> > > APP and perform proactive reclamation is simpler than using madvise
> > > and scanning multiple processes of an application using an agent?)
> > It might be more involved but the primary question is whether it is
> > usable for the specific use case. Madvise interface is not LRU aware but
> > you are not really talking about that to be a requirement? So it would
> > really help if you go deeper into details on how is the interface
> > actually supposed to be used in your case.
> In mobile field, we usually configure zram to compress anonymous page.
> We can approximate to expand memory usage with limited hardware memory
> by using zram.
> 
> With proper strategies, an 8GB RAM phone can approximate the usage of a 12GB
> phone
> (or more).
> 
> In our strategy, we group memcg by application. When the agent detects that
> an
> application has entered the background, then frozen, and has not been used
> for a long time,
> the agent will slowly issue commands to reclaim the anonymous page of that
> application.
> 
> With this interface, `echo memory anon > memory.reclaim`

This doesn't really answer my questions above.
  
> > Also make sure to exaplain why you cannot use other existing interfaces.
> > For example, why you simply don't decrease the limit of the frozen
> > cgroup and rely on the normal reclaim process to evict the most cold
> This is a question of reclamation tendency, and simply decreasing the limit
> of the frozen cgroup cannot achieve this.

Why?

> > memory? What are you basing your anon vs. file proportion decision on?
> When zram is configured and anonymous pages are reclaimed proactively, the
> refault
> probability of anonymous pages is low when an application is frozen and not
> reopened.
> Also, the cost of refaulting from zram is relatively low.
>
> However, file pages usually have shared properties, so even if an
> application is frozen,
> other processes may still access the file pages. If a limit is set and the
> reclamation encounters
> file pages, it will cause a certain amount of refault I/O, which is costly
> for mobile devices.

Two points here (and the reason why I am repeatedly asking for some
data) 1) are you really seeing shared and actively used page cache pages
being reclaimed? 2) Is the refault IO really a problem. What kind of
storage those phone have that this is more significant than potentially
GB of compressed anonymous memory which would need CPU to refaulted
back. I mean do you have any actual numbers to show that the default
reclaim strategy would lead to a less utilized or less performant
system?
-- 
Michal Hocko
SUSE Labs