Message-ID: <878r76gsvz.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date:   Fri, 10 Nov 2023 12:00:00 +0800
From:   "Huang, Ying" <ying.huang@...el.com>
To:     Huan Yang <link@...o.com>
Cc:     Michal Hocko <mhocko@...e.com>, Tejun Heo <tj@...nel.org>,
        Zefan Li <lizefan.x@...edance.com>,
        Johannes Weiner <hannes@...xchg.org>,
        "Jonathan Corbet" <corbet@....net>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        "Shakeel Butt" <shakeelb@...gle.com>,
        Muchun Song <muchun.song@...ux.dev>,
        "Andrew Morton" <akpm@...ux-foundation.org>,
        David Hildenbrand <david@...hat.com>,
        Matthew Wilcox <willy@...radead.org>,
        Kefeng Wang <wangkefeng.wang@...wei.com>,
        Peter Xu <peterx@...hat.com>,
        "Vishal Moola (Oracle)" <vishal.moola@...il.com>,
        Yosry Ahmed <yosryahmed@...gle.com>,
        "Liu Shixin" <liushixin2@...wei.com>,
        Hugh Dickins <hughd@...gle.com>, <cgroups@...r.kernel.org>,
        <linux-doc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <linux-mm@...ck.org>, <opensource.kernel@...o.com>
Subject: Re: [RFC 0/4] Introduce unbalance proactive reclaim

Huan Yang <link@...o.com> writes:

> On 2023/11/10 9:19, Huang, Ying wrote:
>> Huan Yang <link@...o.com> writes:
>>
>>> On 2023/11/9 18:39, Michal Hocko wrote:
>>>> On Thu 09-11-23 18:29:03, Huan Yang wrote:
>>>>> Hi Michal Hocko,
>>>>>
>>>>> Thanks for your suggestion.
>>>>>
>>>>> On 2023/11/9 17:57, Michal Hocko wrote:
>>>>>> On Thu 09-11-23 11:38:56, Huan Yang wrote:
>>>>>> [...]
>>>>>>>> If so, is it better only to reclaim private anonymous pages explicitly?
>>>>>>> Yes, in practice, we only proactively compress anonymous pages and do not
>>>>>>> want to touch file pages.
>>>>>> If that is the case and this is mostly application centric (which you
>>>>>> seem to be suggesting) then why don't you use madvise(MADV_PAGEOUT)
>>>>>> instead?
>>>>> madvise may not be applicable in this scenario (IMO).
>>>>>
>>>>> This feature has one core goal: to compress the anonymous pages of
>>>>> frozen applications.
>>>>>
>>>>> Detecting that an application is frozen and determining which pages can
>>>>> be safely reclaimed is the responsibility of the policy part.
>>>>>
>>>>> Setting madvise for an application is an active behavior, while the
>>>>> above policy is a passive approach. (If I have misunderstood, please let
>>>>> me know if there is a better way to set madvise.)
>>>> You are proposing an extension to the pro-active reclaim interface so
>>>> this is an active behavior pretty much by definition. So I am really not
>>>> following you here. Your agent can simply scan the address space of the
>>>> application it is going to "freeze" and call pidfd_madvise(MADV_PAGEOUT)
>>>> on the private memory if that is really what you want/need.
>>> There is a key point here. We want to use the grouping policy of memcg
>>> to perform proactive reclamation with certain tendencies. Your
>>> suggestion is to reclaim memory by scanning the address space of each
>>> task. However, in the mobile field, memory is usually viewed at the
>>> granularity of an APP.
>>>
>>> Therefore, after an APP is frozen, we hope to reclaim memory uniformly
>>> according to the pre-grouped APP processes.
>>>
>>> Of course, as you suggested, madvise can also achieve this, but
>>> implementing it in the agent may be more complex. (To achieve the same
>>> goal, isn't using memcg to group all the processes of an APP and
>>> perform proactive reclamation simpler than having an agent use madvise
>>> and scan the multiple processes of an application?)
>> I still think that it's not too complex to use process_madvise() to do
>> this.  For each process of the application, the agent can read
>> /proc/PID/maps to get all anonymous address ranges, then call
>> process_madvise(MADV_PAGEOUT) to reclaim pages.  This can even filter
>> out shared anonymous pages.  Does this work for you?
>
> Thanks for this suggestion. This approach can avoid touching shared
> anonymous pages, which is good. But I have some doubts: CPU resources
> are usually limited in embedded devices, and power consumption must
> also be taken into consideration.
>
> If this approach is adopted, the agent needs to periodically scan
> frozen applications and set pageout for the address space. Is this
> frequent active operation more complex and less suitable for embedded
> devices than reclamation based on memcg grouping?

In the memcg-based solution, when will you start the proactive reclaiming?
You can just replace the reclaiming part of the solution, switching from
memcg proactive reclaiming to process_madvise(MADV_PAGEOUT), because you
can get the PIDs in a memcg.  Is that possible?
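
To make the idea concrete, here is a rough sketch of such an agent in
Python.  It is only an illustration under several assumptions, not a tested
tool: the function names are made up for this example, the memcg is assumed
to be a cgroup v2 directory exposing cgroup.procs, the syscall number 440
is assumed to match the target architecture (it does on x86_64 and arm64),
and only mappings that /proc/PID/maps shows as private with no backing file
are treated as anonymous.  Anonymous (COW) pages inside private file
mappings are ignored here.

```python
import ctypes
import os

MADV_PAGEOUT = 21           # value from <linux/mman.h>
SYS_PROCESS_MADVISE = 440   # assumption: syscall number on x86_64 and arm64
IOV_MAX = 1024              # longer iovec arrays are rejected by the kernel


class IoVec(ctypes.Structure):
    _fields_ = [("iov_base", ctypes.c_void_p), ("iov_len", ctypes.c_size_t)]


def private_anon_ranges(maps_text):
    """Return (start, end) pairs for private anonymous mappings."""
    ranges = []
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        addr, perms, _offset, _dev, inode = fields[:5]
        path = fields[5].strip() if len(fields) > 5 else ""
        is_private = len(perms) == 4 and perms[3] == "p"
        # No backing file: unnamed, heap, stack, or a named anon VMA.
        is_anon = inode == "0" and (
            path in ("", "[heap]", "[stack]") or path.startswith("[anon:")
        )
        if is_private and is_anon:
            start, end = (int(x, 16) for x in addr.split("-"))
            ranges.append((start, end))
    return ranges


def memcg_pids(cgroup_dir):
    """Read the PIDs that belong to a cgroup v2 directory."""
    with open(os.path.join(cgroup_dir, "cgroup.procs")) as f:
        return [int(tok) for tok in f.read().split()]


def pageout(pid, ranges):
    """Call process_madvise(MADV_PAGEOUT) on the given ranges of pid."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    pidfd = os.pidfd_open(pid)  # needs Python 3.9+ and Linux 5.3+
    try:
        for i in range(0, len(ranges), IOV_MAX):
            chunk = ranges[i:i + IOV_MAX]
            iov = (IoVec * len(chunk))(*[IoVec(s, e - s) for s, e in chunk])
            ret = libc.syscall(
                ctypes.c_long(SYS_PROCESS_MADVISE),
                ctypes.c_int(pidfd),
                iov,
                ctypes.c_size_t(len(chunk)),
                ctypes.c_int(MADV_PAGEOUT),
                ctypes.c_uint(0),
            )
            if ret < 0:
                err = ctypes.get_errno()
                raise OSError(err, os.strerror(err))
    finally:
        os.close(pidfd)


def reclaim_memcg(cgroup_dir):
    """Page out private anonymous memory of every process in the memcg."""
    for pid in memcg_pids(cgroup_dir):
        try:
            with open(f"/proc/{pid}/maps") as f:
                ranges = private_anon_ranges(f.read())
            pageout(pid, ranges)
        except (FileNotFoundError, ProcessLookupError):
            continue  # the process exited while we were scanning
```

The agent would call reclaim_memcg() once when the policy decides that the
APP is frozen, so the memcg still provides the grouping and only the
reclaim step changes.  The caller needs sufficient privileges for
process_madvise() on the target processes.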

> In addition, without LRU, it is difficult to reclaim only part of the
> cold anonymous page data of frozen applications. For example, if I only
> want to proactively reclaim 100MB of anonymous pages and I use the
> proactive reclamation interface, the LRU lets us reclaim only 100MB of
> cold anonymous pages. However, this cannot be achieved through madvise.
> (If I have misunderstood something, please correct me.)

IIUC, it should be OK to reclaim all private anonymous pages of an
application in your specific use case?  If you really want to restrict
the number of pages reclaimed, that's possible too.  You can restrict the
size of the address range passed to process_madvise(MADV_PAGEOUT), and
check the RSS of the application.  The accuracy of the number reclaimed
isn't good, but I think that it should be OK in practice?
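
As a rough illustration of that kind of limiting, the Python sketch below
trims a list of address ranges to a byte budget and reads RssAnon from
/proc/PID/status so that the agent can compare the value before and after
the madvise call.  The helper names and the 4 KiB page size are assumptions
made for this example.  The budget bounds the virtual range that is
advised, not the number of resident pages, which is why the result is only
approximate.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages; os.sysconf("SC_PAGE_SIZE") gives the real value


def limit_ranges(ranges, budget_bytes):
    """Trim (start, end) ranges so that they cover at most budget_bytes."""
    budget = budget_bytes - budget_bytes % PAGE_SIZE  # keep page alignment
    out = []
    for start, end in ranges:
        if budget <= 0:
            break
        length = min(end - start, budget)
        out.append((start, start + length))
        budget -= length
    return out


def rss_anon_kb(status_text):
    """Extract the RssAnon value (in kB) from /proc/PID/status text."""
    for line in status_text.splitlines():
        if line.startswith("RssAnon:"):
            return int(line.split()[1])
    return None  # field not present (for example on very old kernels)
```

An agent could pass limit_ranges(ranges, 100 << 20) to the pageout step and
then use the drop in RssAnon to decide whether another round is needed.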

BTW: how do you know the number of pages to be reclaimed proactively in
the memcg proactive reclaiming based solution?

--
Best Regards,
Huang, Ying
