linux-kernel - Re: [RFC 0/4] Introduce unbalance proactive reclaim

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ZVNC9mDOOWgjen7U@tiehlicka>
Date:   Tue, 14 Nov 2023 10:50:46 +0100
From:   Michal Hocko <mhocko@...e.com>
To:     Huan Yang <link@...o.com>
Cc:     "Huang, Ying" <ying.huang@...el.com>, Tejun Heo <tj@...nel.org>,
        Zefan Li <lizefan.x@...edance.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Jonathan Corbet <corbet@....net>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        Shakeel Butt <shakeelb@...gle.com>,
        Muchun Song <muchun.song@...ux.dev>,
        Andrew Morton <akpm@...ux-foundation.org>,
        David Hildenbrand <david@...hat.com>,
        Matthew Wilcox <willy@...radead.org>,
        Kefeng Wang <wangkefeng.wang@...wei.com>,
        Peter Xu <peterx@...hat.com>,
        "Vishal Moola (Oracle)" <vishal.moola@...il.com>,
        Yosry Ahmed <yosryahmed@...gle.com>,
        Liu Shixin <liushixin2@...wei.com>,
        Hugh Dickins <hughd@...gle.com>, cgroups@...r.kernel.org,
        linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, opensource.kernel@...o.com
Subject: Re: [RFC 0/4] Introduce unbalance proactive reclaim

On Mon 13-11-23 10:17:57, Huan Yang wrote:
> 
> 在 2023/11/10 20:24, Michal Hocko 写道:
> > On Fri 10-11-23 11:48:49, Huan Yang wrote:
> > [...]
> > > Also, When the application enters the foreground, the startup speed
> > > may be slower. Also trace show that here are a lot of block I/O.
> > > (usually 1000+ IO count and 200+ms IO Time) We usually observe very
> > > little block I/O caused by zram refault.(read: 1698.39MB/s, write:
> > > 995.109MB/s), usually, it is faster than random disk reads.(read:
> > > 48.1907MB/s write: 49.1654MB/s). This test by zram-perf and I change a
> > > little to test UFS.
> > > 
> > > Therefore, if the proactive reclamation encounters many file pages,
> > > the application may become slow when it is opened.
> > OK, this is an interesting information. From the above it seems that
> > storage based IO refaults are order of magnitude more expensive than
> > swap (zram in this case). That means that the memory reclaim should
> > _in general_ prefer anonymous memory reclaim over refaulted page cache,
> > right? Or is there any reason why "frozen" applications are any
> > different in this case?
> Frozen applications mean that the application process is no longer active,
> so once its private anonymous page data is swapped out, the anonymous
> pages will not be refaulted until the application becomes active again.

I was probably not clear in my question. It is quite clear that frozen
applications are inactive. It is not really clear why they should be
treated any differently though. Their memory will be naturally cold as
the memory is not in use so why cannot we realy on the standard memory
reclaim to deal with the implicit inactivity and you need to handle that
explicitly?

[...]
> > Our traditional interface to control the anon vs. file balance has been
> > swappiness. It is not the best interface and it has its flaws but
> > have you experimented with the global swappiness to express that
> > preference? What were your observations? Please note that the behavior
> We have tested this part and found that no version of the code has the
> priority control over swappiness.
> 
> This means that even if we modify swappiness to 0 or 200,
> we cannot achieve the goal of unbalanced reclaim if some conditions
> are not met during the reclaim process. Under certain conditions,
> we may mistakenly reclaim file pages, and since we usually trigger
> active reclaim when there is sufficient memory(before LMKD trigger),
> this will cause higher block IO.

Yes there are heuristics which might override the global swappinness but
have you investigated those cases and can show that those heuristics
could be changed?

[...]
> > It is quite likely that a IO cost aspect is not really easy to integrate
> > into the memory reclaim but it seems to me this is a better way to focus
> > on for a better long term solution. Our existing refaulting
> > infrastructure should help in that respect. Also MGLRU could fit for
> > that purpose better than the traditional LRU based reclaim as the higher
> > generations could be used for more more expensive pages.
> 
> Yes, your insights are very informative.
> 
> However, before our algorithm is perfected, I think it is reasonable
> to provide different reclaim tendencies for the active reclaim
> interface. This will provide greater flexibility for the strategy
> layer.

Flexibility is really nice but it comes with a price and interface cost
can be really high. There were several attempts to make memory reclaim
LRU type specific but I still maintain my opinion that this is not
really a good abstraction. As stated above even page cache is not all
the same. A more future proof interface should really consider the IO
refault cost rather than all anon/file.

-- 
Michal Hocko
SUSE Labs