Message-Id: <20220105085534.22981-1-sj@kernel.org>
Date:   Wed,  5 Jan 2022 08:55:34 +0000
From:   SeongJae Park <sj@...nel.org>
To:     Yu Zhao <yuzhao@...gle.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Andi Kleen <ak@...ux.intel.com>,
        Catalin Marinas <catalin.marinas@....com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Hillf Danton <hdanton@...a.com>, Jens Axboe <axboe@...nel.dk>,
        Jesse Barnes <jsbarnes@...gle.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Jonathan Corbet <corbet@....net>,
        Matthew Wilcox <willy@...radead.org>,
        Mel Gorman <mgorman@...e.de>,
        Michael Larabel <Michael@...haellarabel.com>,
        Michal Hocko <mhocko@...nel.org>,
        Rik van Riel <riel@...riel.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Will Deacon <will@...nel.org>,
        Ying Huang <ying.huang@...el.com>,
        linux-arm-kernel@...ts.infradead.org, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        page-reclaim@...gle.com, x86@...nel.org
Subject: Re: [PATCH v6 0/9] Multigenerational LRU Framework

Hi Yu,

On Tue, 4 Jan 2022 13:22:19 -0700 Yu Zhao <yuzhao@...gle.com> wrote:

> TLDR
> ====
> The current page reclaim is too expensive in terms of CPU usage and it
> often makes poor choices about what to evict. This patchset offers an
> alternative solution that is performant, versatile and
> straightforward.
>  
[...]
> Summary
> =======
> The facts are:
> 1. The independent lab results and the real-world applications
>    indicate substantial improvements; there are no known regressions.

Very impressive results!

> 2. Thrashing prevention, working set estimation and proactive reclaim
>    work out of the box; there are no equivalent solutions.

I think similar functionality is already available out of the box in the
latest mainline tree, though it might be suboptimal in some cases.

First, you can do thrashing prevention using a DAMON-based Operation Scheme
(DAMOS)[1] with the MADV_COLD action.  Second, for working set estimation, you
can either use DAMOS again with the statistics action, or the damon_aggregated
tracepoint[2].  The DAMON user space tool[3] helps with analysis and
visualization of the tracepoint output.  Finally, for proactive reclaim, you
can again use DAMOS with the MADV_PAGEOUT action, or simply the DAMON-based
proactive reclaim module (DAMON_RECLAIM)[4].
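
As a rough illustration of the idea above (this is user space Python, not
kernel code, and not the actual DAMON interface), a scheme can be thought of
as a filter over monitored regions plus an action.  The Region and Scheme
types, their field names, and both helper functions below are hypothetical
and simplified; they only assume that each monitored region carries an
address range, an access count, and an age.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Region:
    """Hypothetical, simplified view of one monitored address range."""
    start: int        # start address in bytes
    end: int          # end address in bytes (exclusive)
    nr_accesses: int  # accesses observed in the last aggregation interval
    age: int          # intervals for which the access pattern stayed the same

    @property
    def size(self) -> int:
        return self.end - self.start


@dataclass
class Scheme:
    """Hypothetical scheme: which regions to pick, and what to do with them."""
    min_size: int
    max_size: int
    min_accesses: int
    max_accesses: int
    min_age: int
    action: str  # e.g. "cold", "pageout" or "stat"

    def matches(self, region: Region) -> bool:
        return (self.min_size <= region.size <= self.max_size
                and self.min_accesses <= region.nr_accesses <= self.max_accesses
                and region.age >= self.min_age)


def apply_scheme(scheme: Scheme, regions: List[Region]) -> List[Region]:
    """Return the regions the scheme's action would be applied to."""
    return [r for r in regions if scheme.matches(r)]


def working_set_size(regions: List[Region]) -> int:
    """Estimate the working set as the total size of accessed regions."""
    return sum(r.size for r in regions if r.nr_accesses > 0)
```

With such types, proactive reclaim would be a scheme like "regions of at
least one page that showed zero accesses for at least N intervals, action
pageout", while working set estimation only reads the access counts and
applies no action at all.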

Nevertheless, as noted above, the current DAMON-based solutions might be
suboptimal in some cases.  First of all, DAMON currently doesn't provide
page-granularity monitoring.  Though its monitoring results have been useful
in our users' production use, there could be different requirements and
situations.  Secondly, the DAMON-based thrashing prevention wouldn't reduce
the CPU usage of the reclamation logic's access scanning.

So, to me, the MGLRU patchset seems to provide something that DAMON doesn't,
but also something that DAMON already provides.  Specifically, the efficient
page-granularity access scanning is what DAMON doesn't provide for now.
However, the use of the access information for LRU list manipulation
(thrashing prevention) and proactive reclamation is similar to what DAMON
(specifically, DAMOS) provides.  Also, this patchset reduces the reclamation
logic's CPU usage by using the efficient page-granularity access scanning.

IMHO, we might be able to reduce the duplication by integrating MGLRU into
DAMON.  What I'm saying is, we could 1) introduce the efficient
page-granularity access scanning, 2) reduce the reclamation logic's CPU usage
by making it use that scanning, and 3) extend DAMON for page-granularity
monitoring with the efficient access scanning[5].  Then, users could get the
benefit of MGLRU by using DAMOS configured to use your efficient
page-granularity access scanning.  To make this simpler, we could extend
existing kernel logic to use DAMON in that way, or implement a new kernel
module.  Additional advantages of this approach would be 1) reducing the
changes to the existing code, and 2) making the efficient page-granularity
access information usable for more general cases.
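
To make the layering idea above a bit more concrete, here is a toy Python
sketch.  It does not reflect how either DAMON or MGLRU is implemented; all
names are hypothetical, and the region-granularity scan checks the first page
of each region instead of a random one only to keep the example
deterministic.  The point is just that the upper layer (picking reclaim
candidates) stays the same while the access scanning below it is swapped.

```python
from typing import Callable, Dict, List, Tuple

PAGE_SIZE = 4096
Range = Tuple[int, int]               # (start, end) addresses, end exclusive
PageAccessed = Callable[[int], bool]  # hypothetical "was this page accessed?"
Scan = Callable[[List[Range], PageAccessed], Dict[Range, int]]


def region_sampling_scan(regions: List[Range],
                         accessed: PageAccessed) -> Dict[Range, int]:
    """Region granularity: check a single sample page per region."""
    return {r: int(accessed(r[0])) for r in regions}


def page_granularity_scan(regions: List[Range],
                          accessed: PageAccessed) -> Dict[Range, int]:
    """Page granularity: check every page and count the accessed ones."""
    return {r: sum(accessed(addr) for addr in range(r[0], r[1], PAGE_SIZE))
            for r in regions}


def find_reclaim_candidates(regions: List[Range], accessed: PageAccessed,
                            scan: Scan) -> List[Range]:
    """Upper layer: identical logic regardless of the scan plugged in."""
    counts = scan(regions, accessed)
    return [r for r in regions if counts[r] == 0]
```

For example, if only the fourth page of an eight-page region was accessed,
the sampling scan above would miss it and report the region as a reclaim
candidate, while the page-granularity scan would not.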

Of course, the integration might not be as simple as it seems to me now.  We
could also keep DAMON and MGLRU side by side as they are for now, and let
users select what they really want.  I think it's up to you.

I haven't read this patchset thoroughly yet, so I might be missing many
things.  If so, please feel free to let me know.

[1] https://docs.kernel.org/admin-guide/mm/damon/usage.html#schemes
[2] https://docs.kernel.org/admin-guide/mm/damon/usage.html#tracepoint-for-monitoring-results
[3] https://github.com/awslabs/damo
[4] https://docs.kernel.org/admin-guide/mm/damon/reclaim.html
[5] https://docs.kernel.org/vm/damon/design.html#configurable-layers


Thanks,
SJ

[...]
