lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YdV4k1+zEbtzmUkK@google.com>
Date:   Wed, 5 Jan 2022 03:53:07 -0700
From:   Yu Zhao <yuzhao@...gle.com>
To:     SeongJae Park <sj@...nel.org>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Andi Kleen <ak@...ux.intel.com>,
        Catalin Marinas <catalin.marinas@....com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Hillf Danton <hdanton@...a.com>, Jens Axboe <axboe@...nel.dk>,
        Jesse Barnes <jsbarnes@...gle.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Jonathan Corbet <corbet@....net>,
        Matthew Wilcox <willy@...radead.org>,
        Mel Gorman <mgorman@...e.de>,
        Michael Larabel <Michael@...haellarabel.com>,
        Michal Hocko <mhocko@...nel.org>,
        Rik van Riel <riel@...riel.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Will Deacon <will@...nel.org>,
        Ying Huang <ying.huang@...el.com>,
        linux-arm-kernel@...ts.infradead.org, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        page-reclaim@...gle.com, x86@...nel.org
Subject: Re: [PATCH v6 0/9] Multigenerational LRU Framework

On Wed, Jan 05, 2022 at 08:55:34AM +0000, SeongJae Park wrote:
> Hi Yu,
> 
> On Tue, 4 Jan 2022 13:22:19 -0700 Yu Zhao <yuzhao@...gle.com> wrote:
> 
> > TLDR
> > ====
> > The current page reclaim is too expensive in terms of CPU usage and it
> > often makes poor choices about what to evict. This patchset offers an
> > alternative solution that is performant, versatile and
> > straightforward.
> >  
> [...]
> > Summery
> > =======
> > The facts are:
> > 1. The independent lab results and the real-world applications
> >    indicate substantial improvements; there are no known regressions.
> 
> So impressive results!
> 
> > 2. Thrashing prevention, working set estimation and proactive reclaim
> >    work out of the box; there are no equivalent solutions.
> 
> I think similar works are already available out of the box with the latest
> mainline tree, though it might be suboptimal in some cases.

Ok, I will sound harsh because I hate it when people challenge facts
while having no idea what they are talking about.

Our jobs are help the leadership make best decisions by providing them
with facts, not feeding them crap.

Don't get me wrong -- you are welcome to start another thread and have
a casual discussion with me. But this thread is not for that; it's for
the leadership and stakeholder to make a decision. Check who are in
"To" and "Cc" and what my request is.

> I didn't read this patchset thoroughly yet, so I might missing many things.  If
> so, please feel free to let me know.

Yes, apparently you didn't read this patchset thoroughly, and you have
missed all things that matter to this thread.

> First, you can do thrashing prevention using DAMON-based Operation Scheme
> (DAMOS)[1] with MADV_COLD action.

Here is thrashing prevention really means, from patch 8:
  +Personal computers
  +------------------
  +:Thrashing prevention: Write ``N`` to
  + ``/sys/kernel/mm/lru_gen/min_ttl_ms`` to prevent the working set of
  + ``N`` milliseconds from getting evicted. The OOM killer is invoked if
  + this working set can't be kept in memory. Based on the average human
  + detectable lag (~100ms), ``N=1000`` usually eliminates intolerable
  + lags due to thrashing. Larger values like ``N=3000`` make lags less
  + noticeable at the cost of more OOM kills.

It's about when to trigger OOM kills. Got it? Or probably you don't
understand what MADV_COLD is either?

> Second, for working set estimation, you can either use the DAMOS
> again with statistics action, or the damon_aggregated tracepoint[2].

This is you are suggesting:
  TRACE_EVENT(damon_aggregated,
    TP_printk("target_id=%lu nr_regions=%u %lu-%lu: %u",
              __entry->target_id, __entry->nr_regions,
              __entry->start, __entry->end, __entry->nr_accesses)

Now read my doc again:
  +Data centers
  +------------
  +:Debugfs interface: ``/sys/kernel/debug/lru_gen`` has the following
  + format:
  +   memcg  memcg_id  memcg_path
  +     node  node_id

Have you heard of something called memcg? And NUMA node? How exactly
can this tracepoint provide information about different memcgs and
NUMA node?

> The DAMON user space tool[3] helps the tracepoint analysis and
> visualization.

What does "work out of box" mean? Should every Linux desktop, laptop
and phone user install this tool?

> Finally, for the proactive reclaim, you can again use the DAMOS
> with MADV_PAGEOUT action

How exactly does MADV_PAGEOUT find pages that are NOT mapped in page
tables? Let me tell you another fact: they are usually the cheapest to
reclaim.

> or simply the DAMON-based proactive reclaim module (DAMON_RECLAIM)[4].
> [4] https://docs.kernel.org/admin-guide/mm/damon/reclaim.html

How many knob does DAMON_RECLAIM have? 14? I lost count.

> Of course, the integration might not be so simple as seems to me now.

Look, I'm open to your suggestion. I probably should have been nicer.
So I'm sorry. I just don't appreciate alternative facts.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ