linux-kernel - Re: [PATCH v2 00/16] Multigenerational LRU Framework

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20210413075155.32652-1-sjpark@amazon.de>
Date:   Tue, 13 Apr 2021 07:51:55 +0000
From:   SeongJae Park <sj38.park@...il.com>
To:     Yu Zhao <yuzhao@...gle.com>
Cc:     linux-mm@...ck.org, Andi Kleen <ak@...ux.intel.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Benjamin Manes <ben.manes@...il.com>,
        Dave Chinner <david@...morbit.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Hillf Danton <hdanton@...a.com>, Jens Axboe <axboe@...nel.dk>,
        Johannes Weiner <hannes@...xchg.org>,
        Jonathan Corbet <corbet@....net>,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        Matthew Wilcox <willy@...radead.org>,
        Mel Gorman <mgorman@...e.de>,
        Miaohe Lin <linmiaohe@...wei.com>,
        Michael Larabel <michael@...haellarabel.com>,
        Michal Hocko <mhocko@...e.com>,
        Michel Lespinasse <michel@...pinasse.org>,
        Rik van Riel <riel@...riel.com>,
        Roman Gushchin <guro@...com>,
        Rong Chen <rong.a.chen@...el.com>,
        SeongJae Park <sjpark@...zon.de>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Yang Shi <shy828301@...il.com>,
        Ying Huang <ying.huang@...el.com>, Zi Yan <ziy@...dia.com>,
        linux-kernel@...r.kernel.org, lkp@...ts.01.org,
        page-reclaim@...gle.com
Subject: Re: [PATCH v2 00/16] Multigenerational LRU Framework

From: SeongJae Park <sjpark@...zon.de>

Hello,


Very interesting work, thank you for sharing this :)

On Tue, 13 Apr 2021 00:56:17 -0600 Yu Zhao <yuzhao@...gle.com> wrote:

> What's new in v2
> ================
> Special thanks to Jens Axboe for reporting a regression in buffered
> I/O and helping test the fix.

Is the discussion open?  If so, could you please give me a link?

> 
> This version includes the support of tiers, which represent levels of
> usage from file descriptors only. Pages accessed N times via file
> descriptors belong to tier order_base_2(N). Each generation contains
> at most MAX_NR_TIERS tiers, and they require additional MAX_NR_TIERS-2
> bits in page->flags. In contrast to moving across generations which
> requires the lru lock, moving across tiers only involves an atomic
> operation on page->flags and therefore has a negligible cost. A
> feedback loop modeled after the well-known PID controller monitors the
> refault rates across all tiers and decides when to activate pages from
> which tiers, on the reclaim path.
> 
> This feedback model has a few advantages over the current feedforward
> model:
> 1) It has a negligible overhead in the buffered I/O access path
>    because activations are done in the reclaim path.
> 2) It takes mapped pages into account and avoids overprotecting pages
>    accessed multiple times via file descriptors.
> 3) More tiers offer better protection to pages accessed more than
>    twice when buffered-I/O-intensive workloads are under memory
>    pressure.
> 
> The fio/io_uring benchmark shows 14% improvement in IOPS when randomly
> accessing Samsung PM981a in the buffered I/O mode.

Improvement under memory pressure, right?  How much pressure?

[...]
> 
> Differential scans via page tables
> ----------------------------------
> Each differential scan discovers all pages that have been referenced
> since the last scan. Specifically, it walks the mm_struct list
> associated with an lruvec to scan page tables of processes that have
> been scheduled since the last scan.

Does this means it scans only virtual address spaces of processes and therefore
pages in the page cache that are not mmap()-ed will not be scanned?

> The cost of each differential scan
> is roughly proportional to the number of referenced pages it
> discovers. Unless address spaces are extremely sparse, page tables
> usually have better memory locality than the rmap. The end result is
> generally a significant reduction in CPU usage, for workloads using a
> large amount of anon memory.

When and how frequently it scans?


Thanks,
SeongJae Park

[...]