Message-ID: <YFEjMCeVv9MARvp3@google.com>
Date:   Tue, 16 Mar 2021 15:29:20 -0600
From:   Yu Zhao <yuzhao@...gle.com>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     linux-mm@...ck.org, Alex Shi <alex.shi@...ux.alibaba.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Hillf Danton <hdanton@...a.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        Mel Gorman <mgorman@...e.de>, Michal Hocko <mhocko@...e.com>,
        Roman Gushchin <guro@...com>, Vlastimil Babka <vbabka@...e.cz>,
        Wei Yang <richard.weiyang@...ux.alibaba.com>,
        Yang Shi <shy828301@...il.com>,
        Ying Huang <ying.huang@...el.com>,
        linux-kernel@...r.kernel.org, page-reclaim@...gle.com
Subject: Re: [PATCH v1 11/14] mm: multigenerational lru: page activation

On Tue, Mar 16, 2021 at 04:34:37PM +0000, Matthew Wilcox wrote:
> On Sat, Mar 13, 2021 at 12:57:44AM -0700, Yu Zhao wrote:
> > In the page fault path, we want to add pages to the per-zone lists
> > indexed by max_seq as they cannot be evicted without going through
> > the aging first. For anon pages, we rename
> > lru_cache_add_inactive_or_unevictable() to lru_cache_add_page_vma()
> > and add a new parameter, which is set to true in the page fault path,
> > to indicate whether they should be added to the per-zone lists indexed
> > by max_seq. For page/swap cache, since we cannot differentiate the
> > page fault path from the read ahead path at the time we call
> > lru_cache_add() in add_to_page_cache_lru() and
> > __read_swap_cache_async(), we have to add a new function
> > lru_gen_activate_page(), which is essentially activate_page(), to move
> > pages to the per-zone lists indexed by max_seq at a later time.
> > Hopefully we would find pages we want to activate in lru_pvecs.lru_add
> > and simply set PageActive() on them without having to actually move
> > them.
> > 
> > In the reclaim path, pages mapped around a referenced PTE may also
> > have been referenced due to spatial locality. We add a new function
> > lru_gen_scan_around() to scan the vicinity of such a PTE.
> > 
> > In addition, we add a new function page_is_active() to tell whether a
> > page is active. We cannot use PageActive() because it is only set on
> > active pages while they are not on multigenerational lru. It is
> > cleared while pages are on multigenerational lru, in order to spare
> > the aging the trouble of clearing it when an active generation becomes
> > inactive. Internally, page_is_active() compares the generation number
> > of a page with max_seq and max_seq-1, which are active generations and
> > protected from the eviction. Other generations, which may or may not
> > exist, are inactive.
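
To illustrate the comparison described in the paragraph above, here is
a minimal sketch in plain userspace C. It is not the code from the
patch: the name page_is_active_sketch() and both parameters are
hypothetical, and it assumes a page carries a full sequence number,
which may not be how the patch stores the generation.

```c
#include <stdbool.h>

/*
 * Hypothetical sketch, not the patch's page_is_active().
 *
 * max_seq is the youngest generation; page_seq is the generation the
 * page was last placed in. The two youngest generations, max_seq and
 * max_seq - 1, are treated as active; anything older is inactive.
 *
 * The unsigned subtraction wraps to a large value when page_seq is
 * greater than max_seq, so such (invalid) input is reported inactive.
 */
static bool page_is_active_sketch(unsigned long max_seq,
				  unsigned long page_seq)
{
	return max_seq - page_seq <= 1;
}
```

With max_seq at 10, pages in generations 10 and 9 count as active and
a page in generation 8 does not.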
> 
> If we go with this multi-LRU approach, it feels like PageActive and
> PageInactive should go away as tests.  We should have a LRU field in
> the page flags with some special values:
> 
>  - Not managed through LRU list
>  - Not currently on any LRU list
>  - Unevictable
>  - Active list 1
>  - Active list 2
>  - ...
>  - Active list 5
> 
> Now you don't need any extra bits in the page flags.  Or if you want to
> have 13 lists instead of 5, you can use just one extra bit.  I'm not
> quite sure whether it makes sense to have that many lists, so I need
> to try to understand that better.

Yes, and this would be a lot cleaner. PG_{lru,unevictable,active,
referenced,reclaim,workingset,young,idle} could all go away. Look how
many bits we've added just for page reclaim. Sigh...
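
The bit counts in the proposal quoted above can be checked with a small
sketch. The enum and helper names below are hypothetical and only model
the three special values from Matthew's list plus N numbered lists;
nothing here is taken from the patch or from the kernel's page-flags
code.

```c
#include <assert.h>

/* Hypothetical encoding of the single LRU field proposed above. */
enum lru_state_sketch {
	LRU_STATE_NOT_MANAGED,	/* not managed through an LRU list */
	LRU_STATE_OFF_LIST,	/* not currently on any LRU list */
	LRU_STATE_UNEVICTABLE,	/* unevictable */
	LRU_STATE_LIST_1,	/* first numbered list; others follow */
	LRU_STATE_NR_SPECIAL = LRU_STATE_LIST_1,
};

static_assert(LRU_STATE_NR_SPECIAL == 3, "three special values");

/* Smallest field width that holds the special values plus nr_lists. */
static unsigned int lru_state_bits(unsigned int nr_lists)
{
	unsigned int nr_values = LRU_STATE_NR_SPECIAL + nr_lists;
	unsigned int bits = 0;

	while ((1u << bits) < nr_values)
		bits++;
	return bits;
}
```

Under these assumptions, 5 lists give 3 + 5 = 8 values, which fit in 3
bits, and 13 lists give 3 + 13 = 16 values, which fit in 4 bits. That
matches the "one extra bit" remark.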

> I'd like to echo the comments from others that it'd be nice to split apart
> the multigenerational part of this and the physical scanning part of this.
> It's possible they don't make performance sense without each other,
> but from a review point of view, they seem entirely separate things.

Thanks for noticing. I do plan to see if the page table scanning part
could be better refactored. (I cut some corners by squashing it while
rebasing to the latest kernel.)
