Date:   Mon, 21 Mar 2022 23:55:45 -0600
From:   Yu Zhao <yuzhao@...gle.com>
To:     "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Andi Kleen <ak@...ux.intel.com>,
        Catalin Marinas <catalin.marinas@....com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Hillf Danton <hdanton@...a.com>, Jens Axboe <axboe@...nel.dk>,
        Jesse Barnes <jsbarnes@...gle.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Jonathan Corbet <corbet@....net>,
        Matthew Wilcox <willy@...radead.org>,
        Mel Gorman <mgorman@...e.de>,
        Michael Larabel <Michael@...haellarabel.com>,
        Michal Hocko <mhocko@...nel.org>,
        Mike Rapoport <rppt@...nel.org>,
        Rik van Riel <riel@...riel.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Will Deacon <will@...nel.org>,
        Ying Huang <ying.huang@...el.com>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        "open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Linux-MM <linux-mm@...ck.org>,
        Kernel Page Reclaim v2 <page-reclaim@...gle.com>,
        "the arch/x86 maintainers" <x86@...nel.org>,
        Brian Geffon <bgeffon@...gle.com>,
        Jan Alexander Steffens <heftig@...hlinux.org>,
        Oleksandr Natalenko <oleksandr@...alenko.name>,
        Steven Barrett <steven@...uorix.net>,
        Suleiman Souhlal <suleiman@...gle.com>,
        Daniel Byrne <djbyrne@....edu>,
        Donald Carr <d@...os-reins.com>,
        Holger Hoffstätte <holger@...lied-asynchrony.com>,
        Konstantin Kharlamov <Hi-Angel@...dex.ru>,
        Shuang Zhai <szhai2@...rochester.edu>,
        Sofia Trinh <sofia.trinh@....works>,
        Vaibhav Jain <vaibhav@...ux.ibm.com>
Subject: Re: [PATCH v9 06/14] mm: multi-gen LRU: minimal implementation

On Mon, Mar 21, 2022 at 11:27 PM Aneesh Kumar K.V
<aneesh.kumar@...ux.ibm.com> wrote:
>
> Yu Zhao <yuzhao@...gle.com> writes:
>
> > +static void inc_max_seq(struct lruvec *lruvec, unsigned long max_seq)
> > +{
> > +     int prev, next;
> > +     int type, zone;
> > +     struct lru_gen_struct *lrugen = &lruvec->lrugen;
> > +
> > +     spin_lock_irq(&lruvec->lru_lock);
> > +
> > +     VM_BUG_ON(!seq_is_valid(lruvec));
> > +
> > +     if (max_seq != lrugen->max_seq)
> > +             goto unlock;
> > +
> > +     inc_min_seq(lruvec);
>
> Can this min_seq update result in pages considered oldest becoming young?
> I.e., if we had seq values of 0 - 3 and we need aging, the new min_seq and
> max_seq values will now become 1 - 4. What happens to the pages in
> generation 0, which was the oldest generation earlier and is the youngest
> now?

If anon pages are not reclaimable, e.g., there is no swapfile, they won't
be scanned at all. So their coldness/hotness doesn't matter -- they don't
need to be on lrugen->lists[] at all.

If there is a swapfile but it's full, then yes, the inversion will
happen. This can be handled by moving pages from the oldest generation
to the tail of the second oldest generation, which maintains the LRU
order.

In fact, both cases were handled in previous versions [1] [2]; that
handling was removed in v6 for simplicity.

[1] https://lore.kernel.org/linux-mm/20211111041510.402534-5-yuzhao@google.com/
[2] https://lore.kernel.org/linux-mm/20211111041510.402534-7-yuzhao@google.com/
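
Roughly, the removed handling can be sketched as below. This is an
illustration only, not the code from [1]/[2]: the lists[][][] layout,
min_seq[] and lru_gen_from_seq() follow this series, but the function name
is made up, and the per-generation page counters and the generation bits in
folio->flags, which the real code must also update, are ignored for brevity.

static void demote_oldest_gen(struct lruvec *lruvec, int type, int zone)
{
	struct lru_gen_struct *lrugen = &lruvec->lrugen;
	int old_gen = lru_gen_from_seq(lrugen->min_seq[type]);
	int new_gen = lru_gen_from_seq(lrugen->min_seq[type] + 1);

	/*
	 * Splicing onto the tail keeps these folios behind everything
	 * already in the second oldest generation, so the LRU order is
	 * preserved when min_seq advances past a non-empty generation.
	 */
	list_splice_tail_init(&lrugen->lists[old_gen][type][zone],
			      &lrugen->lists[new_gen][type][zone]);
}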

> > +static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swappiness)
> > +{
> > +     int type;
> > +     int scanned;
> > +     int reclaimed;
> > +     LIST_HEAD(list);
> > +     struct folio *folio;
> > +     enum vm_event_item item;
> > +     struct reclaim_stat stat;
> > +     struct mem_cgroup *memcg = lruvec_memcg(lruvec);
> > +     struct pglist_data *pgdat = lruvec_pgdat(lruvec);
> > +
> > +     spin_lock_irq(&lruvec->lru_lock);
> > +
> > +     scanned = isolate_folios(lruvec, sc, swappiness, &type, &list);
> > +
> > +     if (try_to_inc_min_seq(lruvec, swappiness))
> > +             scanned++;
>
> We are doing this before we shrink the page list. Is there any reason to
> do it before?

We have just isolated pages from lrugen->lists[], and we might have
exhausted all pages in the oldest generations, i.e.,
lrugen->lists[min_seq] is now empty. Incrementing min_seq after
shrink_page_list() is not wrong. However, it's better to do it ASAP so
that concurrent reclaimers are less likely to see a stale min_seq and
come here under the false impression that they would make progress.
(Instead, they will take the aging path and call inc_max_seq() first
before coming here.)
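
To illustrate why the ordering matters, here is a hypothetical sketch of
the check a concurrent reclaimer performs; the identifier names and the
exact condition are made up for illustration, not taken from this patch.

/*
 * With a stale min_seq the gap below looks large enough to evict, so a
 * concurrent reclaimer would call evict_folios() against already-drained
 * lists; with the freshly incremented min_seq it runs aging first.
 */
static bool need_aging_sketch(unsigned long min_seq, unsigned long max_seq,
			      unsigned int min_nr_gens)
{
	/* too few generations left between min_seq and max_seq: age first */
	return max_seq - min_seq + 1 <= min_nr_gens;
}

Incrementing min_seq right after isolation makes this return true for other
reclaimers sooner, which is exactly the point of doing it before
shrink_page_list().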
