lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAOUHufZ4o4zmW_PyRCXWmBj4OVgVJdC6h1wZsJFMWpGxpzyGdg@mail.gmail.com>
Date:   Wed, 14 Apr 2021 13:04:50 -0600
From:   Yu Zhao <yuzhao@...gle.com>
To:     Andi Kleen <ak@...ux.intel.com>
Cc:     Rik van Riel <riel@...riel.com>,
        Dave Chinner <david@...morbit.com>,
        Jens Axboe <axboe@...nel.dk>,
        SeongJae Park <sj38.park@...il.com>,
        Linux-MM <linux-mm@...ck.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Benjamin Manes <ben.manes@...il.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Hillf Danton <hdanton@...a.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Jonathan Corbet <corbet@....net>,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        Matthew Wilcox <willy@...radead.org>,
        Mel Gorman <mgorman@...e.de>,
        Miaohe Lin <linmiaohe@...wei.com>,
        Michael Larabel <michael@...haellarabel.com>,
        Michal Hocko <mhocko@...e.com>,
        Michel Lespinasse <michel@...pinasse.org>,
        Roman Gushchin <guro@...com>,
        Rong Chen <rong.a.chen@...el.com>,
        SeongJae Park <sjpark@...zon.de>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Yang Shi <shy828301@...il.com>,
        Ying Huang <ying.huang@...el.com>, Zi Yan <ziy@...dia.com>,
        linux-kernel <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        Kernel Page Reclaim v2 <page-reclaim@...gle.com>
Subject: Re: [PATCH v2 00/16] Multigenerational LRU Framework

On Wed, Apr 14, 2021 at 9:51 AM Andi Kleen <ak@...ux.intel.com> wrote:
>
> >    2) It will not scan PTE tables under non-leaf PMD entries that do not
> >       have the accessed bit set, when
> >       CONFIG_HAVE_ARCH_PARENT_PMD_YOUNG=y.
>
> This assumes  that workloads have reasonable locality. Could there
> be a worst case where only one or two pages in each PTE are used,
> so this PTE skipping trick doesn't work?

Hi Andi,

Yes, it does make that assumption. And yes, there could. AFAIK, only
x86 supports this.

I wrote a crude test to verify this, and it maps exactly one page
within each PTE table. And I found page table scanning didn't
underperform the rmap:

https://lore.kernel.org/linux-mm/YHFuL%2FDdtiml4biw@google.com/#t

The reason (sorry for repeating this) is page table scanning is conditional:

bool should_skip_mm()
{
    ...
    /* leave the legwork to the rmap if mapped pages are too sparse */
    if (RSS < mm_pgtables_bytes(mm) / PAGE_SIZE)
        return true;
    ....
}

We fall back to the rmap when it's obviously not smart to do so. There
is still a lot of room for improvement in this function though, i.e.,
it should be per VMA and NUMA aware.

Note that page table scanning doesn't replace the existing rmap scan.
It's complementary, and it happens when there is a good chance that
most of the pages on a system under pressure have been referenced.
IOW, scanning them one by one with the rmap would cost more than
scanning them all at once via page tables.

Sounds reasonable?

Thanks.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ