Message-ID: <CAOUHufZ4o4zmW_PyRCXWmBj4OVgVJdC6h1wZsJFMWpGxpzyGdg@mail.gmail.com>
Date: Wed, 14 Apr 2021 13:04:50 -0600
From: Yu Zhao <yuzhao@...gle.com>
To: Andi Kleen <ak@...ux.intel.com>
Cc: Rik van Riel <riel@...riel.com>,
Dave Chinner <david@...morbit.com>,
Jens Axboe <axboe@...nel.dk>,
SeongJae Park <sj38.park@...il.com>,
Linux-MM <linux-mm@...ck.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Benjamin Manes <ben.manes@...il.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Hillf Danton <hdanton@...a.com>,
Johannes Weiner <hannes@...xchg.org>,
Jonathan Corbet <corbet@....net>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
Matthew Wilcox <willy@...radead.org>,
Mel Gorman <mgorman@...e.de>,
Miaohe Lin <linmiaohe@...wei.com>,
Michael Larabel <michael@...haellarabel.com>,
Michal Hocko <mhocko@...e.com>,
Michel Lespinasse <michel@...pinasse.org>,
Roman Gushchin <guro@...com>,
Rong Chen <rong.a.chen@...el.com>,
SeongJae Park <sjpark@...zon.de>,
Tim Chen <tim.c.chen@...ux.intel.com>,
Vlastimil Babka <vbabka@...e.cz>,
Yang Shi <shy828301@...il.com>,
Ying Huang <ying.huang@...el.com>, Zi Yan <ziy@...dia.com>,
linux-kernel <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
Kernel Page Reclaim v2 <page-reclaim@...gle.com>
Subject: Re: [PATCH v2 00/16] Multigenerational LRU Framework
On Wed, Apr 14, 2021 at 9:51 AM Andi Kleen <ak@...ux.intel.com> wrote:
>
> > 2) It will not scan PTE tables under non-leaf PMD entries that do not
> > have the accessed bit set, when
> > CONFIG_HAVE_ARCH_PARENT_PMD_YOUNG=y.
>
> This assumes that workloads have reasonable locality. Could there
> be a worst case where only one or two pages in each PTE are used,
> so this PTE skipping trick doesn't work?
Hi Andi,
Yes, it does make that assumption. And yes, there could be such a
worst case. AFAIK, only x86 supports the accessed bit on non-leaf PMD
entries.
I wrote a crude test to verify this: it maps exactly one page within
each PTE table. I found that page table scanning didn't underperform
the rmap:
https://lore.kernel.org/linux-mm/YHFuL%2FDdtiml4biw@google.com/#t
The reason (sorry for repeating this) is that page table scanning is conditional:
bool should_skip_mm()
{
	...
	/* leave the legwork to the rmap if mapped pages are too sparse */
	if (RSS < mm_pgtables_bytes(mm) / PAGE_SIZE)
		return true;
	...
}
We fall back to the rmap when scanning page tables is obviously not
the smart choice. There is still a lot of room for improvement in this
function though, e.g., it should be per VMA and NUMA aware.
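To make the sparsity check concrete, here is a minimal userspace sketch of the same comparison. It is not the kernel code: struct mm_stats and too_sparse_for_pgtable_walk() are hypothetical names made up for illustration, and the counters are plain numbers rather than values read from an mm_struct.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-mm counters; not a kernel structure. */
struct mm_stats {
	unsigned long rss_pages;      /* resident pages mapped by this mm */
	unsigned long pgtables_bytes; /* memory consumed by its page tables */
};

/*
 * Return true when there are fewer resident pages than pages of page
 * tables, i.e., when walking the page tables would touch more memory
 * than there are mapped pages to find.
 */
static bool too_sparse_for_pgtable_walk(const struct mm_stats *mm,
					unsigned long page_size)
{
	assert(page_size > 0);

	/* Convert page table memory to a page count, as the check above does. */
	unsigned long pgtable_pages = mm->pgtables_bytes / page_size;

	return mm->rss_pages < pgtable_pages;
}
```

As a worked example, assume 4 KiB pages and 512-entry PTE tables (the x86-64 layout), and assume the page table counter also includes a few upper-level tables. With exactly one page mapped in each of 1024 PTE tables, RSS is 1024 pages while the page tables take a little more than 1024 pages, so the check returns true and the rmap does the work. With all 512 entries populated in each table, RSS is 524288 pages and the check returns false.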
Note that page table scanning doesn't replace the existing rmap scan.
It's complementary, and it happens when there is a good chance that
most of the pages on a system under pressure have been referenced.
IOW, scanning them one by one with the rmap would cost more than
scanning them all at once via page tables.
Sounds reasonable?
Thanks.