Message-ID: <20220905092619.2533krnnx632hswc@suse.de>
Date: Mon, 5 Sep 2022 10:26:38 +0100
From: Mel Gorman <mgorman@...e.de>
To: Wupeng Ma <mawupeng1@...wei.com>
Cc: akpm@...ux-foundation.org, david@...hat.com, ying.huang@...el.com,
hannes@...xchg.org, corbet@....net, mcgrof@...nel.org,
keescook@...omium.org, yzaikin@...gle.com,
songmuchun@...edance.com, mike.kravetz@...cle.com,
osalvador@...e.de, surenb@...gle.com, rppt@...nel.org,
charante@...eaurora.org, jsavitz@...hat.com,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH -next v3 1/2] mm: Cap zone movable's min wmark to small value

On Mon, Sep 05, 2022 at 11:28:57AM +0800, Wupeng Ma wrote:
> From: Ma Wupeng <mawupeng1@...wei.com>
>
> Since min_free_kbytes is based on gfp_zone(GFP_USER) which does not include
> zone movable. However zone movable will get its min share in
> __setup_per_zone_wmarks() which does not make any sense.
>
> And like highmem pages, __GFP_HIGH and PF_MEMALLOC allocations usually
> don't need movable pages, so there is no need to assign min pages for zone
> movable.
>
> Let's cap pages_min for zone movable to a small value here just link
> highmem pages.
>
I think there is a misunderstanding about why the higher zones have a
watermark and why it might be large.

It's not about __GFP_HIGH or PF_MEMALLOC allocations, because it's known
that few of those allocations may be movable. It's because high memory
allocations indirectly pin pages in lower zones. User-mapped memory allocated
from ZONE_MOVABLE still needs page table pages allocated from a lower zone,
so there is a ratio between the size of ZONE_MOVABLE and lower zones
that limits the total amount of memory that can be allocated. Similarly,
file-backed pages that may be allocated from ZONE_MOVABLE still require
pages from lower zones for the inode and other associated kernel objects.

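To put a rough number on the page table part of that ratio, here is a small
user-space sketch (not kernel code). It assumes 4 KiB pages and 8-byte page
table entries, as on x86-64, counts only last-level page tables, and assumes
each page is mapped exactly once. The helper name pte_pages_needed() is made
up for the illustration.

```c
#include <stdint.h>

/* Assumptions for this sketch: 4 KiB pages, 8-byte page table entries. */
#define SKETCH_PAGE_SIZE   4096ULL
#define SKETCH_PTE_SIZE    8ULL
#define SKETCH_PTES_PER_PT (SKETCH_PAGE_SIZE / SKETCH_PTE_SIZE) /* 512 */

/*
 * Hypothetical helper: number of last-level page table pages needed to
 * map `mapped_pages` user pages once each. Upper page table levels and
 * shared mappings are ignored, so this is only a lower bound.
 */
static uint64_t pte_pages_needed(uint64_t mapped_pages)
{
    /* Round up: a partially filled page table still costs a full page. */
    return (mapped_pages + SKETCH_PTES_PER_PT - 1) / SKETCH_PTES_PER_PT;
}

/* Convert a byte count to a page count under the same assumptions. */
static uint64_t bytes_to_pages(uint64_t bytes)
{
    return bytes / SKETCH_PAGE_SIZE;
}
```

Under those assumptions, mapping 64 GiB of ZONE_MOVABLE memory needs
pte_pages_needed(bytes_to_pages(64ULL << 30)) = 32768 page table pages,
which is 128 MiB that has to come from a lower zone and cannot be migrated.
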
The intent behind the higher zones having a large min watermark is so
that kswapd reclaims pages from there first to *potentially* release
pages from lower memory. By capping pages_min for ZONE_MOVABLE, there is
the potential for lower memory pressure to be higher and to reach a point
where a ZONE_MOVABLE page cannot be allocated simply because there isn't
enough low memory available. Once the lower zones are all unreclaimable
(e.g. page table pages, or the movable pages are not being reclaimed to free
the associated kernel structures), the system goes OOM.

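For a sense of scale, the sketch below compares a proportional min watermark
share with a capped one. It is a simplified user-space model, not the code in
__setup_per_zone_wmarks(). The function names, the choice of which zones
count toward base_pages, and the cap bounds passed in by the caller are all
assumptions made for the illustration.

```c
#include <stdint.h>

/*
 * Proportional share (model): the zone gets a slice of the global
 * pages_min in proportion to its size relative to base_pages, the total
 * managed pages of whichever zones the caller counts in the base.
 */
static uint64_t proportional_min(uint64_t pages_min, uint64_t zone_pages,
                                 uint64_t base_pages)
{
    return pages_min * zone_pages / base_pages;
}

/*
 * Capped share (model): 1/1024th of the zone, clamped to [lo, hi].
 * The bounds are parameters here; they are assumptions of the sketch.
 */
static uint64_t capped_min(uint64_t zone_pages, uint64_t lo, uint64_t hi)
{
    uint64_t min_pages = zone_pages / 1024;

    if (min_pages < lo)
        min_pages = lo;
    if (min_pages > hi)
        min_pages = hi;
    return min_pages;
}
```

With made-up numbers (4 KiB pages, a 60 GiB movable zone of 15728640 pages,
a 64 GiB base of 16777216 pages, and pages_min of 8192 pages),
proportional_min() gives 7680 pages (30 MiB) while capped_min() with bounds
32 and 128 gives 128 pages (512 KiB). The capped zone therefore reaches its
min watermark much later, and the lower zones see pressure sooner.
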
It's possible that there are safe adjustments that could be made that
would detect when there is no choice except to reclaim zone reclaimable
but it would be tricky and it's not this patch. The changelog for this
patch states:

    However zone movable will get its min share in
    __setup_per_zone_wmarks() which does not make any sense.

It makes sense: higher zone allocations indirectly pin pages in lower
zones, and there is a bias in reclaim to free the higher zone pages first
on the *possibility* that lower zone pages get indirectly released later.

--
Mel Gorman
SUSE Labs