[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87cz3peval.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Thu, 27 Apr 2023 13:41:38 +0800
From: "Huang, Ying" <ying.huang@...el.com>
To: Johannes Weiner <hannes@...xchg.org>
Cc: linux-mm@...ck.org, Kaiyang Zhao <kaiyang2@...cmu.edu>,
Mel Gorman <mgorman@...hsingularity.net>,
Vlastimil Babka <vbabka@...e.cz>,
David Rientjes <rientjes@...gle.com>,
linux-kernel@...r.kernel.org, kernel-team@...com
Subject: Re: [RFC PATCH 20/26] mm: vmscan: use compaction_suitable() check
in kswapd
Johannes Weiner <hannes@...xchg.org> writes:
> On Wed, Apr 26, 2023 at 09:30:23AM +0800, Huang, Ying wrote:
>> Johannes Weiner <hannes@...xchg.org> writes:
>>
>> > On Tue, Apr 25, 2023 at 11:12:28AM +0800, Huang, Ying wrote:
>> >> Johannes Weiner <hannes@...xchg.org> writes:
>> >>
>> >> > Kswapd currently bails on higher-order allocations with an open-coded
>> >> > check for whether it's reclaimed the compaction gap.
>> >> >
>> >> > compaction_suitable() is the customary interface to coordinate reclaim
>> >> > with compaction.
>> >> >
>> >> > Signed-off-by: Johannes Weiner <hannes@...xchg.org>
>> >> > ---
>> >> > mm/vmscan.c | 67 ++++++++++++++++++-----------------------------------
>> >> > 1 file changed, 23 insertions(+), 44 deletions(-)
>> >> >
>> >> > diff --git a/mm/vmscan.c b/mm/vmscan.c
>> >> > index ee8c8ca2e7b5..723705b9e4d9 100644
>> >> > --- a/mm/vmscan.c
>> >> > +++ b/mm/vmscan.c
>> >> > @@ -6872,12 +6872,18 @@ static bool pgdat_balanced(pg_data_t *pgdat, int order, int highest_zoneidx)
>> >> > if (!managed_zone(zone))
>> >> > continue;
>> >> >
>> >> > + /* Allocation can succeed in any zone, done */
>> >> > if (sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING)
>> >> > mark = wmark_pages(zone, WMARK_PROMO);
>> >> > else
>> >> > mark = high_wmark_pages(zone);
>> >> > if (zone_watermark_ok_safe(zone, order, mark, highest_zoneidx))
>> >> > return true;
>> >> > +
>> >> > + /* Allocation can't succeed, but enough order-0 to compact */
>> >> > + if (compaction_suitable(zone, order,
>> >> > + highest_zoneidx) == COMPACT_CONTINUE)
>> >> > + return true;
>> >>
>> >> Should we check the following first?
>> >>
>> >> order > 0 && zone_watermark_ok_safe(zone, 0, mark, highest_zoneidx)
>> >
>> > That's what compaction_suitable() does. It checks whether there are
>> > enough migration targets for compaction (COMPACT_CONTINUE) or whether
>> > reclaim needs to do some more work (COMPACT_SKIPPED).
>>
>> Yes. And I found that the watermark used in compaction_suitable() is
>> low_wmark_pages() or min_wmark_pages(), which doesn't match the
>> watermark here. Or did I miss something?
>
> Ahh, you're right, kswapd will bail prematurely. Compaction cannot
> reliably meet the high watermark with a low watermark scratch space.
>
> I'll add the order check before the suitable test, for clarity, and so
> that order-0 requests don't check the same thing twice.
>
> For the watermark, I'd make it an arg to compaction_suitable() and use
> whatever the reclaimer targets (high for kswapd, min for direct).
>
> However, there is a minor snag. compaction_suitable() currently has
> its own smarts regarding the watermark:
>
> /*
> * Watermarks for order-0 must be met for compaction to be able to
> * isolate free pages for migration targets. This means that the
> * watermark and alloc_flags have to match, or be more pessimistic than
> * the check in __isolate_free_page(). We don't use the direct
> * compactor's alloc_flags, as they are not relevant for freepage
> * isolation. We however do use the direct compactor's highest_zoneidx
> * to skip over zones where lowmem reserves would prevent allocation
> * even if compaction succeeds.
> * For costly orders, we require low watermark instead of min for
> * compaction to proceed to increase its chances.
> * ALLOC_CMA is used, as pages in CMA pageblocks are considered
> * suitable migration targets
> */
> watermark = (order > PAGE_ALLOC_COSTLY_ORDER) ?
> low_wmark_pages(zone) : min_wmark_pages(zone);
>
> Historically it has always checked low instead of min. Then Vlastimil
> changed it to min for non-costly orders here:
>
> commit 8348faf91f56371d4bada6fc5915e19580a15ffe
> Author: Vlastimil Babka <vbabka@...e.cz>
> Date: Fri Oct 7 16:58:00 2016 -0700
>
> mm, compaction: require only min watermarks for non-costly orders
>
> The __compaction_suitable() function checks the low watermark plus a
> compact_gap() gap to decide if there's enough free memory to perform
> compaction. Then __isolate_free_page uses low watermark check to decide
> if particular free page can be isolated. In the latter case, using low
> watermark is needlessly pessimistic, as the free page isolations are
> only temporary. For __compaction_suitable() the higher watermark makes
> sense for high-order allocations where more freepages increase the
> chance of success, and we can typically fail with some order-0 fallback
> when the system is struggling to reach that watermark. But for
> low-order allocation, forming the page should not be that hard. So
> using low watermark here might just prevent compaction from even trying,
> and eventually lead to OOM killer even if we are above min watermarks.
>
> So after this patch, we use min watermark for non-costly orders in
> __compaction_suitable(), and for all orders in __isolate_free_page().
>
> Lowering to min wasn't an issue for non-costly, but AFAICS there was
> no explicit testing for whether min would work for costly orders too.
>
> I'd propose trying it with min even for costly and see what happens.
>
> If it does regress, a better place to boost scratch space for costly
> orders might be compact_gap(), so I'd move it there.
>
> Does that sound reasonable?
Sounds good to me, Thanks!
Best Regards,
Huang, Ying
Powered by blists - more mailing lists