Message-ID: <20200226080426.GA3818@techsingularity.net>
Date: Wed, 26 Feb 2020 08:04:26 +0000
From: Mel Gorman <mgorman@...hsingularity.net>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Michal Hocko <mhocko@...e.com>, Vlastimil Babka <vbabka@...e.cz>,
Ivan Babrou <ivan@...udflare.com>,
Rik van Riel <riel@...riel.com>,
Linux-MM <linux-mm@...ck.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 0/3] Limit runaway reclaim due to watermark boosting
On Tue, Feb 25, 2020 at 06:51:30PM -0800, Andrew Morton wrote:
> On Tue, 25 Feb 2020 14:15:31 +0000 Mel Gorman <mgorman@...hsingularity.net> wrote:
>
> > Ivan Babrou reported the following
>
> http://lkml.kernel.org/r/CABWYdi1eOUD1DHORJxTsWPMT3BcZhz++xP1pXhT=x4SgxtgQZA@mail.gmail.com
> is helpful.
>
Noted for future reference.
> > Commit 1c30844d2dfe ("mm: reclaim small amounts of memory when
> > an external fragmentation event occurs") introduced undesired
> > effects in our environment.
> >
> > * NUMA with 2 x CPU
> > * 128GB of RAM
> > * THP disabled
> > * Upgraded from 4.19 to 5.4
> >
> > Before we saw free memory hover at around 1.4GB with no
> > spikes. After the upgrade we saw some machines decide that they
> > need a lot more than that, with frequent spikes above 10GB,
> > often only on a single numa node.
> >
> > There have been a few reports recently that might be watermark boost
> > related. Unfortunately, finding someone that can reproduce the problem
> > and test a patch has been problematic. This series intends to limit
> > potential damage only.
>
> It's problematic that we don't understand what's happening. And these
> palliatives can only reduce our ability to do that.
>
Not for certain, no, but we do know that there are conditions whereby
node 0 can end up reclaiming excessively for extended periods of time.
The available evidence does match a pattern whereby a lower zone on node
0 gets stuck in a boosted state.
> Rik seems to have the means to reproduce this (or something similar)
> and it seems Ivan can test patches three weeks hence.
If Rik can reproduce it, great, but I have a strong feeling that Ivan may
never be able to test this if it requires a production machine, which is
why I did not wait the three weeks.
> So how about a
> debug patch which will help figure out what's going on in there?
A debug patch would not help much in this case given that we already
have tracepoints. An ftrace capture containing mm_page_alloc_extfrag,
mm_vmscan_kswapd_wake, mm_vmscan_wakeup_kswapd and
mm_vmscan_node_reclaim_begin, taken for 30 seconds while the problem is
occurring, would be a big help. Ideally mm_vmscan_lru_shrink_inactive
would also be included to capture the reclaim priority, but the size of
the trace is what's going to be problematic.
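
For reference, something along these lines should gather it (untested,
and it assumes trace-cmd is installed with tracefs mounted in the usual
place; enabling the events directly through tracefs would work equally
well):

  trace-cmd record -e kmem:mm_page_alloc_extfrag \
                   -e vmscan:mm_vmscan_kswapd_wake \
                   -e vmscan:mm_vmscan_wakeup_kswapd \
                   -e vmscan:mm_vmscan_node_reclaim_begin \
                   sleep 30

Appending -e vmscan:mm_vmscan_lru_shrink_inactive is worth trying if the
resulting trace.dat stays a manageable size; it can then be sent along
or summarised with trace-cmd report.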
mm_page_alloc_extfrag would be correlated with the conditions that boost
the watermarks and the others would track what kswapd is doing to see if
it's persistently reclaiming. If it is, mm_vmscan_lru_shrink_inactive
would tell us whether it's persistently reclaiming at priority
DEF_PRIORITY - 2, which would prove the patch would at least mitigate
the problem.
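
As a rough guide, and assuming DEF_PRIORITY is still 12 so that
DEF_PRIORITY - 2 shows up as priority=10 in the event output, something
like

  trace-cmd report | grep mm_vmscan_lru_shrink_inactive | grep -c 'priority=10'

returning a large count, alongside a near-continuous stream of
mm_vmscan_kswapd_wake events for node 0, would point at kswapd being
stuck reclaiming because of the boost.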
It would be preferable, though, to have a description of a test case
that reproduces the problem so I can capture and analyse the trace myself.
It would also be something I could slot into a test grid to catch the
problem happening again in the future.
--
Mel Gorman
SUSE Labs