Message-Id: <20200225185130.6a32a8a6920d11b4c098e90e@linux-foundation.org>
Date:   Tue, 25 Feb 2020 18:51:30 -0800
From:   Andrew Morton <akpm@...ux-foundation.org>
To:     Mel Gorman <mgorman@...hsingularity.net>
Cc:     Michal Hocko <mhocko@...e.com>, Vlastimil Babka <vbabka@...e.cz>,
        Ivan Babrou <ivan@...udflare.com>,
        Rik van Riel <riel@...riel.com>,
        Linux-MM <linux-mm@...ck.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 0/3] Limit runaway reclaim due to watermark boosting

On Tue, 25 Feb 2020 14:15:31 +0000 Mel Gorman <mgorman@...hsingularity.net> wrote:

> Ivan Babrou reported the following

http://lkml.kernel.org/r/CABWYdi1eOUD1DHORJxTsWPMT3BcZhz++xP1pXhT=x4SgxtgQZA@mail.gmail.com
is helpful.

> 	Commit 1c30844d2dfe ("mm: reclaim small amounts of memory when
> 	an external fragmentation event occurs") introduced undesired
> 	effects in our environment.
> 
> 	  * NUMA with 2 x CPU
> 	  * 128GB of RAM
> 	  * THP disabled
> 	  * Upgraded from 4.19 to 5.4
> 
> 	Before we saw free memory hover at around 1.4GB with no
> 	spikes. After the upgrade we saw some machines decide that they
> 	need a lot more than that, with frequent spikes above 10GB,
> 	often only on a single numa node.
> 
> There have been a few reports recently that might be watermark boost
> related. Unfortunately, finding someone that can reproduce the problem
> and test a patch has been problematic.  This series intends to limit
> potential damage only.

It's problematic that we don't understand what's happening.  And these
palliatives can only reduce our ability to do that.

Rik seems to have the means to reproduce this (or something similar)
and it seems Ivan can test patches three weeks hence.  So how about a
debug patch which will help figure out what's going on in there?
