Message-ID: <CABWYdi1eOUD1DHORJxTsWPMT3BcZhz++xP1pXhT=x4SgxtgQZA@mail.gmail.com>
Date:   Fri, 7 Feb 2020 14:54:43 -0800
From:   Ivan Babrou <ivan@...udflare.com>
To:     linux-mm@...ck.org
Cc:     linux-kernel <linux-kernel@...r.kernel.org>,
        kernel-team <kernel-team@...udflare.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Vlastimil Babka <vbabka@...e.cz>
Subject: Reclaim regression after 1c30844d2dfe

This change from the 5.5 timeframe:

* https://github.com/torvalds/linux/commit/1c30844d2dfe

> mm: reclaim small amounts of memory when an external fragmentation event occurs

introduced undesired effects in our environment.

* NUMA with 2 x CPU (two nodes)
* 128GB of RAM
* THP disabled
* Upgraded from 4.19 to 5.4

Before the upgrade, free memory hovered at around 1.4GB with no spikes.
After the upgrade some machines decided they needed a lot more than
that, with frequent spikes above 10GB, often on a single NUMA node
only.
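
Per-node free memory can be checked via the standard sysfs node
meminfo files (the node numbering matches the kswapd threads below):

$ grep MemFree /sys/devices/system/node/node*/meminfo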

We can see that kswapd is quite active in balance_pgdat (it didn't
look like it slept at all):

$ ps uax | fgrep kswapd
root       1850 23.0  0.0      0     0 ?        R    Jan30 1902:24 [kswapd0]
root       1851  1.8  0.0      0     0 ?        S    Jan30 152:16 [kswapd1]
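
Where a kswapd thread is spending its time can be confirmed with
standard procfs interfaces (PID taken from the ps output above;
reading a kernel stack requires root):

$ sudo cat /proc/1850/stack
$ grep -E '^(pageoutrun|pgscan_kswapd|pgsteal_kswapd)' /proc/vmstat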

This in turn massively increased pressure on the page cache, which did
not go over well with services that depend on quick responses from a
local cache backed by solid-state storage.

Here's how it looked when I zeroed vm.watermark_boost_factor:

* https://imgur.com/a/6IZWicU
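
For reference, this is the plain sysctl interface (15000 is the
documented default; the sysctl.d file name is just one way to persist
the setting):

$ sysctl vm.watermark_boost_factor
vm.watermark_boost_factor = 15000
$ sudo sysctl -w vm.watermark_boost_factor=0
$ echo 'vm.watermark_boost_factor = 0' | sudo tee /etc/sysctl.d/99-vm-watermark-boost.conf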

IO subsided from a single SATA drive sitting at 100% busy repopulating
the page cache at 300MB/s down to under 100MB/s.
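
A minimal way to watch drive utilization and throughput while toggling
the knob, assuming sysstat's iostat is available:

$ iostat -xm 1 sda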

This sort of regression doesn't seem like a good thing.
