lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sun, 28 Nov 2021 04:26:35 +0900
From:   Alexey Avramov <hakavlad@...ox.lv>
To:     Mel Gorman <mgorman@...hsingularity.net>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Michal Hocko <mhocko@...e.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Rik van Riel <riel@...riel.com>,
        Mike Galbraith <efault@....de>,
        Darrick Wong <djwong@...nel.org>, regressions@...ts.linux.dev,
        Linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        Linux-MM <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/1] mm: vmscan: Reduce throttling due to a failure to
 make progress

I will present the results of the new tests here. 

TLDR;
=====
No one Mel's patch doesn't prevent stalls in my tests.

Test case:
Running firefox with youtube tab (basic desktop workload) and starting
$ for i in {1..10}; do tail /dev/zero; done
-- 1. with noswap and 2. with SwapTotal > MemTotal and swappiness=0.

I will log PSI metrics using latest psi2log (it's part of nohang [1]
package, src: [2]) and log some meminfo metrics using mem2log [3].

psi2log cmd:
$ psi2log -m 2 -i 1 -l /tmpfs/psi

mem2log cmd:
$ mem2log -i 1 -l /tmpfs/mem

The OS is Debian 9 x86_64 on HDD with ext4+LUKS, MemTotal: 11.5 GiB.

Repo with logs:
https://github.com/hakavlad/cache-tests/tree/main/516-reclaim-throttle

5.15
====

- With noswap, psi2log summary:

2021-11-27 20:51:14,018: Stall times for the last 141.6s:
2021-11-27 20:51:14,018: -----------
2021-11-27 20:51:14,018: some cpu     1.3s, avg 0.9%
2021-11-27 20:51:14,018: -----------
2021-11-27 20:51:14,018: some io      99.1s, avg 70.0%
2021-11-27 20:51:14,018: full io      93.4s, avg 66.0%
2021-11-27 20:51:14,018: -----------
2021-11-27 20:51:14,019: some memory  4.3s, avg 3.1%
2021-11-27 20:51:14,019: full memory  4.2s, avg 3.0%

-- average full memory stall is 0.4s per one `tail /dev/zero`.

full psi2log log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/515/noswap/psi

full mem2log log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/515/noswap/mem

- With swappiness=0 and SwapTotal > 0:

2021-11-27 20:57:30,217: Stall times for the last 141.6s:
2021-11-27 20:57:30,217: -----------
2021-11-27 20:57:30,217: some cpu     1.2s, avg 0.8%
2021-11-27 20:57:30,217: -----------
2021-11-27 20:57:30,217: some io      100.7s, avg 71.1%
2021-11-27 20:57:30,218: full io      94.8s, avg 67.0%
2021-11-27 20:57:30,218: -----------
2021-11-27 20:57:30,218: some memory  3.9s, avg 2.8%
2021-11-27 20:57:30,218: full memory  3.9s, avg 2.7%

-- 0.4s memory stall per one `tail /dev/zero`.

full psi2log log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/515/swappiness0/psi

full mem2log log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/515/swappiness0/mem


5.16-rc2 vanilla
================

- with noswap

psi2log summary:

2021-11-27 21:45:09,307: Stall times for the last 1086.5s:
2021-11-27 21:45:09,307: -----------
2021-11-27 21:45:09,307: some cpu     4.2s, avg 0.4%
2021-11-27 21:45:09,307: -----------
2021-11-27 21:45:09,307: some io      337.6s, avg 31.1%
2021-11-27 21:45:09,307: full io      325.2s, avg 29.9%
2021-11-27 21:45:09,307: -----------
2021-11-27 21:45:09,307: some memory  888.7s, avg 81.8%
2021-11-27 21:45:09,307: full memory  886.2s, avg 81.6%

average full memory stall is 87s per one `tail /dev/zero`.

full psi2log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/vanilla/noswap/psi

full mem2log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/vanilla/noswap/mem

- And even worst with swappiness=0:

2021-11-27 22:33:19,207: Stall times for the last 2677.7s:
2021-11-27 22:33:19,208: -----------
2021-11-27 22:33:19,208: some cpu     14.0s, avg 0.5%
2021-11-27 22:33:19,208: -----------
2021-11-27 22:33:19,208: some io      306.7s, avg 11.5%
2021-11-27 22:33:19,208: full io      292.4s, avg 10.9%
2021-11-27 22:33:19,208: -----------
2021-11-27 22:33:19,208: some memory  2504.8s, avg 93.5%
2021-11-27 22:33:19,209: full memory  2498.2s, avg 93.3%

What a horror!

In dmesg: a lot of
[] Purging GPU memory, 0 pages freed, 0 pages still pinned, 2112 pages left available.
(and dying journald)

psi2log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/vanilla/swappiness0/psi

mem2log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/vanilla/swappiness0/mem

dmesg:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/vanilla/dmesg

5.16-rc2 and 1st Mel's patch [4]
================================

- with noswap

Summary:

2021-11-27 23:28:23,707: Stall times for the last 1035.1s:
2021-11-27 23:28:23,707: -----------
2021-11-27 23:28:23,707: some cpu     5.6s, avg 0.5%
2021-11-27 23:28:23,708: -----------
2021-11-27 23:28:23,708: some io      336.5s, avg 32.5%
2021-11-27 23:28:23,708: full io      322.9s, avg 31.2%
2021-11-27 23:28:23,708: -----------
2021-11-27 23:28:23,708: some memory  851.1s, avg 82.2%
2021-11-27 23:28:23,708: full memory  848.3s, avg 82.0%

full psi2log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch1/noswap/psi

mem2log log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch1/noswap/mem

- with swappiness=0

Summary:

2021-11-27 23:50:25,722: Stall times for the last 1076.9s:
2021-11-27 23:50:25,722: -----------
2021-11-27 23:50:25,722: some cpu     12.2s, avg 1.1%
2021-11-27 23:50:25,722: -----------
2021-11-27 23:50:25,723: some io      232.1s, avg 21.6%
2021-11-27 23:50:25,723: full io      218.5s, avg 20.3%
2021-11-27 23:50:25,723: -----------
2021-11-27 23:50:25,723: some memory  957.5s, avg 88.9%
2021-11-27 23:50:25,723: full memory  951.2s, avg 88.3%

It's better than vanilla.

5.16-rc2 and 2nd Mel's patch [5]
================================

- with noswap

Summary:

2021-11-28 00:29:38,558: Stall times for the last 598.7s:
2021-11-28 00:29:38,558: -----------
2021-11-28 00:29:38,558: some cpu     5.2s, avg 0.9%
2021-11-28 00:29:38,558: -----------
2021-11-28 00:29:38,559: some io      235.0s, avg 39.3%
2021-11-28 00:29:38,559: full io      220.2s, avg 36.8%
2021-11-28 00:29:38,559: -----------
2021-11-28 00:29:38,559: some memory  439.0s, avg 73.3%
2021-11-28 00:29:38,559: full memory  437.5s, avg 73.1%

It's significant improvement, but far from the expected result.

mem:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch2/noswap/mem

psi:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch2/noswap/psi

In some cases the stall was quite short.

- with swappiness=0

Summary:

2021-11-28 00:53:41,659: Stall times for the last 1239.0s:
2021-11-28 00:53:41,659: -----------
2021-11-28 00:53:41,660: some cpu     9.4s, avg 0.8%
2021-11-28 00:53:41,660: -----------
2021-11-28 00:53:41,660: some io      206.5s, avg 16.7%
2021-11-28 00:53:41,660: full io      195.0s, avg 15.7%
2021-11-28 00:53:41,660: -----------
2021-11-28 00:53:41,660: some memory  1113.0s, avg 89.8%
2021-11-28 00:53:41,661: full memory  1107.1s, avg 89.3%

mem:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch2/swappiness0/mem

psi:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch2/swappiness0/psi


5.16-rc2 and 3rd Mel's patch [6] on top of 2nd
==============================================

- with noswap

Summary:

2021-11-28 01:35:26,077: Stall times for the last 949.4s:
2021-11-28 01:35:26,077: -----------
2021-11-28 01:35:26,078: some cpu     5.4s, avg 0.6%
2021-11-28 01:35:26,078: -----------
2021-11-28 01:35:26,078: some io      269.0s, avg 28.3%
2021-11-28 01:35:26,078: full io      259.4s, avg 27.3%
2021-11-28 01:35:26,078: -----------
2021-11-28 01:35:26,079: some memory  759.9s, avg 80.0%
2021-11-28 01:35:26,079: full memory  757.5s, avg 79.8%

mem:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch3/noswap/mem

psi:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch3/noswap/psi

- with swappiness=0

Summary:

2021-11-28 01:59:40,253: Stall times for the last 1127.8s:
2021-11-28 01:59:40,253: -----------
2021-11-28 01:59:40,253: some cpu     6.8s, avg 0.6%
2021-11-28 01:59:40,253: -----------
2021-11-28 01:59:40,253: some io      228.3s, avg 20.2%
2021-11-28 01:59:40,253: full io      218.7s, avg 19.4%
2021-11-28 01:59:40,253: -----------
2021-11-28 01:59:40,253: some memory  987.4s, avg 87.6%
2021-11-28 01:59:40,253: full memory  982.3s, avg 87.1%

mem log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch3/swappiness0/mem

psi log:
https://raw.githubusercontent.com/hakavlad/cache-tests/main/516-reclaim-throttle/516-rc2/patch3/swappiness0/psi

¯\_(ツ)_/¯

Should I test anything else?

[1] https://github.com/hakavlad/nohang
[2] https://raw.githubusercontent.com/hakavlad/nohang/master/src/psi2log
[3] https://github.com/hakavlad/mem2log
[4] https://lore.kernel.org/lkml/20211124115007.GG3366@techsingularity.net/
[5] https://lore.kernel.org/lkml/20211124143303.GH3366@techsingularity.net/
[6] https://lore.kernel.org/lkml/20211126162416.GK3366@techsingularity.net/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ