Message-ID: <20200225171242.GA496421@sultan-box.localdomain>
Date:   Tue, 25 Feb 2020 09:12:42 -0800
From:   Sultan Alsawaf <sultan@...neltoast.com>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     Mel Gorman <mgorman@...e.de>, Dave Hansen <dave.hansen@...el.com>,
        Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, Johannes Weiner <hannes@...xchg.org>
Subject: Re: [PATCH] mm: Stop kswapd early when nothing's waiting for it to
 free pages

On Tue, Feb 25, 2020 at 10:09:45AM +0100, Michal Hocko wrote:
> On Fri 21-02-20 13:08:24, Sultan Alsawaf wrote:
> [...]
> > Both of these logs are attached in a tarball.
> 
> Thanks! First of all
> $ grep pswp vmstat.1582318979
> pswpin 0
> pswpout 0
> 
> suggests that you do not have any swap storage, right?

Correct. I'm not using any swap (and it should not be necessary to make Linux mm
work of course). If I were to divide my RAM in half and use one half as swap,
do you think the results would be different? IMO they shouldn't be.

> The amount of anonymous memory is not really high (~560MB) but file LRU
> is _really_ low (~3MB), unevictable list is at ~200MB. That gets us to
> ~760M of memory which is 74% of the memory. Please note that your mem=2G
> setup gives you only 1G of memory in fact (based on the zone_info you
> have posted). That is not something unusual but the amount of the page
> cache is worrying because I would expect heavy thrashing because most
> of the executables are going to require major faults. Anonymous memory
> is not swapped out obviously so there is no other option than to refault
> constantly.

I noticed that only 1G was available as well. Perhaps direct reclaim wasn't
attempted due to the zone_reclaimable_pages() check, though I don't think direct
reclaim would've been particularly helpful in this case (see below).
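
For reference, the figures in the quoted paragraph above can be checked with a
few lines of C. This is only a sketch of the arithmetic: the sizes are the
approximate values from the quote, rounded to whole megabytes, and percent_of()
is a hypothetical helper, not anything from the kernel.

```c
/*
 * Integer percentage of part_mb relative to total_mb, rounded down.
 * Hypothetical helper for illustration only.
 */
static unsigned int percent_of(unsigned long part_mb, unsigned long total_mb)
{
	return (unsigned int)(part_mb * 100 / total_mb);
}

/* Approximate sizes quoted in this thread, in megabytes. */
static const unsigned long anon_mb = 560;        /* anonymous memory */
static const unsigned long file_lru_mb = 3;      /* file LRU (page cache) */
static const unsigned long unevictable_mb = 200; /* unevictable list */
static const unsigned long usable_mb = 1024;     /* usable RAM with mem=2G */
```

With these values, percent_of(anon_mb + unevictable_mb, usable_mb) gives 74,
matching the quoted figure, while percent_of(file_lru_mb, usable_mb) rounds
down to 0, which shows how little page cache was left to reclaim.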

> kswapd has some feedback mechanism to back off when the zone is hopeless
> from the reclaim point of view AFAIR but it seems it has failed in this
> particular situation. It should have relied on the direct reclaim and
> eventually trigger the OOM killer. Your patch has worked around this by
> bailing out from the kswapd reclaim too early so a part of the page
> cache required for the code to move on would stay resident and move
> further.
> 
> The proper fix should, however, check the amount of reclaimable pages
> and back off if they cannot meet the target IMO. We cannot rely on the
> general reclaimability here because that could really be thrashing.

Yes, my guess was that thrashing out pages used by the running programs was the
cause of my freezes, but I didn't think of making kswapd back off in a
different way.

Right now I don't see any such back-off mechanism in kswapd. Also, if we add
this into kswapd, we would need to plug it into the direct reclaim path as well,
no? I don't think direct reclaim would help with the situation I've run into;
although it wouldn't be as bad as letting kswapd evict pages to the high
watermark, it would still cause page thrashing that would just be capped to the
number of pages a direct reclaimer is looking to steal.
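
To make the idea concrete, here is a rough userspace sketch of the kind of
check being discussed: back off when the reclaimable pages cannot cover the
shortfall to the reclaim target. It is not kernel code and not a proposed
patch. struct zone_snapshot, reclaimable_pages() and kswapd_should_back_off()
are hypothetical names, and counting anonymous pages as reclaimable only when
swap is present is an assumption of this model.

```c
#include <stdbool.h>

/* Hypothetical snapshot of one zone's state; not a kernel structure. */
struct zone_snapshot {
	unsigned long free_pages;       /* pages currently free */
	unsigned long high_wmark;       /* reclaim target (high watermark) */
	unsigned long reclaimable_file; /* pages on the file LRU */
	unsigned long reclaimable_anon; /* pages on the anon LRU */
	bool has_swap;                  /* is any swap storage configured? */
};

/* Assumption: anon pages only count as reclaimable when swap exists. */
static unsigned long reclaimable_pages(const struct zone_snapshot *z)
{
	unsigned long nr = z->reclaimable_file;

	if (z->has_swap)
		nr += z->reclaimable_anon;
	return nr;
}

/*
 * Return true when reclaim cannot plausibly reach the target: the zone is
 * below its high watermark and the reclaimable pages are fewer than the
 * shortfall, so further reclaim would mostly evict pages that get
 * refaulted right away.
 */
static bool kswapd_should_back_off(const struct zone_snapshot *z)
{
	if (z->free_pages >= z->high_wmark)
		return false; /* already balanced */

	return reclaimable_pages(z) < z->high_wmark - z->free_pages;
}
```

In a swapless setup like mine, a zone with a tiny file LRU and a shortfall
larger than that LRU would make this check return true. The same predicate
would presumably have to be consulted from the direct reclaim path for it to
help there as well.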

Considering that my patch remedies this issue for me without invoking the OOM
killer, a proper solution should produce the same or better results. I don't
think the OOM killer should have been triggered in this case.

Sultan
