linux-kernel - Re: [RFC PATCH] mm/vmscan: fix high cpu usage of kswapd if there

Message-ID: <20170223101901.tr2j7d3p6vt55knn@dhcp22.suse.cz>
Date:   Thu, 23 Feb 2017 11:19:03 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Johannes Weiner <hannes@...xchg.org>
Cc:     Jia He <hejianet@...il.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Vlastimil Babka <vbabka@...e.cz>,
        Minchan Kim <minchan@...nel.org>,
        Rik van Riel <riel@...hat.com>
Subject: Re: [RFC PATCH] mm/vmscan: fix high cpu usage of kswapd if there

On Wed 22-02-17 15:16:57, Johannes Weiner wrote:
[...]
> Can we simply count the number of balance_pgdat() runs that didn't
> reclaim anything and have kswapd sleep after MAX_RECLAIM_RETRIES?
> 
> And a follow-up: once it gives up, when should kswapd return to work?
> We used to reset NR_PAGES_SCANNED whenever a page gets freed. But
> that's a branch in a common allocator path, just to recover kswapd - a
> latency tool, not a necessity for functional correctness - from a
> situation that's exceedingly rare. How about we leave it
> disabled until a direct reclaimer manages to free something?

Yes, this makes sense to me and it looks much better than the solution
proposed here. There are some theoretical corner cases, such as a heavy
metadata and GFP_NOFS workload, which wouldn't be able to reclaim from
FS shrinkers and where kswapd would be really helpful. But that would
need a general solution of its own.
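
To make the idea concrete, here is a minimal user-space sketch of the
give-up and wake-up logic as I read the proposal. It is not kernel code
and not a patch: struct node_model, kswapd_failures and the three helper
functions are hypothetical names made up for illustration, the value 16
for MAX_RECLAIM_RETRIES is an assumption, and resetting the counter on a
successful kswapd run is my assumption as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed value, for illustration only. */
#define MAX_RECLAIM_RETRIES 16

/* Hypothetical stand-in for the per-node state kswapd works on. */
struct node_model {
	int kswapd_failures;	/* consecutive runs that reclaimed nothing */
};

/* True once kswapd has given up on this node and should stay asleep. */
static bool kswapd_given_up(const struct node_model *node)
{
	assert(node != NULL);
	return node->kswapd_failures >= MAX_RECLAIM_RETRIES;
}

/* Record the outcome of one balance_pgdat()-style run. */
static void note_kswapd_run(struct node_model *node,
			    unsigned long nr_reclaimed)
{
	assert(node != NULL);
	if (nr_reclaimed > 0)
		node->kswapd_failures = 0;	/* progress: start over */
	else if (node->kswapd_failures < MAX_RECLAIM_RETRIES)
		node->kswapd_failures++;	/* saturate at the limit */
}

/*
 * Record the outcome of a direct reclaim attempt. Only a direct
 * reclaimer that actually frees something re-enables kswapd, so the
 * common page-free path needs no extra branch.
 */
static void note_direct_reclaim(struct node_model *node,
				unsigned long nr_reclaimed)
{
	assert(node != NULL);
	if (nr_reclaimed > 0)
		node->kswapd_failures = 0;
}
```

In this model a caller would check kswapd_given_up() before waking
kswapd, and the only way out of the given-up state is a direct reclaimer
reporting nr_reclaimed > 0.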

I also welcome removing NR_PAGES_SCANNED, because it was just too
ephemeral to be actually useful when debugging reclaim behavior.
I think we can accomplish much more with the existing tracepoints. I
would just split that out into a separate follow-up patch.

Thanks!
-- 
Michal Hocko
SUSE Labs