Message-Id: <20170228214007.5621-1-hannes@cmpxchg.org>
Date:   Tue, 28 Feb 2017 16:39:58 -0500
From:   Johannes Weiner <hannes@...xchg.org>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     Jia He <hejianet@...il.com>, Michal Hocko <mhocko@...e.cz>,
        Mel Gorman <mgorman@...e.de>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, kernel-team@...com
Subject: [PATCH 0/9] mm: kswapd spinning on unreclaimable nodes - fixes and cleanups

Hi,

Jia reported a scenario in which the kswapd of a node indefinitely
spins at 100% CPU usage. We have seen similar cases at Facebook.

The kernel's current method of judging its ability to reclaim a node
(or whether to back off and sleep) is based on the amount of scanned
pages in proportion to the amount of reclaimable pages. In Jia's and
our scenarios, there are no reclaimable pages in the node, however,
and the condition for backing off is never met. Kswapd busyloops in an
attempt to restore the watermarks while having nothing to work with.

This series reworks the definition of an unreclaimable node based not
on scanning but on whether kswapd is able to actually reclaim pages in
MAX_RECLAIM_RETRIES (16) consecutive runs. This is the same criterion
the page allocator uses for giving up on direct reclaim and invoking
the OOM killer. If it cannot free any pages, kswapd will go to sleep
and leave further attempts to direct reclaim invocations, which will
either make progress and re-enable kswapd, or invoke the OOM killer.
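
As a rough illustration of the criterion described above, here is a small
userspace sketch, not the code from the patches. Only MAX_RECLAIM_RETRIES
and its value of 16 come from this cover letter; the struct, its field, and
the function names are hypothetical. It also assumes that any run which
reclaims at least one page resets the count, which is how "consecutive" is
read here.

```c
#include <stdbool.h>

#define MAX_RECLAIM_RETRIES 16	/* value stated in the cover letter */

/* Hypothetical per-node state; the name and field are illustrative only. */
struct node_state {
	int failed_runs;	/* consecutive kswapd runs that freed nothing */
};

/* Record the outcome of one kswapd run on the node. */
void kswapd_run_done(struct node_state *node, unsigned long nr_reclaimed)
{
	if (nr_reclaimed > 0)
		node->failed_runs = 0;		/* progress: start counting afresh */
	else if (node->failed_runs < MAX_RECLAIM_RETRIES)
		node->failed_runs++;		/* another fruitless run */
}

/* True once kswapd should stop trying and go to sleep. */
bool node_unreclaimable(const struct node_state *node)
{
	return node->failed_runs >= MAX_RECLAIM_RETRIES;
}

/* Direct reclaim freed pages on this node: let kswapd run again. */
void direct_reclaim_progress(struct node_state *node)
{
	node->failed_runs = 0;
}
```

In this model the decision depends only on whether pages were actually
freed, so a node with nothing reclaimable reaches the limit after 16 runs
instead of being rescanned forever.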

Patch #1 fixes the immediate problem Jia reported; the remaining patches
are smaller fixlets, cleanups, and an overall phasing out of the old method.

Patch #6 is the odd one out. It's a nice cleanup to get_scan_count(),
and directly related to #5, but in itself not relevant to the series.

If the whole series is too ambitious for 4.11, I would consider the
first three patches fixes and the rest cleanups.

Thanks

 include/linux/mmzone.h |   3 +-
 mm/internal.h          |   7 +-
 mm/migrate.c           |   3 -
 mm/page_alloc.c        |  39 +++--------
 mm/vmscan.c            | 169 ++++++++++++++++++-----------------------------
 mm/vmstat.c            |  24 ++-----
 6 files changed, 89 insertions(+), 156 deletions(-)
