linux-kernel - [RFC PATCH 0/5] Remove dependency on congestion


Message-Id: <20210920085436.20939-1-mgorman@techsingularity.net>
Date:   Mon, 20 Sep 2021 09:54:31 +0100
From:   Mel Gorman <mgorman@...hsingularity.net>
To:     Linux-MM <linux-mm@...ck.org>
Cc:     NeilBrown <neilb@...e.de>, Theodore Ts'o <tytso@....edu>,
        Andreas Dilger <adilger.kernel@...ger.ca>,
        "Darrick J . Wong" <djwong@...nel.org>,
        Matthew Wilcox <willy@...radead.org>,
        Michal Hocko <mhocko@...e.com>,
        Dave Chinner <david@...morbit.com>,
        Rik van Riel <riel@...riel.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Johannes Weiner <hannes@...xchg.org>,
        Jonathan Corbet <corbet@....net>,
        Linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Mel Gorman <mgorman@...hsingularity.net>
Subject: [RFC PATCH 0/5] Remove dependency on congestion_wait in mm/

Cc list similar to "congestion_wait() and GFP_NOFAIL" as they're loosely
related.

This is a prototype series that removes all calls to congestion_wait
in mm/ and deletes wait_iff_congested. It's not a clever
implementation but congestion_wait has been broken for a long time
(https://lore.kernel.org/linux-mm/45d8b7a6-8548-65f5-cccf-9f451d4ae3d4@kernel.dk/).
Even if it worked, it was never a great idea. While excessive
dirty/writeback pages at the tail of the LRU are one reason that
reclaim may be slow, reclaim can also stall or fail for other reasons
(elevated references, too many pages isolated, excessive LRU
contention etc).

This series replaces the reclaim conditions with event-driven ones:

o If there are too many dirty/writeback pages, sleep until a timeout
  or enough pages get cleaned
o If too many pages are isolated, sleep until enough isolated pages
  are either reclaimed or put back on the LRU
o If no progress is being made, let direct reclaim tasks sleep until
  another task makes progress

This has been lightly tested only and the testing was useless as the
relevant code was not executed. The workload configurations I had that
used to trigger these corner cases no longer work (yey?) and I'll need
to implement a new synthetic workload. If someone is aware of a realistic
workload that forces reclaim activity to the point where reclaim stalls
then kindly share the details.

-- 
2.31.1

Mel Gorman (5):
  mm/vmscan: Throttle reclaim until some writeback completes if
    congested
  mm/vmscan: Throttle reclaim and compaction when too many pages are
    isolated
  mm/vmscan: Throttle reclaim when no progress is being made
  mm/writeback: Throttle based on page writeback instead of congestion
  mm/page_alloc: Remove the throttling logic from the page allocator

 include/linux/backing-dev.h      |   1 -
 include/linux/mmzone.h           |  12 ++++
 include/trace/events/vmscan.h    |  38 +++++++++++
 include/trace/events/writeback.h |   7 --
 mm/backing-dev.c                 |  48 --------------
 mm/compaction.c                  |   2 +-
 mm/filemap.c                     |   1 +
 mm/internal.h                    |  11 ++++
 mm/memcontrol.c                  |  10 +--
 mm/page-writeback.c              |  11 +++-
 mm/page_alloc.c                  |  26 ++------
 mm/vmscan.c                      | 110 ++++++++++++++++++++++++++++---
 mm/vmstat.c                      |   1 +
 13 files changed, 180 insertions(+), 98 deletions(-)
