Message-id: <163529540259.8576.9186192891154927096@noble.neil.brown.name>
Date: Wed, 27 Oct 2021 11:43:22 +1100
From: "NeilBrown" <neilb@...e.de>
To: "Mel Gorman" <mgorman@...hsingularity.net>
Cc: "Andrew Morton" <akpm@...ux-foundation.org>,
"Theodore Ts'o" <tytso@....edu>,
"Andreas Dilger" <adilger.kernel@...ger.ca>,
"Darrick J . Wong" <djwong@...nel.org>,
"Matthew Wilcox" <willy@...radead.org>,
"Michal Hocko" <mhocko@...e.com>,
"Dave Chinner" <david@...morbit.com>,
"Rik van Riel" <riel@...riel.com>,
"Vlastimil Babka" <vbabka@...e.cz>,
"Johannes Weiner" <hannes@...xchg.org>,
"Jonathan Corbet" <corbet@....net>,
"Linux-MM" <linux-mm@...ck.org>,
"Linux-fsdevel" <linux-fsdevel@...r.kernel.org>,
"LKML" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4 0/8] Remove dependency on congestion_wait in mm/
On Sat, 23 Oct 2021, Mel Gorman wrote:
> On Fri, Oct 22, 2021 at 10:26:30PM +1100, NeilBrown wrote:
> > On Fri, 22 Oct 2021, Mel Gorman wrote:
> > > On Fri, Oct 22, 2021 at 12:15:10PM +1100, NeilBrown wrote:
> > >
> > > > In general, I still don't like the use of wake_up_all(), though it won't
> > > > cause incorrect behaviour.
> > > >
> > >
> > > Removing wake_up_all would be tricky.
> >
> > I think there is a misunderstanding. Removing wake_up_all() is as
> > simple as
> > s/wake_up_all/wake_up/
> >
> > If you used prepare_to_wait_exclusive(), then wake_up() would only wake
> > one waiter, while wake_up_all() would wake all of them.
> > As you use prepare_to_wait(), wake_up() will wake all waiters - as will
> > wake_up_all().
> >
>
> Ok, yes, there was a misunderstanding. I thought you were suggesting a
> move to exclusive wakeups. I felt that the wake_up_all was explicit in
> terms of intent and that I really meant for all tasks to wake instead of
> one at a time.
Fair enough. Thanks for changing it :-)
But this prompts me to wonder if exclusive wakeups would be a good idea
- which is a useful springboard to try to understand the code better.
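(To recap the distinction in code - this is just a minimal sketch of
the waitqueue API, not the actual mm/vmscan.c code; 'wqh' and
'timeout' are placeholders:)

	DEFINE_WAIT(wait);

	/* non-exclusive: wake_up() and wake_up_all() both wake this task */
	prepare_to_wait(&wqh, &wait, TASK_UNINTERRUPTIBLE);
	schedule_timeout(timeout);
	finish_wait(&wqh, &wait);

	/* exclusive: wake_up() stops after waking one exclusive waiter */
	prepare_to_wait_exclusive(&wqh, &wait, TASK_UNINTERRUPTIBLE);
	schedule_timeout(timeout);
	finish_wait(&wqh, &wait);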
For VMSCAN_THROTTLE_ISOLATED, exclusive wakeups probably are a good idea.
One pattern for reliable exclusive wakeups is for any thread that
receives a wake-up to then consider sending the next one.
Two places receive VMSCAN_THROTTLE_ISOLATED wakeups, and both then call
too_many_isolated() which - on success - sends another wakeup - before
the caller has had a chance to isolate anything. If, instead, the
wakeup were sent sometime later, after pages were isolated but before
the caller (isolate_migratepages_block() or shrink_inactive_list())
returned, then we would get an orderly progression of threads running
through that code.
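Roughly this shape (a sketch only - the waitqueue name and the
too_many_isolated() arguments are placeholders for whatever
mm/vmscan.c would actually use):

	DEFINE_WAIT(wait);

	for (;;) {
		prepare_to_wait_exclusive(&wqh, &wait, TASK_UNINTERRUPTIBLE);
		if (!too_many_isolated(...))
			break;
		schedule();
	}
	finish_wait(&wqh, &wait);

	/* ... isolate pages ... */

	/* pass the wakeup on to the next exclusive waiter, if any */
	if (waitqueue_active(&wqh))
		wake_up(&wqh);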
For VMSCAN_THROTTLE_WRITEBACK, things are a little less straightforward.
There are two different places that wait for the wakeup, and a wake_up
is sent to all waiters after a time proportional to the number of
waiters. It might make sense to wake one thread per time unit?
That might work well for do_writepages - every SWAP_CLUSTER_MAX writes
triggers one wakeup.
I'm less sure that it would work for shrink_node(). Maybe the
shrink_node() waiters could be non-exclusive, so they get woken as soon
as SWAP_CLUSTER_MAX writes complete, while the do_writepages() waiters
are exclusive and get woken one at a time.
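Something like this on the write-completion side, say (purely
illustrative - 'nr_throttled_written' and 'throttle_wq' are made-up
names, not fields pgdat actually has):

	/* send one wakeup per SWAP_CLUSTER_MAX completed writes */
	if (atomic_add_return(nr_written, &pgdat->nr_throttled_written) >=
	    SWAP_CLUSTER_MAX) {
		atomic_sub(SWAP_CLUSTER_MAX, &pgdat->nr_throttled_written);
		/*
		 * wake_up() wakes every non-exclusive waiter (shrink_node())
		 * but at most one exclusive waiter (do_writepages()).
		 */
		wake_up(&pgdat->throttle_wq);
	}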
For VMSCAN_THROTTLE_NOPROGRESS .... I don't understand.
If one zone isn't making "enough" progress, we throttle before moving on
to the next zone. So we delay processing of the next zone, and only
indirectly delay re-processing of the current congested zone.
Maybe it makes sense, but I don't see it yet. I note that the commit
message says "it's messy". I can't argue with that!
I'll follow up with patches to clarify what I am thinking about the
first two. I'm not proposing the patches, just presenting them as part
of improving my understanding.
Thanks,
NeilBrown