Message-ID: <20230530155519.GB97194@cmpxchg.org>
Date: Tue, 30 May 2023 11:55:19 -0400
From: Johannes Weiner <hannes@...xchg.org>
To: Chris Li <chrisl@...nel.org>
Cc: Domenico Cerasuolo <cerasuolodomenico@...il.com>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
sjenning@...hat.com, ddstreet@...e.org, vitaly.wool@...sulko.com,
yosryahmed@...gle.com, kernel-team@...com
Subject: Re: [PATCH] mm: zswap: shrink until can accept

On Tue, May 30, 2023 at 07:51:23AM -0700, Chris Li wrote:
> Thanks for pointing out -ENOMEM shouldn't be persistent.
> Points taken.
>
> The original point of not retrying the persistent error
> still holds.

Okay, but what persistent errors are you referring to?

Aside from -ENOMEM, writeback_entry will fail on concurrent swap
invalidation or a racing swapin fault. In both cases we should
absolutely keep trying other entries until the goal is met.

> > Should it be fixed before merging this patch? I don't think the
> > ordering matters. Right now the -ENOMEM case invokes OOM, so it isn't
> > really persistent either. Retrying a few times in that case certainly
> > doesn't seem to make things worse.
>
> If you already know the error is persistent, retrying is wasting
> CPU. It can potentially hold locks during the retry, which can
> slow someone else down.

That's a bit of a truism. How does this pertain to the zswap reclaim
situation?

> > > > As I was writing to Yosry, the differentiation would be a great improvement
> > > > here, I just have a patch set in the queue that moves the inner reclaim loop
> > > > from the zpool driver up to zswap. With that, updating the error handling
> > > > would be more convenient as it would be done in one place instead of three.
> > >
> > > This has tricky complications as well. The current shrink interface
> > > doesn't support continuing from the previous error position. If you want
> > > to avoid a repeat attempt if the page has a writeback error, you kind of
> > > need a way to skip that page.
> >
> > A page that fails to reclaim is put back to the tail of the LRU, so
> > for all intents and purposes it will be skipped. In the rare and
>
> Do you mean the page is treated as hot again?
>
> Wouldn't that be undesirable from the app's point of view?

That's current backend LRU behavior. Is it optimal? That's certainly
debatable. But it's tangential to this patch. The point is that
capping retries to a fixed number of failures works correctly as a
safety precaution and introduces no (new) undesirable behavior.

It's entirely moot once we refactor the backend page LRU to the zswap
entry LRU. The only time we'll fail to reclaim an entry is if we race
with something already freeing it, so it doesn't really matter where
we put it.

> > extreme case where it's the only page left on the list, I again don't
> > see how retrying a few times will make the situation worse.
> >
> > In practice, IMO there is little upside in trying to be more
> > discerning about the error codes. Simple seems better here.
>
> Just trying to think about what should be the precise loop termination
> condition here.
>
> I still feel blindly trying a few times is a very imprecise condition.

The precise termination condition is when can_accept() returns true
again. The safety cap is only added as a precaution to avoid infinite
loops if something goes wrong or unexpected happens, now or in the
future.
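
To make that concrete, the loop shape under discussion looks roughly
like the following (a paraphrased sketch, not the exact patch;
MAX_RECLAIM_RETRIES stands in for whatever value the failure cap ends
up being):

	/*
	 * Sketch only: reclaim until the pool can accept pages again,
	 * with a cap on failed attempts as a safety valve.
	 */
	static void shrink_worker(struct work_struct *w)
	{
		struct zswap_pool *pool = container_of(w, typeof(*pool),
						       shrink_work);
		int ret, failures = 0;

		do {
			ret = zpool_shrink(pool->zpool, 1, NULL);
			if (ret) {
				zswap_reject_reclaim_fail++;
				/* backend error that retrying won't fix */
				if (ret != -EAGAIN)
					break;
				/* too many failed attempts, back off */
				if (++failures == MAX_RECLAIM_RETRIES)
					break;
			}
			cond_resched();
		} while (!zswap_can_accept());

		zswap_pool_put(pool);
	}

The do/while condition is the real termination criterion; the failure
counter only bounds how many failed attempts are tolerated before
giving up.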