Date:   Wed, 15 Nov 2023 12:12:36 -0800
From:   Yosry Ahmed <yosryahmed@...gle.com>
To:     贺中坤 <hezhongkun.hzk@...edance.com>
Cc:     akpm@...ux-foundation.org, hannes@...xchg.org, nphamcs@...il.com,
        sjenning@...hat.com, ddstreet@...e.org, vitaly.wool@...sulko.com,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Ying <ying.huang@...el.com>
Subject: Re: [External] Re: [PATCH] mm:zswap: fix zswap entry reclamation
 failure in two scenarios

On Wed, Nov 15, 2023 at 4:53 AM 贺中坤 <hezhongkun.hzk@...edance.com> wrote:
>
> > For case (1), I think a cleaner solution would be to move the
> > zswap_invalidate() call from swap_range_free() (which is called after
> > the cached slots are freed) to __swap_entry_free_locked() if the usage
> > goes to 0. I actually think conceptually this makes sense not just for
> > zswap_invalidate(), but also for the arch call, memcg uncharging, etc.
> > Slots caching is a swapfile optimization that should be internal to
> > swapfile code. Once a swap entry is freed (i.e. swap count is 0 AND
> > not in the swap cache), all the hooks should be called (memcg, zswap,
> > arch, ..) as the swap entry is effectively freed. The fact that
> > swapfile code internally batches and caches slots should be
> > transparent to other parts of MM. I am not sure if the calls can just
> > be moved or if there are underlying assumptions in the implementation
> > that would be broken, but it feels like the right thing to do.
>
> Good idea, this is indeed a clean solution. I'll try it in another
> patch later.
>
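
Roughly what I had in mind, just to illustrate the direction
(completely untested, and I have not checked whether calling
zswap_invalidate() under the locks that __swap_entry_free_locked()
runs with is actually safe):

	static unsigned char __swap_entry_free_locked(struct swap_info_struct *p,
						      unsigned long offset,
						      unsigned char usage)
	{
		...
		usage = count | has_cache;
		if (!usage) {
			/*
			 * Swap count hit zero and the entry is not in the
			 * swap cache: it is effectively freed, so run the
			 * per-entry hooks here instead of waiting for the
			 * batched swap_range_free().
			 */
			zswap_invalidate(p->type, offset);
			/* same argument for arch_swap_invalidate_page(),
			 * the memcg uncharge, etc. */
		}
		...
		return usage;
	}
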
> >
> > For case (2), I don't think zswap can just decide to free the entry.
> >
> > In that case, the page is in the swap cache pointing to a swp_entry
> > which has a corresponding zswap entry, and the page is clean because
> > it is already in swap/zswap, so we don't need to write it out again
> > unless it is redirtied. If zswap just drops the entry, and reclaim
> > tries to reclaim the page in the swap cache, it will drop the page
> > assuming that there is a copy in swap/zswap (because it is clean). Now
> > we lost all copies of the page.
> >
> > Am I missing something?
> >
>
> Ah, my bad. I missed the step of marking the page as dirty.
> Please have a look: just like with zswap_exclusive_loads_enabled,
> set the page dirty so that it can be paged out again.
>        if (!page_was_allocated) {
>                /* The page was already in the swap cache. */
>                if (!count) {
>                        /*
>                         * Swap count is zero: mark the page dirty so
>                         * reclaim writes it out again instead of just
>                         * dropping it.
>                         */
>                        set_page_dirty(page);
>                        ret = 0;
>                } else {
>                        ret = -EEXIST;
>                }
>                put_page(page);
>        }

I think we may need to try to lock the folio. Otherwise we may race
with reclaim reading the dirty bit before we set it.
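
Something like this, as a rough sketch on top of your snippet
(untested; whether -EAGAIN is the right thing to return when the
trylock fails would need more thought):

	if (!page_was_allocated) {
		if (!count) {
			/*
			 * Hold the page lock so we do not race with
			 * reclaim checking the dirty bit.
			 */
			if (trylock_page(page)) {
				set_page_dirty(page);
				unlock_page(page);
				ret = 0;
			} else {
				ret = -EAGAIN;
			}
		} else {
			ret = -EEXIST;
		}
		put_page(page);
	}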

Taking a step back, this seems like we are going behind the back of
exclusive loads. We "should" keep the page in zswap, since exclusive loads are
disabled and the page is not yet invalidated from zswap (the swap
entry is still in use). What you are trying to do here is sneakily
drop the page from zswap as if we wrote it back, but we didn't. We
just know that it was already loaded from zswap. We are essentially
making the previous load exclusive retroactively.

Is there a reason why exclusive loads cannot be enabled to achieve the
same result in the (arguably) correct way?
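
(For reference, and assuming I am remembering the knob correctly: it
can be flipped at runtime via

	echo Y > /sys/module/zswap/parameters/exclusive_loads

or made the default at build time with
CONFIG_ZSWAP_EXCLUSIVE_LOADS_DEFAULT_ON.)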

> Thanks for your feedback, Yosry.
