lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+CLi1h3qbGYrZfXggbZYBkPthTt3GC0h6HCdOxSTO6-1qTVLA@mail.gmail.com>
Date:   Fri, 9 Jun 2023 13:05:53 +0200
From:   Domenico Cerasuolo <cerasuolodomenico@...il.com>
To:     Johannes Weiner <hannes@...xchg.org>
Cc:     vitaly.wool@...sulko.com, minchan@...nel.org,
        senozhatsky@...omium.org, yosryahmed@...gle.com,
        linux-mm@...ck.org, ddstreet@...e.org, sjenning@...hat.com,
        nphamcs@...il.com, akpm@...ux-foundation.org,
        linux-kernel@...r.kernel.org, kernel-team@...a.com
Subject: Re: [RFC PATCH v2 6/7] mm: zswap: simplify writeback function

On Thu, Jun 8, 2023 at 6:48 PM Johannes Weiner <hannes@...xchg.org> wrote:
>
> Hi Domenico,
>
> On Tue, Jun 06, 2023 at 04:56:10PM +0200, Domenico Cerasuolo wrote:
> > Previously, in zswap, the writeback function was passed to zpool drivers
> > for their usage in calling the writeback operation. However, since the
> > drivers did not possess references to entries and the function was
> > specifically designed to work with handles, the writeback process has
> > been modified to occur directly within zswap.
>
> I'm having trouble parsing this sentence.
>
> > Consequently, this change allows for some simplification of the
> > writeback function, taking into account the updated workflow.
>
> How about:
>
> zswap_writeback_entry() used to be a callback for the backends, which
> don't know about struct zswap_entry.
>
> Now that the only user is the generic zswap LRU reclaimer, it can be
> simplified: pass the pinned zswap_entry directly, and consolidate the
> refcount management in the shrink function.

Sounds clearer, will update.

>
> > Signed-off-by: Domenico Cerasuolo <cerasuolodomenico@...il.com>
> > ---
> >  mm/zswap.c | 69 ++++++++++++++----------------------------------------
> >  1 file changed, 17 insertions(+), 52 deletions(-)
> >
> > diff --git a/mm/zswap.c b/mm/zswap.c
> > index 2831bf56b168..ef8604812352 100644
> > --- a/mm/zswap.c
> > +++ b/mm/zswap.c
> > @@ -250,7 +250,8 @@ static bool zswap_has_pool;
> >       pr_debug("%s pool %s/%s\n", msg, (p)->tfm_name,         \
> >                zpool_get_type((p)->zpool))
> >
> > -static int zswap_writeback_entry(struct zpool *pool, unsigned long handle);
> > +static int zswap_writeback_entry(struct zswap_entry *entry, struct zswap_header *zhdr,
> > +                              struct zswap_tree *tree);
> >  static int zswap_pool_get(struct zswap_pool *pool);
> >  static void zswap_pool_put(struct zswap_pool *pool);
> >
> > @@ -632,15 +633,21 @@ static int zswap_shrink(struct zswap_pool *pool)
> >       }
> >       spin_unlock(&tree->lock);
> >
> > -     ret = zswap_writeback_entry(pool->zpool, lru_entry->handle);
> > +     ret = zswap_writeback_entry(lru_entry, zhdr, tree);
> >
> >       spin_lock(&tree->lock);
> >       if (ret) {
> >               spin_lock(&pool->lru_lock);
> >               list_move(&lru_entry->lru, &pool->lru);
> >               spin_unlock(&pool->lru_lock);
>
> This could use a comment.
>
>                 /* Writeback failed, put entry back on LRU */
>
> > +             zswap_entry_put(tree, tree_entry);
>
> This put is a common factor in both branches, so you can consolidate.
>
> > +     } else {
> > +             /* free the local reference */
> > +             zswap_entry_put(tree, tree_entry);
> > +             /* free the entry if it's not been invalidated*/
>
> Missing space between text and */
>
> > +             if (lru_entry == zswap_rb_search(&tree->rbroot, swpoffset))
> > +                     zswap_entry_put(tree, tree_entry);
> >       }
> > -     zswap_entry_put(tree, tree_entry);
> >       spin_unlock(&tree->lock);
>
> The success path freeing (hopefully the common path) is now
> unfortunately hidden in fairly deep indentation. I think this would be
> better with a goto for the error case.
>
> All together, something like this?
>
>         if (ret) {
>                 /* Writeback failed, put entry back on LRU */
>                 ...
>                 goto put_unlock;
>         }
>
>         /*
>          * Writeback started successfully, the page now belongs to the
>          * swapcache. Drop the base ref from the tree - unless invalidate
>          * already took it out while we had the tree->lock released for IO.
>          */
>         if (lru_entry == zswap_rb_search(&tree->rb_root, swpoffset))
>                 zswap_entry_put(tree, entry);
>
> put_unlock:
>         /* Drop local reference */
>         zswap_entry_put(tree, tree_entry);
>         spin_unlock(&tree->lock);
>
>         return ret ? -EAGAIN : 0;
>

This feedback overlaps with the on in patch 1/7, I'm integrating them
so that in patch #1, the invalidation check is done only with rb search
and a first `goto unlock` for error.
Then here the base reference-drop is added after the `ret` check, and
errors go to `put_unlock`:

if (ret) {
        /* Writeback failed, put entry back on LRU */
        spin_lock(&pool->lru_lock);
        list_move(&entry->lru, &pool->lru);
        spin_unlock(&pool->lru_lock);
        goto put_unlock;
}

/* Check for invalidate() race */
if (entry != zswap_rb_search(&tree->rbroot, swpoffset))
        goto put_unlock;

        /* Drop base reference */
        zswap_entry_put(tree, entry);

put_unlock:
        /* Drop local reference */
        zswap_entry_put(tree, entry);
unlock:
        spin_unlock(&tree->lock);
return ret ? -EAGAIN : 0;

> Btw, it's unsettling that we drop the tree reference without
> explicitly removing the entry for the tree. We rely on the final put
> that frees the entry to do tree removal, but this doesn't happen if
> somebody else is holding a reference; the entry can remain in the tree
> long after we've officially handed over ownership of the data to
> swapcache. TTBOMK there are currently no bugs because of that, but
> it's a data corruption waiting to happen.
>
> This should be:
>
>         /*
>          * Writeback started successfully, the page now belongs to the
>          * swapcache. Drop the entry from zswap - unless invalidate already
>          * took it out while we had the tree->lock released for IO.
>          */
>         if (entry == zswap_rb_search(&tree->rb_root, swpoffset))
>                 zswap_invalidate_entry(tree, entry);
>
> Would you care to send a follow-up patch?

Makes total sense, thanks will send a patch for this.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ