Message-ID: <CAKEwX=NJi63CPEe2bXD4_+Uur2KUmCV-L1kVtQr_+mgKbMb3eg@mail.gmail.com>
Date: Thu, 3 Apr 2025 08:05:57 -0700
From: Nhat Pham <nphamcs@...il.com>
To: Sergey Senozhatsky <senozhatsky@...omium.org>
Cc: akpm@...ux-foundation.org, hannes@...xchg.org, yosry.ahmed@...ux.dev,
chengming.zhou@...ux.dev, sj@...nel.org, linux-mm@...ck.org,
kernel-team@...a.com, linux-kernel@...r.kernel.org, gourry@...rry.net,
ying.huang@...ux.alibaba.com, jonathan.cameron@...wei.com,
dan.j.williams@...el.com, linux-cxl@...r.kernel.org, minchan@...nel.org
Subject: Re: [PATCH v2] zsmalloc: prefer the original page's node for
compressed data
On Wed, Apr 2, 2025 at 8:55 PM Sergey Senozhatsky
<senozhatsky@...omium.org> wrote:
>
> On (25/04/02 13:44), Nhat Pham wrote:
> [..]
> > @@ -1981,10 +1981,15 @@ static int recompress_slot(struct zram *zram, u32 index, struct page *page,
> > * We are holding per-CPU stream mutex and entry lock so better
> > * avoid direct reclaim. Allocation error is not fatal since
> > * we still have the old object in the mem_pool.
> > + *
> > + * XXX: technically, the node we really want here is the node that holds
> > + * the original compressed data. But that would require us to modify
> > + * zsmalloc API to return this information. For now, we will make do with
> > + * the node of the page allocated for recompression.
> > */
> > handle_new = zs_malloc(zram->mem_pool, comp_len_new,
> > GFP_NOIO | __GFP_NOWARN |
> > - __GFP_HIGHMEM | __GFP_MOVABLE);
> > + __GFP_HIGHMEM | __GFP_MOVABLE, page_to_nid(page));
>
> This works for me. I saw in other threads (and sorry, I'm on a sick
> leave now, can't reply fast) that "zram people need to fix it"/etc.
> I guess I'm one of those zram people and I don't think I run any
> multi-node NUMA systems, so that problem probably has never occurred
> to me. We can add a simple API extension to lookup node-id by
> zsmalloc handle, if anyone really needs it, but I'm okay with the
> status quo (and XXX) for the time being.
Yeah, tbh this path is not that much different from the status quo
(the page is also allocated via the task's policy here), and all the
other paths are fixed, so this still seems like an overall improvement :)
If someone uses recompress + multi-node, we can fix it then. It's not
impossible, just more code :)
>
> [..]
> > -unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
> > +unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp,
> > + const int nid)
> > {
>
> I'm not the zspool maintainer, but if I may ask, for new zsmalloc code my
> preference is to follow the kernel styles (yes, the old code doesn't follow
> any at all, whenever I touch the old code I fix it).
>
> Do you mind if we do something like below? (at least for zsmalloc)
I don't mind at all! Thanks for fixing it, Sergey.
I saw that Andrew already added this as a fixlet. Thanks, Andrew!
>
> With this
> Acked-by: Sergey Senozhatsky <senozhatsky@...omium.org> #zram and zsmalloc
>
>
> ---
>
> diff --git a/include/linux/zpool.h b/include/linux/zpool.h
> index 697525cb00bd..369ef068fad8 100644
> --- a/include/linux/zpool.h
> +++ b/include/linux/zpool.h
> @@ -22,7 +22,7 @@ const char *zpool_get_type(struct zpool *pool);
> void zpool_destroy_pool(struct zpool *pool);
>
> int zpool_malloc(struct zpool *pool, size_t size, gfp_t gfp,
> - unsigned long *handle, const int nid);
> + unsigned long *handle, const int nid);
>
> void zpool_free(struct zpool *pool, unsigned long handle);
>
> @@ -64,7 +64,7 @@ struct zpool_driver {
> void (*destroy)(void *pool);
>
> int (*malloc)(void *pool, size_t size, gfp_t gfp,
> - unsigned long *handle, const int nid);
> + unsigned long *handle, const int nid);
> void (*free)(void *pool, unsigned long handle);
>
> void *(*obj_read_begin)(void *pool, unsigned long handle,
> diff --git a/mm/zpool.c b/mm/zpool.c
> index b99a7c03e735..0a71d03369f1 100644
> --- a/mm/zpool.c
> +++ b/mm/zpool.c
> @@ -239,7 +239,7 @@ const char *zpool_get_type(struct zpool *zpool)
> * Returns: 0 on success, negative value on error.
> */
> int zpool_malloc(struct zpool *zpool, size_t size, gfp_t gfp,
> - unsigned long *handle, const int nid)
> + unsigned long *handle, const int nid)
> {
> return zpool->driver->malloc(zpool->pool, size, gfp, handle, nid);
> }
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index daa16ce4cf07..70406ac94bbd 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -452,7 +452,7 @@ static void zs_zpool_destroy(void *pool)
> }
>
> static int zs_zpool_malloc(void *pool, size_t size, gfp_t gfp,
> - unsigned long *handle, const int nid)
> + unsigned long *handle, const int nid)
> {
> *handle = zs_malloc(pool, size, gfp, nid);
>
> @@ -1033,8 +1033,8 @@ static void create_page_chain(struct size_class *class, struct zspage *zspage,
> * Allocate a zspage for the given size class
> */
> static struct zspage *alloc_zspage(struct zs_pool *pool,
> - struct size_class *class,
> - gfp_t gfp, const int nid)
> + struct size_class *class,
> + gfp_t gfp, const int nid)
> {
> int i;
> struct zpdesc *zpdescs[ZS_MAX_PAGES_PER_ZSPAGE];
> @@ -1333,7 +1333,7 @@ static unsigned long obj_malloc(struct zs_pool *pool,
> * Allocation requests with size > ZS_MAX_ALLOC_SIZE will fail.
> */
> unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp,
> - const int nid)
> + const int nid)
> {
> unsigned long handle;
> struct size_class *class;