Message-ID: <c591405b-3034-4a55-8664-e0c8ea393b79@gmail.com>
Date: Mon, 5 Aug 2024 12:06:24 +0800
From: Alex Shi <seakeel@...il.com>
To: Vishal Moola <vishal.moola@...il.com>, alexs@...nel.org
Cc: Vitaly Wool <vitaly.wool@...sulko.com>, Miaohe Lin
<linmiaohe@...wei.com>, Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org, minchan@...nel.org,
willy@...radead.org, senozhatsky@...omium.org, david@...hat.com,
42.hyeyoo@...il.com, Yosry Ahmed <yosryahmed@...gle.com>, nphamcs@...il.com
Subject: Re: [PATCH v4 01/22] mm/zsmalloc: add zpdesc memory descriptor for
zswap.zpool
On 8/3/24 2:52 AM, Vishal Moola wrote:
> On Mon, Jul 29, 2024 at 07:25:13PM +0800, alexs@...nel.org wrote:
>> From: Alex Shi <alexs@...nel.org>
>
> I've been busy with other things, so I haven't been able to review this
> until now. Thanks to both you and Hyeonggon for working on this memdesc :)
Hi Vishal,
Thanks a lot for your comments!
My pleasure! :)
>
>> The 1st patch introduces the new memory descriptor zpdesc and renames
>> zspage.first_page to zspage.first_zpdesc, no functional change.
>>
>> We removed PG_owner_priv_1 since it was moved to zspage after
>> commit a41ec880aa7b ("zsmalloc: move huge compressed obj from
>> page to zspage").
>>
>> And keep the memcg_data member, since as Yosry pointed out:
>> "When the pages are freed, put_page() -> folio_put() -> __folio_put() will call
>> mem_cgroup_uncharge(). The latter will call folio_memcg() (which reads
>> folio->memcg_data) to figure out if uncharging needs to be done.
>>
>> There are also other similar code paths that will check
>> folio->memcg_data. It is currently expected to be present for all
>> folios. So until we have custom code paths per-folio type for
>> allocation/freeing/etc, we need to keep folio->memcg_data present and
>> properly initialized."
>>
>> Originally-by: Hyeonggon Yoo <42.hyeyoo@...il.com>
>> Signed-off-by: Alex Shi <alexs@...nel.org>
>> ---
>> mm/zpdesc.h | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++
>> mm/zsmalloc.c | 21 ++++++++--------
>> 2 files changed, 76 insertions(+), 11 deletions(-)
>> create mode 100644 mm/zpdesc.h
>>
>> diff --git a/mm/zpdesc.h b/mm/zpdesc.h
>> new file mode 100644
>> index 000000000000..2dbef231f616
>> --- /dev/null
>> +++ b/mm/zpdesc.h
>> @@ -0,0 +1,66 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/* zpdesc.h: zswap.zpool memory descriptor
>> + *
>> + * Written by Alex Shi <alexs@...nel.org>
>> + * Hyeonggon Yoo <42.hyeyoo@...il.com>
>> + */
>> +#ifndef __MM_ZPDESC_H__
>> +#define __MM_ZPDESC_H__
>> +
>> +/*
>> + * struct zpdesc - Memory descriptor for zpool memory, now is for zsmalloc
>> + * @flags: Page flags, PG_private: identifies the first component page
>> + * @lru: Indirectly used by page migration
>> + * @mops: Used by page migration
>> + * @next: Next zpdesc in a zspage in zsmalloc zpool
>> + * @handle: For huge zspage in zsmalloc zpool
>> + * @zspage: Pointer to zspage in zsmalloc
>> + * @memcg_data: Memory Control Group data.
>> + *
>
> I think it's a good idea to include comments for the padding (namely what
> aliases with it in struct page) here as well. It doesn't hurt, and will
> make them easier to remove in the future.
>
>> + * This struct overlays struct page for now. Do not modify without a good
>> + * understanding of the issues.
>> + */
>> +struct zpdesc {
>> + unsigned long flags;
>> + struct list_head lru;
>> + struct movable_operations *mops;
>> + union {
>> + /* Next zpdescs in a zspage in zsmalloc zpool */
>> + struct zpdesc *next;
>> + /* For huge zspage in zsmalloc zpool */
>> + unsigned long handle;
>> + };
>> + struct zspage *zspage;
>
> I like using pointers here, although I think the comments should be more
> precise about what the purpose of the pointer is. Maybe something like
> "Points to the zspage this zpdesc is a part of" or something.
I will change the comments for this member. Thanks!
>
>> + unsigned long _zp_pad_1;
>> +#ifdef CONFIG_MEMCG
>> + unsigned long memcg_data;
>> +#endif
>> +};
>
> You should definitely fold your additions to the struct from patch 17
> into this patch. It makes it easier to review, and better for anyone
> looking at the commit log in the future.
Thanks! I will move the struct part from patch 17 here.
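For anyone who wants to see how the overlay checks behave when members are added, the idea can be tried in userspace. This is only a rough sketch under assumptions: struct demo_page, struct demo_desc, DEMO_MATCH and demo_page_desc are made-up stand-ins, not the kernel's struct page layout and not the code from this series. It also shows the kind of "what this aliases" comment on a padding member that was suggested above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct page; NOT the real kernel layout. */
struct demo_page {
	uintptr_t flags;
	void *mapping;
	uintptr_t index;
	uintptr_t private;
	uintptr_t refcount;
};

/* Hypothetical descriptor that overlays struct demo_page. */
struct demo_desc {
	uintptr_t flags;
	void *mops;			/* aliases demo_page.mapping */
	union {
		struct demo_desc *next;	/* aliases demo_page.index */
		uintptr_t handle;	/* aliases demo_page.index */
	};
	void *owner;			/* aliases demo_page.private */
	uintptr_t _pad_1;		/* aliases demo_page.refcount */
};

/* Fail the build if an overlaid member drifts away from its partner. */
#define DEMO_MATCH(pg, dd) \
	static_assert(offsetof(struct demo_page, pg) == \
		      offsetof(struct demo_desc, dd), "offset mismatch")

DEMO_MATCH(flags, flags);
DEMO_MATCH(mapping, mops);
DEMO_MATCH(index, next);
DEMO_MATCH(index, handle);
DEMO_MATCH(private, owner);
DEMO_MATCH(refcount, _pad_1);
#undef DEMO_MATCH
static_assert(sizeof(struct demo_desc) <= sizeof(struct demo_page),
	      "descriptor must not be larger than the page it overlays");

/* Const-preserving cast, in the same style as page_zpdesc(). */
#define demo_page_desc(p) (_Generic((p), \
	const struct demo_page *: (const struct demo_desc *)(p), \
	struct demo_page *: (struct demo_desc *)(p)))
```

Adding a member to demo_desc without a matching DEMO_MATCH line still compiles, so the review value is in having the struct and its checks in the same patch.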
>
>> +#define ZPDESC_MATCH(pg, zp) \
>> + static_assert(offsetof(struct page, pg) == offsetof(struct zpdesc, zp))
>> +
>> +ZPDESC_MATCH(flags, flags);
>> +ZPDESC_MATCH(lru, lru);
>> +ZPDESC_MATCH(mapping, mops);
>> +ZPDESC_MATCH(index, next);
>> +ZPDESC_MATCH(index, handle);
>> +ZPDESC_MATCH(private, zspage);
>> +#ifdef CONFIG_MEMCG
>> +ZPDESC_MATCH(memcg_data, memcg_data);
>> +#endif
>> +#undef ZPDESC_MATCH
>> +static_assert(sizeof(struct zpdesc) <= sizeof(struct page));
>> +
>> +#define zpdesc_page(zp) (_Generic((zp), \
>> + const struct zpdesc *: (const struct page *)(zp), \
>> + struct zpdesc *: (struct page *)(zp)))
>> +
>> +#define zpdesc_folio(zp) (_Generic((zp), \
>> + const struct zpdesc *: (const struct folio *)(zp), \
>> + struct zpdesc *: (struct folio *)(zp)))
>> +
>> +#define page_zpdesc(p) (_Generic((p), \
>> + const struct page *: (const struct zpdesc *)(p), \
>> + struct page *: (struct zpdesc *)(p)))
>> +
>> +#endif
>
> I don't think we need both page and folio cast functions for zpdescs.
> Sticking to pages will probably suffice (and be easiest) since all APIs
> zsmalloc cares about are already defined.
>
> We can stick to 1 "middle-man" descriptor for zpdescs since zsmalloc
> uses those pages as space to track zspages and nothing more. We'll likely
> end up completely removing it from zsmalloc once we can allocate
> memdescs on their own: It seems most (if not all) of the "indirect" members
> of zpdesc are used as indicators to the rest of core-mm telling them not to
> mess with that memory.
Yes, skipping the folio part was also my first attempt, but I found that we
could reduce the object size of zsmalloc.o by 6.3%, from 37.2KB to 34.9KB, if
we use the folio series of lock functions and folio_get/put. The saving comes
from skipping the compound_head check.
So I wrapped them carefully in zpdesc series functions in zpdesc.h.
They should be easy to replace when we use memdescs in the future. Could we
keep them for a while?
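To make the compound_head point concrete, here is a rough userspace sketch. It is not the patch code: every demo_ name is a made-up stand-in, and "locking" is reduced to setting a flag bit just to keep it short. The page flavour has to resolve the head page on every call, while the folio flavour can touch the flags directly, which is the check a zpdesc wrapper gets to skip.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PG_LOCKED 0x1UL

/* Simplified stand-in for struct page (assumption, not the real layout). */
struct demo_page {
	uintptr_t flags;
	uintptr_t compound_head;	/* bit 0 set: tail page, rest is head pointer */
};

/* A folio is never a tail page, so it needs no head lookup. */
struct demo_folio {
	struct demo_page page;
};

/* Opaque to callers; only ever cast to demo_folio in this sketch. */
struct demo_zpdesc {
	uintptr_t flags;
	uintptr_t _pad;			/* aliases demo_page.compound_head */
};

static inline struct demo_page *demo_compound_head(struct demo_page *page)
{
	uintptr_t head = page->compound_head;

	if (head & 1)
		return (struct demo_page *)(head - 1);
	return page;
}

/* Page flavour: pays for the head lookup on every call. */
static inline void demo_lock_page(struct demo_page *page)
{
	demo_compound_head(page)->flags |= DEMO_PG_LOCKED;
}

/* Folio flavour: no lookup needed. */
static inline void demo_folio_lock(struct demo_folio *folio)
{
	folio->page.flags |= DEMO_PG_LOCKED;
}

/* Hypothetical wrapper: zsmalloc code would only see the zpdesc name. */
static inline void demo_zpdesc_lock(struct demo_zpdesc *zpdesc)
{
	demo_folio_lock((struct demo_folio *)zpdesc);
}
```

With the folio call hidden behind a wrapper like this, switching to a memdesc based implementation later only means changing the wrapper body.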
Thanks
Alex