Message-ID: <CAGsJ_4xdvGjZ9YZnc0mk3bDfPCwxdpF_5bhcbca09j=-KBM9Mg@mail.gmail.com>
Date: Wed, 15 Oct 2025 04:42:36 +0800
From: Barry Song <21cnbao@...il.com>
To: Lei Liu <liulei.rjpt@...o.com>
Cc: Kairui Song <ryncsn@...il.com>, Michal Hocko <mhocko@...e.com>,
David Rientjes <rientjes@...gle.com>, Shakeel Butt <shakeel.butt@...ux.dev>,
Andrew Morton <akpm@...ux-foundation.org>, Kemeng Shi <shikemeng@...weicloud.com>,
Nhat Pham <nphamcs@...il.com>, Baoquan He <bhe@...hat.com>, Chris Li <chrisl@...nel.org>,
Johannes Weiner <hannes@...xchg.org>, Roman Gushchin <roman.gushchin@...ux.dev>,
Muchun Song <muchun.song@...ux.dev>, David Hildenbrand <david@...hat.com>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, "Liam R. Howlett" <Liam.Howlett@...cle.com>,
Vlastimil Babka <vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>, Suren Baghdasaryan <surenb@...gle.com>,
Brendan Jackman <jackmanb@...gle.com>, Zi Yan <ziy@...dia.com>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>, Chen Yu <yu.c.chen@...el.com>,
Hao Jia <jiahao1@...iang.com>, "Kirill A. Shutemov" <kas@...nel.org>,
Usama Arif <usamaarif642@...il.com>, Oleg Nesterov <oleg@...hat.com>,
Christian Brauner <brauner@...nel.org>, Mateusz Guzik <mjguzik@...il.com>,
Steven Rostedt <rostedt@...dmis.org>, Andrii Nakryiko <andrii@...nel.org>,
Al Viro <viro@...iv.linux.org.uk>, Fushuai Wang <wangfushuai@...du.com>,
"open list:MEMORY MANAGEMENT - OOM KILLER" <linux-mm@...ck.org>, open list <linux-kernel@...r.kernel.org>,
"open list:CONTROL GROUP - MEMORY RESOURCE CONTROLLER (MEMCG)" <cgroups@...r.kernel.org>
Subject: Re: [PATCH v0 0/2] mm: swap: Gather swap entries and batch async release
>
> Hi Barry
>
> Thank you for your question. Here is the issue we are encountering:
>
> Flame graph of time distribution for douyin process exit (~400MB swapped):
> do_notify_resume 3.89%
> get_signal 3.89%
> do_signal_exit 3.88%
> do_exit 3.88%
> mmput 3.22%
> exit_mmap 3.22%
> unmap_vmas 3.08%
> unmap_page_range 3.07%
> free_swap_and_cache_nr 1.31%****
> swap_entry_range_free 1.17%****
> zram_slot_free_notify 1.11%****
If 1.11/1.31, or about 85% of free_swap_and_cache_nr, comes from the zram
free path (zram_slot_free_notify), it's clear that the swap/mm core is not
the right place for this optimization. Doing it there adds too much
complexity, for example synchronization between swapoff and your new threads.
> zram_free_hw_entry_dc 0.43%
> free_zspage[zsmalloc] 0.09%
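
For reference, the ratio above can be reproduced from the quoted flame graph. This is only a minimal sketch: the percentages are copied from the profile, and the dictionary and helper name (share_of) are illustrative, not part of any kernel or profiling tool. It assumes the flame graph percentages are inclusive (each frame includes its callees).

```python
# Inclusive sample shares copied from the quoted flame graph
# (percent of total samples during process exit).
samples = {
    "free_swap_and_cache_nr": 1.31,
    "swap_entry_range_free": 1.17,
    "zram_slot_free_notify": 1.11,
}


def share_of(child: str, parent: str) -> float:
    """Return the child's inclusive time as a percentage of the parent's."""
    return 100.0 * samples[child] / samples[parent]


# Fraction of free_swap_and_cache_nr spent inside the zram free callback.
zram_share = share_of("zram_slot_free_notify", "free_swap_and_cache_nr")
print(f"zram share of free_swap_and_cache_nr: {zram_share:.1f}%")  # 84.7%
```

Under that assumption, only about 15% of free_swap_and_cache_nr is spent outside zram, which is the part the swap/mm core could influence.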
Thanks
Barry