[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <11842dc2.2720.198b2647df8.Coremail.yangshiguang1011@163.com>
Date: Sat, 16 Aug 2025 18:19:47 +0800 (CST)
From: yangshiguang <yangshiguang1011@....com>
To: "Giorgi Tchankvetadze" <giorgitchankvetadze1997@...il.com>
Cc: "Harry Yoo" <harry.yoo@...cle.com>, vbabka@...e.cz,
akpm@...ux-foundation.org, cl@...two.org, rientjes@...gle.com,
roman.gushchin@...ux.dev, glittao@...il.com, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, stable@...r.kernel.org,
yangshiguang <yangshiguang@...omi.com>
Subject: Re:Re: [PATCH v2] mm: slub: avoid wake up kswapd in
set_track_prepare
At 2025-08-16 17:33:00, "Giorgi Tchankvetadze" <giorgitchankvetadze1997@...il.com> wrote:
>Rather than masking to GFP_NOWAIT—which still allows kswapd to be
>woken—let’s strip every reclaim bit (`__GFP_RECLAIM` and
>`__GFP_KSWAPD_RECLAIM`) and add `__GFP_NORETRY | __GFP_NOWARN`. That
>guarantees we never enter the slow path that calls `wakeup_kswapd()`, so
>the timer-base lock can’t be re-entered.
>
Hi Giorgi,
Thanks for your advice.
But more scenarios allow kswapd to be woken up.
It might be better to allow kswapd to be woken up.
>On 8/16/2025 12:25 PM, Harry Yoo wrote:
>> On Thu, Aug 14, 2025 at 07:16:42PM +0800, yangshiguang1011@....com wrote:
>>> From: yangshiguang <yangshiguang@...omi.com>
>>>
>>> From: yangshiguang <yangshiguang@...omi.com>
>>>
>>> set_track_prepare() can incur lock recursion.
>>> The issue is that it is called from hrtimer_start_range_ns
>>> holding the per_cpu(hrtimer_bases)[n].lock, but when enabled
>>> CONFIG_DEBUG_OBJECTS_TIMERS, may wake up kswapd in set_track_prepare,
>>> and try to hold the per_cpu(hrtimer_bases)[n].lock.
>>>
>>> So avoid waking up kswapd.The oops looks something like:
>>
>> Hi yangshiguang,
>>
>> In the next revision, could you please elaborate the commit message
>> to reflect how this change avoids waking up kswapd?
>>
>>> BUG: spinlock recursion on CPU#3, swapper/3/0
>>> lock: 0xffffff8a4bf29c80, .magic: dead4ead, .owner: swapper/3/0, .owner_cpu: 3
>>> Hardware name: Qualcomm Technologies, Inc. Popsicle based on SM8850 (DT)
>>> Call trace:
>>> spin_bug+0x0
>>> _raw_spin_lock_irqsave+0x80
>>> hrtimer_try_to_cancel+0x94
>>> task_contending+0x10c
>>> enqueue_dl_entity+0x2a4
>>> dl_server_start+0x74
>>> enqueue_task_fair+0x568
>>> enqueue_task+0xac
>>> do_activate_task+0x14c
>>> ttwu_do_activate+0xcc
>>> try_to_wake_up+0x6c8
>>> default_wake_function+0x20
>>> autoremove_wake_function+0x1c
>>> __wake_up+0xac
>>> wakeup_kswapd+0x19c
>>> wake_all_kswapds+0x78
>>> __alloc_pages_slowpath+0x1ac
>>> __alloc_pages_noprof+0x298
>>> stack_depot_save_flags+0x6b0
>>> stack_depot_save+0x14
>>> set_track_prepare+0x5c
>>> ___slab_alloc+0xccc
>>> __kmalloc_cache_noprof+0x470
>>> __set_page_owner+0x2bc
>>> post_alloc_hook[jt]+0x1b8
>>> prep_new_page+0x28
>>> get_page_from_freelist+0x1edc
>>> __alloc_pages_noprof+0x13c
>>> alloc_slab_page+0x244
>>> allocate_slab+0x7c
>>> ___slab_alloc+0x8e8
>>> kmem_cache_alloc_noprof+0x450
>>> debug_objects_fill_pool+0x22c
>>> debug_object_activate+0x40
>>> enqueue_hrtimer[jt]+0xdc
>>> hrtimer_start_range_ns+0x5f8
>>> ...
>>>
>>> Signed-off-by: yangshiguang <yangshiguang@...omi.com>
>>> Fixes: 5cf909c553e9 ("mm/slub: use stackdepot to save stack trace in objects")
>>> ---
>>> v1 -> v2:
>>> propagate gfp flags to set_track_prepare()
>>>
>>> [1] https://lore.kernel.org/all/20250801065121.876793-1-yangshiguang1011@163.com
>>> ---
>>> mm/slub.c | 21 +++++++++++----------
>>> 1 file changed, 11 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/mm/slub.c b/mm/slub.c
>>> index 30003763d224..dba905bf1e03 100644
>>> --- a/mm/slub.c
>>> +++ b/mm/slub.c
>>> @@ -962,19 +962,20 @@ static struct track *get_track(struct kmem_cache *s, void *object,
>>> }
>>>
>>> #ifdef CONFIG_STACKDEPOT
>>> -static noinline depot_stack_handle_t set_track_prepare(void)
>>> +static noinline depot_stack_handle_t set_track_prepare(gfp_t gfp_flags)
>>> {
>>> depot_stack_handle_t handle;
>>> unsigned long entries[TRACK_ADDRS_COUNT];
>>> unsigned int nr_entries;
>>> + gfp_flags &= GFP_NOWAIT;
>>
>> Is there any reason to downgrade it to GFP_NOWAIT when the gfp flag allows
>> direct reclamation?
>>
>>> nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 3);
>>> - handle = stack_depot_save(entries, nr_entries, GFP_NOWAIT);
>>> + handle = stack_depot_save(entries, nr_entries, gfp_flags);
>>>
>>> return handle;
>>> }
>>> #else
>>> -static inline depot_stack_handle_t set_track_prepare(void)
>>> +static inline depot_stack_handle_t set_track_prepare(gfp_t gfp_flags)
>>> {
>>> return 0;
>>> }
>>> @@ -4422,7 +4423,7 @@ static noinline void free_to_partial_list(
>>> depot_stack_handle_t handle = 0;
>>>
>>> if (s->flags & SLAB_STORE_USER)
>>> - handle = set_track_prepare();
>>> + handle = set_track_prepare(GFP_NOWAIT);
>>
>> I don't think it is safe to use GFP_NOWAIT during free?
>>
>> Let's say fill_pool() -> kmem_alloc_batch() fails to allocate an object
>> and then free_object_list() frees allocated objects,
>> set_track_prepare(GFP_NOWAIT) may wake up kswapd, and the same deadlock
>> you reported will occur.
>>
>> So I think it should be __GFP_NOWARN?
>>
Powered by blists - more mailing lists