Message-ID: <20250728190248.605750-1-longman@redhat.com>
Date: Mon, 28 Jul 2025 15:02:48 -0400
From: Waiman Long <longman@...hat.com>
To: Catalin Marinas <catalin.marinas@....com>,
Andrew Morton <akpm@...ux-foundation.org>
Cc: linux-mm@...ck.org,
linux-kernel@...r.kernel.org,
Waiman Long <longman@...hat.com>
Subject: [PATCH] mm/kmemleak: Avoid soft lockup in __kmemleak_do_cleanup()
A soft lockup warning was observed on a relatively small x86-64 system
with 16 GB of memory when running a debug kernel with kmemleak
enabled.

  watchdog: BUG: soft lockup - CPU#8 stuck for 33s! [kworker/8:1:134]

The test system was running a workload with hot unplug happening
in parallel. Then kmemleak decided to disable itself due to its
inability to allocate more kmemleak objects. The debug kernel has its
CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE set to 40,000.

The soft lockup happened in kmemleak_do_cleanup() when the existing
kmemleak objects were being removed and deleted one by one in a loop
via a workqueue. In this particular case, there are at least 40,000
objects that need to be processed, and given the slowness of a debug
kernel and the fact that a raw_spinlock has to be acquired and released
in __delete_object(), it could take a while to properly handle all
these objects.

As kmemleak has been disabled in this case, the object removal and
deletion process could be further optimized, as locking isn't really
needed. However, it is probably not worth the effort to optimize for
such an edge case that should rarely happen. So the simple solution is
to call cond_resched() at periodic intervals in the iteration loop to
avoid the soft lockup.

Signed-off-by: Waiman Long <longman@...hat.com>
---
mm/kmemleak.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 8d588e685311..620abd95e680 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -2181,6 +2181,7 @@ static const struct file_operations kmemleak_fops = {
 static void __kmemleak_do_cleanup(void)
 {
 	struct kmemleak_object *object, *tmp;
+	unsigned int cnt = 0;
 
 	/*
 	 * Kmemleak has already been disabled, no need for RCU list traversal
@@ -2189,6 +2190,10 @@ static void __kmemleak_do_cleanup(void)
 	list_for_each_entry_safe(object, tmp, &object_list, object_list) {
 		__remove_object(object);
 		__delete_object(object);
+
+		/* Call cond_resched() once per 64 iterations to avoid soft lockup */
+		if (!(++cnt & 0x3f))
+			cond_resched();
 	}
 }
--
2.50.0