linux-kernel - [PATCH AUTOSEL 6.17-6.1] ftrace: Fix softlockup in ftrace_module_enable

Message-ID: <20251025160905.3857885-385-sashal@kernel.org>
Date: Sat, 25 Oct 2025 12:00:16 -0400
From: Sasha Levin <sashal@...nel.org>
To: patches@...ts.linux.dev,
	stable@...r.kernel.org
Cc: Vladimir Riabchun <ferr.lambarginio@...il.com>,
	"Steven Rostedt (Google)" <rostedt@...dmis.org>,
	Sasha Levin <sashal@...nel.org>,
	mhiramat@...nel.org,
	linux-kernel@...r.kernel.org,
	linux-trace-kernel@...r.kernel.org
Subject: [PATCH AUTOSEL 6.17-6.1] ftrace: Fix softlockup in ftrace_module_enable

From: Vladimir Riabchun <ferr.lambarginio@...il.com>

[ Upstream commit 4099b98203d6b33d990586542fa5beee408032a3 ]

A soft lockup was observed when loading the amdgpu module.
If a module has a lot of traceable functions, repeated calls
to kallsyms_lookup() can spend too much time in an RCU critical
section with preemption disabled, causing a kernel panic.
This is the same issue that was fixed in
commit d0b24b4e91fc ("ftrace: Prevent RCU stall on PREEMPT_VOLUNTARY
kernels") and commit 42ea22e754ba ("ftrace: Add cond_resched() to
ftrace_graph_set_hash()").

Fix it the same way by adding cond_resched() in ftrace_module_enable().

Link: https://lore.kernel.org/aMQD9_lxYmphT-up@vova-pc
Signed-off-by: Vladimir Riabchun <ferr.lambarginio@...il.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@...dmis.org>
Signed-off-by: Sasha Levin <sashal@...nel.org>
---

LLM Generated explanations, may be completely bogus:

YES
- The change inserts `cond_resched()` inside the inner iteration over
  every ftrace record (`kernel/trace/ftrace.c:7538`). That loop holds
  the ftrace mutex and, for each record, invokes heavy helpers like
  `test_for_valid_rec()`, which in turn calls `kallsyms_lookup()`
  (`kernel/trace/ftrace.c:4289`). On huge modules (e.g. amdgpu) the loop
  can run long enough without a reschedule point to trigger the
  reported soft lockup/panic during module load.
- `ftrace_module_enable()` runs only in process context via
  `prepare_coming_module()` (`kernel/module/main.c:3279`), so adding a
  voluntary reschedule point is safe; the same pattern already exists in
  other long-running ftrace loops (see commits d0b24b4e91fc and
  42ea22e754ba), so this brings consistency without changing control
  flow or semantics.
- No data structures or interfaces change, and the code still executes
  under the same locking (`ftrace_lock`, plus `text_mutex` when the arch
  overrides `ftrace_arch_code_modify_prepare()`), so the risk of
  regression is minimal: the new call simply yields the CPU if needed
  while keeping the locks held, preventing watchdog-induced crashes but
  otherwise behaving identically.
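
The pattern described in the points above can be sketched in userspace
C. This is only an illustration, not the kernel code: `struct demo_rec`,
`demo_rec_is_valid()` and `demo_enable_records()` are hypothetical
names, and POSIX `sched_yield()` stands in for `cond_resched()`, which
exists only inside the kernel and behaves differently (it reschedules
only when a reschedule is actually pending).

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for an ftrace record; not the kernel's struct. */
struct demo_rec {
	unsigned long ip;	/* address of the traced function, 0 = invalid */
	int enabled;		/* set to 1 once the record has been accepted */
};

/* Hypothetical stand-in for an expensive per-record validity check. */
static int demo_rec_is_valid(const struct demo_rec *rec)
{
	return rec->ip != 0;
}

/*
 * Walk all records and offer the CPU to other tasks once per iteration,
 * mirroring where the patch places cond_resched(). The yield does not
 * change which records get enabled; it only bounds how long the loop
 * runs without a scheduling point.
 * Returns the number of records that were enabled.
 */
static size_t demo_enable_records(struct demo_rec *recs, size_t n)
{
	size_t enabled = 0;

	assert(recs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		sched_yield();	/* userspace analogue of cond_resched() */

		if (!demo_rec_is_valid(&recs[i])) {
			recs[i].enabled = 0;	/* clear flags on invalid records */
			continue;
		}

		recs[i].enabled = 1;
		enabled++;
	}

	return enabled;
}
```

As in the patch, the yield sits at the top of the loop body, before the
expensive check, so every iteration passes through it regardless of
which branch is taken afterwards.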

Given it fixes a real, user-visible soft lockup with a contained and
well-understood tweak, this is an excellent candidate for stable
backporting.

 kernel/trace/ftrace.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index a69067367c296..42bd2ba68a821 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -7535,6 +7535,8 @@ void ftrace_module_enable(struct module *mod)
 		if (!within_module(rec->ip, mod))
 			break;
 
+		cond_resched();
+
 		/* Weak functions should still be ignored */
 		if (!test_for_valid_rec(rec)) {
 			/* Clear all other flags. Should not be enabled anyway */
-- 
2.51.0