[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <173891599227.10177.12341111504955542995.tip-bot2@tip-bot2>
Date: Fri, 07 Feb 2025 08:13:12 -0000
From: "tip-bot2 for Frederic Weisbecker" <tip-bot2@...utronix.de>
To: linux-tip-commits@...r.kernel.org
Cc: Matt Fleming <mfleming@...udflare.com>,
Frederic Weisbecker <frederic@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>, stable@...r.kernel.org, x86@...nel.org,
linux-kernel@...r.kernel.org
Subject:
[tip: timers/urgent] timers/migration: Fix off-by-one root mis-connection
The following commit has been merged into the timers/urgent branch of tip:
Commit-ID: 868c9037df626b3c245ee26a290a03ae1f9f58d3
Gitweb: https://git.kernel.org/tip/868c9037df626b3c245ee26a290a03ae1f9f58d3
Author: Frederic Weisbecker <frederic@...nel.org>
AuthorDate: Wed, 05 Feb 2025 17:02:20 +01:00
Committer: Thomas Gleixner <tglx@...utronix.de>
CommitterDate: Fri, 07 Feb 2025 09:02:16 +01:00
timers/migration: Fix off-by-one root mis-connection
Before attaching a new root to the old root, the children counter of the
new root is checked to verify that only the upcoming CPU's top group have
been connected to it. However since the recently added commit b729cc1ec21a
("timers/migration: Fix another race between hotplug and idle entry/exit")
this check is not valid anymore because the old root is pre-accounted
as a child to the new root. Therefore after connecting the upcoming
CPU's top group to the new root, the children count to be expected must
be 2 and not 1 anymore.
This omission results in the old root to not be connected to the new
root. Then eventually the system may run with more than one top level,
which defeats the purpose of a single idle migrator.
Also the old root is pre-accounted but not connected upon the new root
creation. But it can be connected to the new root later on. Therefore
the old root may be accounted twice to the new root. The propagation of
such overcommit can end up creating a double final top-level root with a
groupmask incorrectly initialized. Although harmless given that the final
top level roots will never have a parent to walk up to, this oddity
opportunistically reported the core issue:
WARNING: CPU: 8 PID: 0 at kernel/time/timer_migration.c:543 tmigr_requires_handle_remote
CPU: 8 UID: 0 PID: 0 Comm: swapper/8
RIP: 0010:tmigr_requires_handle_remote
Call Trace:
<IRQ>
? tmigr_requires_handle_remote
? hrtimer_run_queues
update_process_times
tick_periodic
tick_handle_periodic
__sysvec_apic_timer_interrupt
sysvec_apic_timer_interrupt
</IRQ>
Fix the problem by taking the old root into account in the children count
of the new root so the connection is not omitted.
Also warn when more than one top level group exists to better detect
similar issues in the future.
Fixes: b729cc1ec21a ("timers/migration: Fix another race between hotplug and idle entry/exit")
Reported-by: Matt Fleming <mfleming@...udflare.com>
Signed-off-by: Frederic Weisbecker <frederic@...nel.org>
Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
Cc: stable@...r.kernel.org
Link: https://lore.kernel.org/all/20250205160220.39467-1-frederic@kernel.org
---
kernel/time/timer_migration.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/kernel/time/timer_migration.c b/kernel/time/timer_migration.c
index 9cb9b65..2f63308 100644
--- a/kernel/time/timer_migration.c
+++ b/kernel/time/timer_migration.c
@@ -1675,6 +1675,9 @@ static int tmigr_setup_groups(unsigned int cpu, unsigned int node)
} while (i < tmigr_hierarchy_levels);
+ /* Assert single root */
+ WARN_ON_ONCE(!err && !group->parent && !list_is_singular(&tmigr_level_list[top]));
+
while (i > 0) {
group = stack[--i];
@@ -1716,7 +1719,12 @@ static int tmigr_setup_groups(unsigned int cpu, unsigned int node)
WARN_ON_ONCE(top == 0);
lvllist = &tmigr_level_list[top];
- if (group->num_children == 1 && list_is_singular(lvllist)) {
+
+ /*
+ * Newly created root level should have accounted the upcoming
+ * CPU's child group and pre-accounted the old root.
+ */
+ if (group->num_children == 2 && list_is_singular(lvllist)) {
/*
* The target CPU must never do the prepare work, except
* on early boot when the boot CPU is the target. Otherwise
Powered by blists - more mailing lists