Message-ID: <2025022709-CVE-2025-21816-bbd4@gregkh>
Date: Thu, 27 Feb 2025 12:03:13 -0800
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: linux-cve-announce@...r.kernel.org
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: CVE-2025-21816: hrtimers: Force migrate away hrtimers queued after CPUHP_AP_HRTIMERS_DYING
Description
===========
In the Linux kernel, the following vulnerability has been resolved:
hrtimers: Force migrate away hrtimers queued after CPUHP_AP_HRTIMERS_DYING
hrtimers are migrated away from the dying CPU to any online target at
the CPUHP_AP_HRTIMERS_DYING stage in order not to delay bandwidth timers
handling tasks involved in the CPU hotplug forward progress.
However, wakeups can still be performed by the outgoing CPU after
CPUHP_AP_HRTIMERS_DYING. Those can result again in bandwidth timers being
armed. Depending on several considerations (crystal ball power management
based election, earliest timer already enqueued, timer migration enabled or
not), the target may eventually be the current CPU even if offline. If that
happens, the timer is eventually ignored.
The most notable example is RCU, which had to deal with each and every one of
those wake-ups by deferring them to an online CPU, along with related
workarounds:
* e787644caf76 (rcu: Defer RCU kthreads wakeup when CPU is dying)
* 9139f93209d1 (rcu/nocb: Fix RT throttling hrtimer armed from offline CPU)
* f7345ccc62a4 (rcu/nocb: Fix rcuog wake-up from offline softirq)
The problem isn't confined to RCU, though, as the stop machine kthread
(which runs CPUHP_AP_HRTIMERS_DYING) reports its completion at the end
of its work through cpu_stop_signal_done() and performs a wake up that
eventually arms the deadline server timer:

    WARNING: CPU: 94 PID: 588 at kernel/time/hrtimer.c:1086 hrtimer_start_range_ns+0x289/0x2d0
    CPU: 94 UID: 0 PID: 588 Comm: migration/94 Not tainted
    Stopper: multi_cpu_stop+0x0/0x120 <- stop_machine_cpuslocked+0x66/0xc0
    RIP: 0010:hrtimer_start_range_ns+0x289/0x2d0
    Call Trace:
    <TASK>
      start_dl_timer
      enqueue_dl_entity
      dl_server_start
      enqueue_task_fair
      enqueue_task
      ttwu_do_activate
      try_to_wake_up
      complete
      cpu_stopper_thread

Rather than providing yet another bandaid to work around the situation, fix
it in the hrtimers infrastructure: always migrate a timer away to an online
target whenever it is enqueued from an offline CPU.
This will also allow reverting all of the disgraceful RCU hacks listed above.
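
The behaviour change described above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not the
kernel implementation: the Cpu class, pick_target(), enqueue() and the
force_migrate flag are hypothetical names invented for illustration, and the
target election heuristics are reduced to "the current CPU is chosen".

```python
from dataclasses import dataclass, field


@dataclass
class Cpu:
    """Toy stand-in for a per-CPU timer queue (hypothetical, not a kernel struct)."""
    cpu_id: int
    online: bool = True
    timers: list = field(default_factory=list)


def pick_target(current: Cpu, cpus: list, force_migrate: bool) -> Cpu:
    """Choose the CPU whose queue receives a newly armed timer.

    Assumption: the usual heuristics elect the current CPU, which is the
    problematic case when that CPU is already offline.
    """
    if current.online:
        return current
    if not force_migrate:
        # Old behaviour in this model: the timer is queued on an offline
        # CPU, where nothing will ever expire it.
        return current
    # New behaviour in this model: fall back to any online CPU.
    for cpu in cpus:
        if cpu.online:
            return cpu
    raise RuntimeError("no online CPU available")


def enqueue(name: str, current: Cpu, cpus: list, force_migrate: bool) -> Cpu:
    """Queue a timer by name and return the CPU that received it."""
    target = pick_target(current, cpus, force_migrate)
    target.timers.append(name)
    return target
```

In this model, a timer armed from an offline CPU stays on that CPU when
force_migrate is False and lands on an online CPU when it is True, which
mirrors the "always migrate away" rule stated in the commit message.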
The Linux kernel CVE team has assigned CVE-2025-21816 to this issue.
Affected and fixed versions
===========================
Issue introduced in 6.7 with commit 5c0930ccaad5a74d74e8b18b648c5eb21ed2fe94 and fixed in 6.12.14 with commit e456a88bddae4030ba962447bb84be6669f2a0c1
Issue introduced in 6.7 with commit 5c0930ccaad5a74d74e8b18b648c5eb21ed2fe94 and fixed in 6.13.3 with commit 2aecec58e9040ce3d2694707889f9914a2374955
Issue introduced in 6.7 with commit 5c0930ccaad5a74d74e8b18b648c5eb21ed2fe94 and fixed in 6.14-rc2 with commit 53dac345395c0d2493cbc2f4c85fe38aef5b63f5
Issue introduced in 4.19.302 with commit 9a2fc41acb69dd4e2a58d0c04346c3333c2341fc
Issue introduced in 5.4.264 with commit 54d0d83a53508d687fd4a225f8aa1f18559562d0
Issue introduced in 5.10.204 with commit 7f4c89400d2997939f6971c7981cc780a219e36b
Issue introduced in 5.15.143 with commit 6fcbcc6c8e52650749692c7613cbe71bf601670d
Issue introduced in 6.1.68 with commit 75b5016ce325f1ef9c63e5398a1064cf8a7a7354
Issue introduced in 6.6.7 with commit 53f408cad05bb987af860af22f4151e5a18e6ee8
Please see https://www.kernel.org for a full list of kernel versions
currently supported by the kernel community.
Unaffected versions might change over time as fixes are backported to
older supported kernel versions. The official CVE entry at
https://cve.org/CVERecord/?id=CVE-2025-21816
will be updated if fixes are backported; please check it for the most
up-to-date information about this issue.
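
As a rough aid for reading the version list above, the ranges can be encoded
in a short script. This is a sketch only: the function and table names are
hypothetical, the data is transcribed from this announcement as published
(later backports are not reflected), and it only understands upstream version
numbers, so distribution kernels that carry their own backports may be
misclassified.

```python
import re

# Mainline: introduced in 6.7; fixed in 6.12.14, 6.13.3 and 6.14-rc2.
FIXED_IN_SERIES = {(6, 12): 14, (6, 13): 3}

# Stable series that received the offending change, with no fix listed
# in this announcement.
INTRODUCED_IN_SERIES = {
    (4, 19): 302,
    (5, 4): 264,
    (5, 10): 204,
    (5, 15): 143,
    (6, 1): 68,
    (6, 6): 7,
}


def parse_release(release: str):
    """Parse 'X.Y', 'X.Y.Z' or 'X.Y-rcN' into (major, minor, patch, rc)."""
    m = re.match(r"^(\d+)\.(\d+)(?:\.(\d+))?(?:-rc(\d+))?", release)
    if not m:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    patch = int(m[3]) if m[3] else 0
    rc = int(m[4]) if m[4] else None
    return int(m[1]), int(m[2]), patch, rc


def listed_as_affected(release: str) -> bool:
    """Return True if the release falls in a range listed as affected."""
    major, minor, patch, rc = parse_release(release)
    series = (major, minor)
    if series in INTRODUCED_IN_SERIES:
        return patch >= INTRODUCED_IN_SERIES[series]
    if series < (6, 7):
        return False  # not listed as affected in this announcement
    if series in FIXED_IN_SERIES:
        return patch < FIXED_IN_SERIES[series]
    if series == (6, 14):
        # Fixed in 6.14-rc2, so only 6.14-rc1 predates the fix.
        return rc is not None and rc < 2
    if series > (6, 14):
        return False
    # 6.7 through 6.11: affected, with no fixed release listed.
    return True
```

For example, listed_as_affected("6.12.13") is true while
listed_as_affected("6.12.14") is false. The official CVE entry remains the
authoritative source.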
Affected files
==============
The file(s) affected by this issue are:
include/linux/hrtimer_defs.h
kernel/time/hrtimer.c
Mitigation
==========
The Linux kernel CVE team recommends that you update to the latest
stable kernel version for this, and many other bugfixes. Individual
changes are never tested alone, but rather are part of a larger kernel
release. Cherry-picking individual commits is not recommended or
supported by the Linux kernel community at all. If, however, updating to
the latest release is impossible, the individual changes to resolve this
issue can be found at these commits:
https://git.kernel.org/stable/c/e456a88bddae4030ba962447bb84be6669f2a0c1
https://git.kernel.org/stable/c/2aecec58e9040ce3d2694707889f9914a2374955
https://git.kernel.org/stable/c/53dac345395c0d2493cbc2f4c85fe38aef5b63f5