[<prev] [next>] [day] [month] [year] [list]
Message-ID: <2024122729-CVE-2024-56547-c340@gregkh>
Date: Fri, 27 Dec 2024 15:11:38 +0100
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: linux-cve-announce@...r.kernel.org
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: CVE-2024-56547: rcu/nocb: Fix missed RCU barrier on deoffloading
Description
===========
In the Linux kernel, the following vulnerability has been resolved:
rcu/nocb: Fix missed RCU barrier on deoffloading
Currently, running rcutorture test with torture_type=rcu fwd_progress=8
n_barrier_cbs=8 nocbs_nthreads=8 nocbs_toggle=100 onoff_interval=60
test_boost=2, will trigger the following warning:
WARNING: CPU: 19 PID: 100 at kernel/rcu/tree_nocb.h:1061 rcu_nocb_rdp_deoffload+0x292/0x2a0
RIP: 0010:rcu_nocb_rdp_deoffload+0x292/0x2a0
Call Trace:
<TASK>
? __warn+0x7e/0x120
? rcu_nocb_rdp_deoffload+0x292/0x2a0
? report_bug+0x18e/0x1a0
? handle_bug+0x3d/0x70
? exc_invalid_op+0x18/0x70
? asm_exc_invalid_op+0x1a/0x20
? rcu_nocb_rdp_deoffload+0x292/0x2a0
rcu_nocb_cpu_deoffload+0x70/0xa0
rcu_nocb_toggle+0x136/0x1c0
? __pfx_rcu_nocb_toggle+0x10/0x10
kthread+0xd1/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2f/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
CPU0 CPU2 CPU3
//rcu_nocb_toggle //nocb_cb_wait //rcutorture
// deoffload CPU1 // process CPU1's rdp
rcu_barrier()
rcu_segcblist_entrain()
rcu_segcblist_add_len(1);
// len == 2
// enqueue barrier
// callback to CPU1's
// rdp->cblist
rcu_do_batch()
// invoke CPU1's rdp->cblist
// callback
rcu_barrier_callback()
rcu_barrier()
mutex_lock(&rcu_state.barrier_mutex);
// still see len == 2
// enqueue barrier callback
// to CPU1's rdp->cblist
rcu_segcblist_entrain()
rcu_segcblist_add_len(1);
// len == 3
// decrement len
rcu_segcblist_add_len(-2);
kthread_parkme()
// CPU1's rdp->cblist len == 1
// Warn because there is
// still a pending barrier
// trigger warning
WARN_ON_ONCE(rcu_segcblist_n_cbs(&rdp->cblist));
cpus_read_unlock();
// wait CPU1 to comes online and
// invoke barrier callback on
// CPU1 rdp's->cblist
wait_for_completion(&rcu_state.barrier_completion);
// deoffload CPU4
cpus_read_lock()
rcu_barrier()
mutex_lock(&rcu_state.barrier_mutex);
// block on barrier_mutex
// wait rcu_barrier() on
// CPU3 to unlock barrier_mutex
// but CPU3 unlock barrier_mutex
// need to wait CPU1 comes online
// when CPU1 going online will block on cpus_write_lock
The above scenario will not only trigger a WARN_ON_ONCE(), but also
trigger a deadlock.
Thanks to nocb locking, a second racing rcu_barrier() on an offline CPU
will either observe the decremented callback counter down to 0 and spare
the callback enqueue, or rcuo will observe the new callback and keep
rdp->nocb_cb_sleep to false.
Therefore check rdp->nocb_cb_sleep before parking to make sure no
further rcu_barrier() is waiting on the rdp.
The Linux kernel CVE team has assigned CVE-2024-56547 to this issue.
Affected and fixed versions
===========================
Issue introduced in 6.12 with commit 1fcb932c8b5ce86219d7dedcd63659351a43291c and fixed in 6.12.2 with commit 224b62028959858294789772d372dcb36cf5f820
Issue introduced in 6.12 with commit 1fcb932c8b5ce86219d7dedcd63659351a43291c and fixed in 6.13-rc1 with commit 2996980e20b7a54a1869df15b3445374b850b155
Please see https://www.kernel.org for a full list of currently supported
kernel versions by the kernel community.
Unaffected versions might change over time as fixes are backported to
older supported kernel versions. The official CVE entry at
https://cve.org/CVERecord/?id=CVE-2024-56547
will be updated if fixes are backported, please check that for the most
up to date information about this issue.
Affected files
==============
The file(s) affected by this issue are:
kernel/rcu/tree_nocb.h
Mitigation
==========
The Linux kernel CVE team recommends that you update to the latest
stable kernel version for this, and many other bugfixes. Individual
changes are never tested alone, but rather are part of a larger kernel
release. Cherry-picking individual commits is not recommended or
supported by the Linux kernel community at all. If however, updating to
the latest release is impossible, the individual changes to resolve this
issue can be found at these commits:
https://git.kernel.org/stable/c/224b62028959858294789772d372dcb36cf5f820
https://git.kernel.org/stable/c/2996980e20b7a54a1869df15b3445374b850b155
Powered by blists - more mailing lists