[<prev] [next>] [day] [month] [year] [list]
Message-ID: <2024080740-CVE-2024-42239-a15f@gregkh>
Date: Wed, 7 Aug 2024 17:14:45 +0200
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: linux-cve-announce@...r.kernel.org
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: CVE-2024-42239: bpf: Fail bpf_timer_cancel when callback is being cancelled
Description
===========
In the Linux kernel, the following vulnerability has been resolved:
bpf: Fail bpf_timer_cancel when callback is being cancelled
Given a schedule:
timer1 cb timer2 cb
bpf_timer_cancel(timer2); bpf_timer_cancel(timer1);
Both bpf_timer_cancel calls would wait for the other callback to finish
executing, introducing a lockup.
Add an atomic_t count named 'cancelling' in bpf_hrtimer. This keeps
track of all in-flight cancellation requests for a given BPF timer.
Whenever cancelling a BPF timer, we must check if we have outstanding
cancellation requests, and if so, we must fail the operation with an
error (-EDEADLK) since cancellation is synchronous and waits for the
callback to finish executing. This implies that we can enter a deadlock
situation involving two or more timer callbacks executing in parallel
and attempting to cancel one another.
Note that we avoid incrementing the cancelling counter for the target
timer (the one being cancelled) if bpf_timer_cancel is not invoked from
a callback, to avoid spurious errors. The whole point of detecting
cur->cancelling and returning -EDEADLK is to not enter a busy wait loop
(which may or may not lead to a lockup). This does not apply in case the
caller is in a non-callback context, the other side can continue to
cancel as it sees fit without running into errors.
Background on prior attempts:
Earlier versions of this patch used a bool 'cancelling' bit and used the
following pattern under timer->lock to publish cancellation status.
lock(t->lock);
t->cancelling = true;
mb();
if (cur->cancelling)
return -EDEADLK;
unlock(t->lock);
hrtimer_cancel(t->timer);
t->cancelling = false;
The store outside the critical section could overwrite a parallel
requests t->cancelling assignment to true, to ensure the parallely
executing callback observes its cancellation status.
It would be necessary to clear this cancelling bit once hrtimer_cancel
is done, but lack of serialization introduced races. Another option was
explored where bpf_timer_start would clear the bit when (re)starting the
timer under timer->lock. This would ensure serialized access to the
cancelling bit, but may allow it to be cleared before in-flight
hrtimer_cancel has finished executing, such that lockups can occur
again.
Thus, we choose an atomic counter to keep track of all outstanding
cancellation requests and use it to prevent lockups in case callbacks
attempt to cancel each other while executing in parallel.
The Linux kernel CVE team has assigned CVE-2024-42239 to this issue.
Affected and fixed versions
===========================
Issue introduced in 5.15 with commit b00628b1c7d5 and fixed in 6.6.41 with commit 936983051868
Issue introduced in 5.15 with commit b00628b1c7d5 and fixed in 6.9.10 with commit 3e4e8178a866
Issue introduced in 5.15 with commit b00628b1c7d5 and fixed in 6.10 with commit d4523831f07a
Please see https://www.kernel.org for a full list of currently supported
kernel versions by the kernel community.
Unaffected versions might change over time as fixes are backported to
older supported kernel versions. The official CVE entry at
https://cve.org/CVERecord/?id=CVE-2024-42239
will be updated if fixes are backported, please check that for the most
up to date information about this issue.
Affected files
==============
The file(s) affected by this issue are:
kernel/bpf/helpers.c
Mitigation
==========
The Linux kernel CVE team recommends that you update to the latest
stable kernel version for this, and many other bugfixes. Individual
changes are never tested alone, but rather are part of a larger kernel
release. Cherry-picking individual commits is not recommended or
supported by the Linux kernel community at all. If however, updating to
the latest release is impossible, the individual changes to resolve this
issue can be found at these commits:
https://git.kernel.org/stable/c/9369830518688ecd5b08ffc08ab3302ce2b5d0f7
https://git.kernel.org/stable/c/3e4e8178a8666c56813bd167b848fca0f4c9af0a
https://git.kernel.org/stable/c/d4523831f07a267a943f0dde844bf8ead7495f13
Powered by blists - more mailing lists