[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250718132500.wq4ahdbiwk2dwwnw@skbuf>
Date: Fri, 18 Jul 2025 16:25:00 +0300
From: Vladimir Oltean <vladimir.oltean@....com>
To: Jeongjun Park <aha310510@...il.com>
Cc: richardcochran@...il.com, andrew+netdev@...n.ch, davem@...emloft.net,
edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com,
yangbo.lu@....com, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org,
syzbot+7cfb66a237c4a5fb22ad@...kaller.appspotmail.com
Subject: Re: [PATCH net v2] ptp: prevent possible ABBA deadlock in
ptp_clock_freerun()
On Fri, Jul 18, 2025 at 08:49:58PM +0900, Jeongjun Park wrote:
> ABBA deadlock occurs in the following scenario:
>
> CPU0 CPU1
> ---- ----
> n_vclocks_store()
> lock(&ptp->n_vclocks_mux) [1]
> pc_clock_adjtime()
> lock(&clk->rwsem) [2]
> ...
> ptp_clock_freerun()
> ptp_vclock_in_use()
> lock(&ptp->n_vclocks_mux) [3]
> ptp_clock_unregister()
> posix_clock_unregister()
> lock(&clk->rwsem) [4]
>
> To solve this with minimal patches, we should change ptp_clock_freerun()
> to briefly release the read lock before calling ptp_vclock_in_use() and
> then re-lock it when we're done.
The most important part of solving a problem is understanding the problem
that there is to solve. It appears that you've jumped over that step.
The n_vclocks sysfs is exposed for a physical clock, and acquires the
physical clock's n_vclocks_mux, as shown in your diagram at step [1].
Another process calls pc_clock_adjtime(), acquires &clk->rwsem at step [2],
and calls ptp_clock_adjtime(). This further tests ptp_clock_freerun() ->
ptp_vclock_in_use(), and the fact that ptp_vclock_in_use() gets to acquire
n_vclocks_mux at step [3] means, as per its implementation modified in
commit 5ab73b010cad ("ptp: fix breakage after ptp_vclock_in_use() rework"),
that the PTP clock modified by pc_clock_adjtime() could have only been a
physical clock (ptp->is_virtual_clock == false). This is because we do
not acquire n_vclocks_mux on virtual clocks.
Back to the CPU0 code path, where we iterate over the physical PTP
clock's virtual clocks, and call device_for_each_child_reverse(...,
unregister_vclock) -> ptp_vclock_unregister() -> ptp_clock_unregister() ->
posix_clock_unregister() on them. During the unregister procedure,
posix_clock_unregister() acquires the virtual clock's &clk->rwsem as
shown in your final step [4].
It is clear that the clock which CPU0 unregisters cannot be the same as
the clock which CPU1 adjusts, because the unregistering CPU0 clock is
virtual, and the adjusted CPU1 clock is physical.
The crucial bit of information from lockdep's message "WARNING: possible
circular locking dependency detected" is the word "possible".
See Documentation/locking/lockdep-design.rst section "Lock-class" to
understand that lockdep does not operate on individual locks, but
instead on "classes". Therefore, simply put, it does not see, in lack of
any extra annotations, that the &clk->rwsem of the physical clock is
different than the &clk->rwsem of a child virtual clock. They have the
same class.
Therefore, there is no AB/BA ordering between locks themselves, because
the first "A" is the &clk->rwsem of a physical clock, and the second "A"
is the &clk->rwsem of a virtual clock. The "B" lock may be the same: the
&ptp->n_vclocks_mux of the physical clock.
Of course, having lockdep be able to validate locking using its
class-based algorithm is still important, and a patch is still needed.
The solution here is at your choice, but the problem space that needs to
be explored in order to fulfill that is extremely different from the
patch that you've proposed, to the point that I don't even think I need
to mention that a patch that makes an unsafe funtional change (drops a
lock and reacquires it), when alternatives having to do with annotating
lock subclasses are available and sufficient, is going to get a well
justified NACK from maintainers.
Powered by blists - more mailing lists