Message-ID: <20250530183140.6cfad3ae@kernel.org>
Date: Fri, 30 May 2025 18:31:40 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Joe Damato <jdamato@...tly.com>
Cc: Stanislav Fomichev <stfomichev@...il.com>, netdev@...r.kernel.org,
john.cs.hey@...il.com, jacob.e.keller@...el.com,
syzbot+846bb38dc67fe62cc733@...kaller.appspotmail.com, Tony Nguyen
<anthony.l.nguyen@...el.com>, Przemek Kitszel
<przemyslaw.kitszel@...el.com>, Andrew Lunn <andrew+netdev@...n.ch>, "David
S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, Paolo
Abeni <pabeni@...hat.com>, "moderated list:INTEL ETHERNET DRIVERS"
<intel-wired-lan@...ts.osuosl.org>, open list
<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH iwl-net] e1000: Move cancel_work_sync to avoid deadlock
On Fri, 30 May 2025 12:45:13 -0700 Joe Damato wrote:
> > nit: as Jakub mentioned in another thread, it seems more about the
> > flush_work waiting for the reset_task to complete rather than
> > wq mutexes (which are fake)?
>
> Hm, I probably misunderstood something. Also, not sure what you
> meant by the wq mutexes being fake?
>
> My understanding (which is prob wrong) from the syzbot and user
> report was that the order of wq mutex and rtnl are inverted in the
> two paths, which can cause a deadlock if both paths run.
Take a look at touch_work_lockdep_map(), there's no such thing as wq mutex.
It's just a lockdep "annotation" that helps lockdep connect the dots
between waiting thread and the work item, not a real mutex. So the
commit msg may be better phrased like this (modulo the lines in front):
    CPU 0:
  , - RTNL is held
 /  - e1000_close
 |  - e1000_down
 +- - cancel_work_sync (cancel / wait for e1000_reset_task())
 |
 | CPU 1:
 |  - process_one_work
 \  - e1000_reset_task
  `- take RTNL
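
To make the dependency above concrete, here is a rough userspace sketch of the same shape, assuming pthreads as a stand-in for the kernel primitives. None of these names are kernel APIs: fake_rtnl, fake_reset_task() and fake_close() are hypothetical, pthread_join() plays the role of the wait inside cancel_work_sync(), and the worker uses a 200 ms timed lock so the demo terminates instead of hanging.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical userspace stand-in for RTNL; not a kernel API. */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;

/*
 * Stand-in for e1000_reset_task(): it needs the lock to make progress.
 * A timed lock is used only so the demo can report the stall and exit.
 */
static void *fake_reset_task(void *arg)
{
	bool *stuck = arg;
	struct timespec deadline;
	int err;

	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_nsec += 200L * 1000 * 1000;	/* give up after 200 ms */
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	err = pthread_mutex_timedlock(&fake_rtnl, &deadline);
	if (err == ETIMEDOUT) {
		*stuck = true;	/* the real task would block here forever */
		return NULL;
	}
	assert(err == 0);
	*stuck = false;
	pthread_mutex_unlock(&fake_rtnl);
	return NULL;
}

/*
 * Stand-in for the close path. Returns true if the worker could not
 * take the lock while we were waiting for it to finish.
 */
bool fake_close(bool wait_under_lock)
{
	pthread_t worker;
	bool stuck = false;
	int err;

	if (wait_under_lock)
		pthread_mutex_lock(&fake_rtnl);		/* "RTNL is held" */

	err = pthread_create(&worker, NULL, fake_reset_task, &stuck);
	assert(err == 0);

	pthread_join(worker, NULL);	/* "cancel / wait for the task" */

	if (wait_under_lock)
		pthread_mutex_unlock(&fake_rtnl);

	return stuck;
}
```

Calling fake_close(true) from a small main() reports the stall: the waiter holds the lock that the worker needs, and no second "wq mutex" is involved. fake_close(false) lets the worker finish. This only illustrates the wait dependency; it is not a claim about how the driver should be changed.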