Message-ID: <Zl9zOH2hUramwNSi@slm.duckdns.org>
Date: Tue, 4 Jun 2024 10:04:08 -1000
From: Tejun Heo <tj@...nel.org>
To: Leon Romanovsky <leon@...nel.org>
Cc: Hillf Danton <hdanton@...a.com>, Peter Zijlstra <peterz@...radead.org>,
Lai Jiangshan <jiangshanlai@...il.com>,
Zqiang <qiang.zhang1211@...il.com>, linux-kernel@...r.kernel.org,
Gal Pressman <gal@...dia.com>, Tariq Toukan <tariqt@...dia.com>,
RDMA mailing list <linux-rdma@...r.kernel.org>
Subject: Re: [PATCH -rc] workqueue: Reimplement UAF fix to avoid lockdep
 warning
Hello,
On Tue, Jun 04, 2024 at 09:58:04PM +0300, Leon Romanovsky wrote:
> But at that point, we didn't add newly created WQ to any list which will execute
> that asynchronous release. Did I miss something?
So, the wq itself is not the problem. There are multiple pwq's that get
attached to a wq, and each pwq is refcounted and released asynchronously.
Over time, the wq init error paths diverged: pwq's still take the async
release path while the wq error path stayed synchronous. The flush is there
to match them. A cleaner solution would be to make everything either async
or sync.
> Anyway, I understand that the lockdep_register_key() corruption comes
> from something else. Do you have any idea what can cause it? How can we
> help debug this issue?
It looks like others are already looking at another commit, but focusing on
the backtrace that prematurely freed the reported object (rather than the
backtrace that stumbled upon it while walking a shared data structure)
should help find the actual culprit.
Thanks.
--
tejun