linux-kernel - Re: [PATCHv2] thunderbolt: do not double dequeue a request

Message-ID: <20250327142038.GB3152277@black.fi.intel.com>
Date: Thu, 27 Mar 2025 16:20:38 +0200
From: Mika Westerberg <mika.westerberg@...ux.intel.com>
To: Sergey Senozhatsky <senozhatsky@...omium.org>
Cc: Andreas Noever <andreas.noever@...il.com>,
	Michael Jamet <michael.jamet@...el.com>,
	Mika Westerberg <westeri@...nel.org>,
	Yehezkel Bernat <YehezkelShB@...il.com>, linux-usb@...r.kernel.org,
	linux-kernel@...r.kernel.org, stable@...r.kernel.org
Subject: Re: [PATCHv2] thunderbolt: do not double dequeue a request

Hi,

On Thu, Mar 27, 2025 at 11:02:04PM +0900, Sergey Senozhatsky wrote:
> Hi,
> 
> On (25/03/27 15:37), Mika Westerberg wrote:
> > > Another possibility can be tb_cfg_request_sync():
> > > 
> > > tb_cfg_request_sync()
> > >  tb_cfg_request()
> > >   schedule_work(&req->work) -> tb_cfg_request_dequeue()
> > >  tb_cfg_request_cancel()
> > >   schedule_work(&req->work) -> tb_cfg_request_dequeue()
> > 
> > Not sure about this one: &req->work will only be scheduled once, so the
> > second schedule_work() should not queue it again (as far as I can tell).
> 
> If the second schedule_work() happens after a timeout (that's what
> !wait_for_completion_timeout() does), then the first schedule_work()
> may already have executed the work by that time, and then we can
> schedule the work again (but the request is already dequeued).  Am I
> missing something?

schedule_work() does not schedule the work again if it is already
scheduled.
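
To make the two scenarios above concrete, here is a minimal userspace
sketch of the "pending" behaviour being discussed. It is a model only:
model_work, model_schedule_work() and model_run_pending() are
hypothetical names, not the kernel workqueue API, and it assumes the
pending flag is cleared before the callback runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; all names here are hypothetical. */
struct model_work {
	bool pending;                      /* models the "already scheduled" state */
	void (*fn)(struct model_work *w);  /* optional callback */
	int runs;                          /* how many times the work has executed */
};

/* Models schedule_work(): refuses to queue an item that is still pending. */
static bool model_schedule_work(struct model_work *w)
{
	assert(w != NULL);
	if (w->pending)
		return false;
	w->pending = true;
	return true;
}

/* Models the worker: clear the pending flag, then run the callback. */
static void model_run_pending(struct model_work *w)
{
	assert(w != NULL);
	if (!w->pending)
		return;
	w->pending = false;
	w->runs++;
	if (w->fn)
		w->fn(w);
}
```

In this model a second model_schedule_work() call is rejected only
while the item is still pending; once model_run_pending() has run it,
the item can be queued again.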

> > > To address the issue, do not dequeue requests that don't
> > > have TB_CFG_REQUEST_ACTIVE bit set.
> > 
> > Just to be sure. After this change you have not seen the issue anymore
> > with your testing?
> 
> Haven't tried it yet.
> 
> We just found it today, it usually takes several weeks before
> we can roll out the fix to our fleet and we prefer patches from
> upstream/subsystem git, so that's why we reach out to the upstream.

Makes sense.

> The 0xdead000000000122 dereference is a LIST_POISON on x86_64, which
> is set explicitly in list_del(), so I'd say I'm fairly confident
> that we have a double list_del() in tb_cfg_request_dequeue().

Yes, I agree but since I have not seen any similar reports (sans what I saw
ages ago), I would like to be sure the issue you see is actually fixed with
the patch (and that there are no unexpected side-effects). ;-)
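
For reference, the suspected failure mode and the proposed guard can be
sketched in userspace as below. This is not the patch itself (the diff
is not part of this message): the model_* names are hypothetical, the
poison values are the x86_64 ones without the 0xdead000000000000
offset, and a plain bool stands in for the TB_CFG_REQUEST_ACTIVE bit.
It is single-threaded, so it ignores the locking real code would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model of a doubly linked list node; names are hypothetical. */
struct model_list { struct model_list *next, *prev; };

#define MODEL_POISON1 ((struct model_list *)(uintptr_t)0x100)
#define MODEL_POISON2 ((struct model_list *)(uintptr_t)0x122)

static void model_list_init(struct model_list *h)
{
	h->next = h;
	h->prev = h;
}

static void model_list_add_tail(struct model_list *n, struct model_list *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static bool model_is_poisoned(const struct model_list *n)
{
	return n->next == MODEL_POISON1 || n->prev == MODEL_POISON2;
}

/* Unlink and poison; a second call on the same node trips the assert. */
static void model_list_del(struct model_list *n)
{
	assert(!model_is_poisoned(n));
	n->next->prev = n->prev;
	n->prev->next = n->next;
	n->next = MODEL_POISON1;
	n->prev = MODEL_POISON2;
}

struct model_request {
	struct model_list list;
	bool active;            /* stands in for the ACTIVE bit */
};

static void model_enqueue(struct model_request *req, struct model_list *queue)
{
	model_list_add_tail(&req->list, queue);
	req->active = true;
}

/* Dequeue only if still active; returns true if the request was unlinked. */
static bool model_dequeue(struct model_request *req)
{
	if (!req->active)
		return false;
	req->active = false;
	model_list_del(&req->list);
	return true;
}
```

With the guard, a second model_dequeue() on the same request returns
false instead of touching the poisoned pointers.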