Message-ID: <ZxZxLK7eSQ_bwkLe@ryzen.lan>
Date: Mon, 21 Oct 2024 17:20:12 +0200
From: Niklas Cassel <cassel@...nel.org>
To: "Lai, Yi" <yi1.lai@...ux.intel.com>
Cc: Damien Le Moal <dlemoal@...nel.org>, Hannes Reinecke <hare@...e.de>,
	"Martin K. Petersen" <martin.petersen@...cle.com>,
	Igor Pylypiv <ipylypiv@...gle.com>,
	Niklas Cassel <niklas.cassel@....com>, linux-ide@...r.kernel.org,
	yi1.lai@...el.com, syzkaller-bugs@...glegroups.com,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] ata: libata: Clear DID_TIME_OUT for ATA PT commands
 with sense data

On Mon, Oct 21, 2024 at 02:07:21PM +0200, Niklas Cassel wrote:
> Hello Yi Lai,
> 
> On Mon, Oct 21, 2024 at 06:58:59PM +0800, Lai, Yi wrote:
> > Hi Niklas Cassel,
> > 
> > Greetings!
> > 
> > I used Syzkaller and found that there is INFO: task hung in blk_mq_get_tag in v6.12-rc3
> > 
> > After bisection and the first bad commit is:
> > "
> > e5dd410acb34 ata: libata: Clear DID_TIME_OUT for ATA PT commands with sense data
> > "
> 
> It might be that your bisection results are accurate.
> 
> However, after looking at the stacktraces, I find it way more likely that
> bisection has landed on the wrong commit.
> 
> See this series that was just queued (for 6.13) a few days ago that solves a
> similar starvation:
> https://lore.kernel.org/linux-block/20241014092934.53630-1-songmuchun@bytedance.com/
> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/log/?h=for-6.13/block
> 
> You could perhaps run with v6.12-rc4 (which should be able to trigger the bug)
> and then try v6.12-rc4 + that series applied, to see if you can still trigger
> the bug?

Another patch that might be relevant:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e972b08b91ef48488bae9789f03cfedb148667fb

Which fixes a use after delete in rq_qos_wake_function().
(We can see that the stack trace has rq_qos_wake_function() before
getting stuck forever in rq_qos_wait())

Who knows what could go wrong when accessing a deleted entry? In the
report there was a crash, but I could imagine other surprises :)
The fix was first included in v6.12-rc4.


Kind regards,
Niklas
