Message-ID: <Zxd2hZWt1zm4eW2q@ly-workstation>
Date: Tue, 22 Oct 2024 17:55:17 +0800
From: "Lai, Yi" <yi1.lai@...ux.intel.com>
To: Niklas Cassel <cassel@...nel.org>
Cc: Damien Le Moal <dlemoal@...nel.org>, Hannes Reinecke <hare@...e.de>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
Igor Pylypiv <ipylypiv@...gle.com>,
Niklas Cassel <niklas.cassel@....com>, linux-ide@...r.kernel.org,
yi1.lai@...el.com, syzkaller-bugs@...glegroups.com,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] ata: libata: Clear DID_TIME_OUT for ATA PT commands
with sense data
On Tue, Oct 22, 2024 at 10:48:04AM +0200, Niklas Cassel wrote:
> On Tue, Oct 22, 2024 at 01:44:08PM +0800, Lai, Yi wrote:
> > On Mon, Oct 21, 2024 at 05:20:12PM +0200, Niklas Cassel wrote:
> > > On Mon, Oct 21, 2024 at 02:07:21PM +0200, Niklas Cassel wrote:
> > > > Hello Yi Lai,
> > > >
> > > > On Mon, Oct 21, 2024 at 06:58:59PM +0800, Lai, Yi wrote:
> > > > > Hi Niklas Cassel,
> > > > >
> > > > > Greetings!
> > > > >
> > > > > I used Syzkaller and found that there is INFO: task hung in blk_mq_get_tag in v6.12-rc3
> > > > >
> > > > > > After bisection, the first bad commit is:
> > > > > "
> > > > > e5dd410acb34 ata: libata: Clear DID_TIME_OUT for ATA PT commands with sense data
> > > > > "
> > > >
> > > > It might be that your bisection results are accurate.
> > > >
> > > > However, after looking at the stacktraces, I find it way more likely that
> > > > bisection has landed on the wrong commit.
> > > >
> > > > See this series that was just queued (for 6.13) a few days ago that solves a
> > > > similar starvation:
> > > > https://lore.kernel.org/linux-block/20241014092934.53630-1-songmuchun@bytedance.com/
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/log/?h=for-6.13/block
> > > >
> > > > You could perhaps run with v6.12-rc4 (which should be able to trigger the bug)
> > > > and then try v6.12-rc4 + that series applied, to see if you can still trigger
> > > > the bug?
> > >
I tested the linux-block kernel tree
https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
branch for-6.13/block, commit c97f91b1807a7966077b69b24f28c2dbcde664e9.
The issue can still be reproduced.
> > > Another patch that might be relevant:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e972b08b91ef48488bae9789f03cfedb148667fb
> > >
> > > Which fixes a use after delete in rq_qos_wake_function().
> > > (We can see that the stack trace has rq_qos_wake_function() before
> > > getting stuck forever in rq_qos_wait())
> > >
> > > Who knows what could go wrong when accessing a deleted entry. In the
> > > report there was a crash, but I could imagine other surprises :)
> > > The fix was first included in v6.12-rc4.
> > >
> > >
> > Hi Niklas,
> >
> > Thanks for the info. I have tried using v6.12-rc4 kernel to reproduce
> > the issue. Using the same repro binary, the issue still exists.
>
> Thanks a lot for your help with testing!
>
> The first series that I pointed to, which looks most likely to be related:
> https://lore.kernel.org/linux-block/20241014092934.53630-1-songmuchun@bytedance.com/
>
> Is only merged in:
> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/log/?h=for-6.13/block
>
> It is not included in v6.12-rc4.
>
> Would it please be possible for you to test with Jens's for-6.13/block branch?
>
>
> Kind regards,
> Niklas