[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wgfpfWuNQi2SjXQL1ir6iKCpUdBruJ+kmOQP1frH7Zdig@mail.gmail.com>
Date: Wed, 16 Feb 2022 10:09:14 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Damien Le Moal <damien.lemoal@...nsource.wdc.com>
Cc: Byungchul Park <byungchul.park@....com>, linux-ide@...r.kernel.org,
Ingo Molnar <mingo@...hat.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Will Deacon <will@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Steven Rostedt <rostedt@...dmis.org>,
Joel Fernandes <joel@...lfernandes.org>,
Sasha Levin <sashal@...nel.org>,
Daniel Vetter <daniel.vetter@...ll.ch>,
Chris Wilson <chris@...is-wilson.co.uk>, duyuyang@...il.com,
johannes.berg@...el.com, Tejun Heo <tj@...nel.org>,
"Theodore Ts'o" <tytso@....edu>,
Matthew Wilcox <willy@...radead.org>,
Dave Chinner <david@...morbit.com>,
Amir Goldstein <amir73il@...il.com>,
"J. Bruce Fields" <bfields@...ldses.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
kernel-team@....com, Linux-MM <linux-mm@...ck.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Michal Hocko <mhocko@...nel.org>,
Minchan Kim <minchan@...nel.org>,
Johannes Weiner <hannes@...xchg.org>,
Vladimir Davydov <vdavydov.dev@...il.com>, sj@...nel.org,
Jerome Glisse <jglisse@...hat.com>,
Dennis Zhou <dennis@...nel.org>,
Christoph Lameter <cl@...ux.com>,
Pekka Enberg <penberg@...nel.org>,
David Rientjes <rientjes@...gle.com>,
Vlastimil Babka <vbabka@...e.cz>, ngupta@...are.org,
linux-block <linux-block@...r.kernel.org>,
Jens Axboe <axboe@...nel.dk>, paolo.valente@...aro.org,
Josef Bacik <josef@...icpanda.com>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Al Viro <viro@...iv.linux.org.uk>, Jan Kara <jack@...e.cz>,
Jeff Layton <jlayton@...nel.org>,
Dan Williams <dan.j.williams@...el.com>,
Christoph Hellwig <hch@...radead.org>,
"Darrick J. Wong" <djwong@...nel.org>,
dri-devel <dri-devel@...ts.freedesktop.org>,
Dave Airlie <airlied@...ux.ie>, rodrigosiqueiramelo@...il.com,
melissa.srw@...il.com, hamohammed.sa@...il.com
Subject: Re: Report in ata_scsi_port_error_handler()
On Tue, Feb 15, 2022 at 10:37 PM Damien Le Moal
<damien.lemoal@...nsource.wdc.com> wrote:
>
> On 2/16/22 13:16, Byungchul Park wrote:
> > [ 2.051040] ===================================================
> > [ 2.051406] DEPT: Circular dependency has been detected.
> > [ 2.051730] 5.17.0-rc1-00014-gcf3441bb2012 #2 Tainted: G W
> > [ 2.051991] ---------------------------------------------------
> > [ 2.051991] summary
> > [ 2.051991] ---------------------------------------------------
> > [ 2.051991] *** DEADLOCK ***
> > [ 2.051991]
> > [ 2.051991] context A
> > [ 2.051991] [S] (unknown)(&(&ap->eh_wait_q)->dmap:0)
> > [ 2.051991] [W] __raw_spin_lock_irq(&host->lock:0)
> > [ 2.051991] [E] event(&(&ap->eh_wait_q)->dmap:0)
> > [ 2.051991]
> > [ 2.051991] context B
> > [ 2.051991] [S] __raw_spin_lock_irqsave(&host->lock:0)
> > [ 2.051991] [W] wait(&(&ap->eh_wait_q)->dmap:0)
> > [ 2.051991] [E] spin_unlock(&host->lock:0)
>
> Sleeping with a spinlock held would be triggering warnings already, so
> these reports seem bogus to me.
Yeah, Matthew pointed out the same thing for another use-case, where
it looks like DEPT is looking at the state at the wrong point (not at
the scheduling point, but at prepare_to_sleep()).
This ata_port_wait() is the exact same pattern, ie we have
spin_lock_irqsave(ap->lock, flags);
while (ap->pflags & (ATA_PFLAG_EH_PENDING | ATA_PFLAG_EH_IN_PROGRESS)) {
prepare_to_wait(&ap->eh_wait_q, &wait, TASK_UNINTERRUPTIBLE);
spin_unlock_irqrestore(ap->lock, flags);
schedule();
and DEPT has incorrectly taken it to mean that 'ap->lock' is held
during the wait, when it is actually released before actually waiting.
For the spin-locks, this is all very obvious (because they'd have been
caught long ago by much simpler debug code), but the same
prepare_to_wait -> wait pattern can most definitely happen with
sleeping locks too, so they are all slightly suspect.
And yes, the detailed reports are hard to read because the locations
are given as "ata_port_wait_eh+0x52/0xc0". Running them through
scripts/decode_stacktrace.sh to turn them into filename and line
numbers - and also sort out inlining - would help a lot.
Byungchul, could you fix those two issues? Some of your reports may
well be entirely valid, but the hard-to-read hex offsets and the
knowledge that at least some of them are confused about how
prepare_to_wait -> wait actually works makes the motivation to look at
the details much less..
Linus
Powered by blists - more mailing lists