[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220228101444.6frl63dn5vmgycbp@quack3.lan>
Date: Mon, 28 Feb 2022 11:14:44 +0100
From: Jan Kara <jack@...e.cz>
To: Byungchul Park <byungchul.park@....com>
Cc: Jan Kara <jack@...e.cz>, torvalds@...ux-foundation.org,
damien.lemoal@...nsource.wdc.com, linux-ide@...r.kernel.org,
adilger.kernel@...ger.ca, linux-ext4@...r.kernel.org,
mingo@...hat.com, linux-kernel@...r.kernel.org,
peterz@...radead.org, will@...nel.org, tglx@...utronix.de,
rostedt@...dmis.org, joel@...lfernandes.org, sashal@...nel.org,
daniel.vetter@...ll.ch, chris@...is-wilson.co.uk,
duyuyang@...il.com, johannes.berg@...el.com, tj@...nel.org,
tytso@....edu, willy@...radead.org, david@...morbit.com,
amir73il@...il.com, bfields@...ldses.org,
gregkh@...uxfoundation.org, kernel-team@....com,
linux-mm@...ck.org, akpm@...ux-foundation.org, mhocko@...nel.org,
minchan@...nel.org, hannes@...xchg.org, vdavydov.dev@...il.com,
sj@...nel.org, jglisse@...hat.com, dennis@...nel.org, cl@...ux.com,
penberg@...nel.org, rientjes@...gle.com, vbabka@...e.cz,
ngupta@...are.org, linux-block@...r.kernel.org, axboe@...nel.dk,
paolo.valente@...aro.org, josef@...icpanda.com,
linux-fsdevel@...r.kernel.org, viro@...iv.linux.org.uk,
jack@...e.com, jlayton@...nel.org, dan.j.williams@...el.com,
hch@...radead.org, djwong@...nel.org,
dri-devel@...ts.freedesktop.org, airlied@...ux.ie,
rodrigosiqueiramelo@...il.com, melissa.srw@...il.com,
hamohammed.sa@...il.com
Subject: Re: Report 2 in ext4 and journal based on v5.17-rc1
On Mon 28-02-22 18:28:26, Byungchul Park wrote:
> On Thu, Feb 24, 2022 at 11:22:39AM +0100, Jan Kara wrote:
> > On Thu 24-02-22 10:11:02, Byungchul Park wrote:
> > > On Wed, Feb 23, 2022 at 03:48:59PM +0100, Jan Kara wrote:
> > > > > KJOURNALD2(kthread) TASK1(ksys_write) TASK2(ksys_write)
> > > > >
> > > > > wait A
> > > > > --- stuck
> > > > > wait B
> > > > > --- stuck
> > > > > wait C
> > > > > --- stuck
> > > > >
> > > > > wake up B wake up C wake up A
> > > > >
> > > > > where:
> > > > > A is a wait_queue, j_wait_commit
> > > > > B is a wait_queue, j_wait_transaction_locked
> > > > > C is a rwsem, mapping.invalidate_lock
> > > >
> > > > I see. But a situation like this is not necessarily a guarantee of a
> > > > deadlock, is it? I mean there can be task D that will eventually call say
> > > > 'wake up B' and unblock everything and this is how things were designed to
> > > > work? Multiple sources of wakeups are quite common I'd say... What does
> > >
> > > Yes. At the very beginning when I desgined Dept, I was thinking whether
> > > to support multiple wakeup sources or not for a quite long time.
> > > Supporting it would be a better option to aovid non-critical reports.
> > > However, I thought anyway we'd better fix it - not urgent tho - if
> > > there's any single circle dependency. That's why I decided not to
> > > support it for now and wanted to gather the kernel guys' opinions. Thing
> > > is which policy we should go with.
> >
> > I see. So supporting only a single wakeup source is fine for locks I guess.
> > But for general wait queues or other synchronization mechanisms, I'm afraid
> > it will lead to quite some false positive reports. Just my 2c.
>
> Thank you for your feedback.
>
> I realized we've been using "false positive" differently. There exist
> the three types of code in terms of dependency and deadlock. It's worth
> noting that dependencies are built from between waits and events in Dept.
>
> ---
>
> case 1. Code with an actual circular dependency, but not deadlock.
>
> A circular dependency can be broken by a rescue wakeup source e.g.
> timeout. It's not a deadlock. If it's okay that the contexts
> participating in the circular dependency and others waiting for the
> events in the circle are stuck until it gets broken. Otherwise, say,
> if it's not meant, then it's anyway problematic.
>
> 1-1. What if we judge this code is problematic?
> 1-2. What if we judge this code is good?
>
> case 2. Code with an actual circular dependency, and deadlock.
>
> There's no other wakeup source than those within the circular
> dependency. Literally deadlock. It's problematic and critical.
>
> 2-1. What if we judge this code is problematic?
> 2-2. What if we judge this code is good?
>
> case 3. Code with no actual circular dependency, and not deadlock.
>
> Must be good.
>
> 3-1. What if we judge this code is problematic?
> 3-2. What if we judge this code is good?
>
> ---
>
> I call only 3-1 "false positive" circular dependency. And you call 1-1
> and 3-1 "false positive" deadlock.
>
> I've been wondering if the kernel guys esp. Linus considers code with
> any circular dependency is problematic or not, even if it won't lead to
> a deadlock, say, case 1. Even though I designed Dept based on what I
> believe is right, of course, I'm willing to change the design according
> to the majority opinion.
>
> However, I would never allow case 1 if I were the owner of the kernel
> for better stability, even though the code works anyway okay for now.
So yes, I call a report for the situation "There is circular dependency but
deadlock is not possible." a false positive. And that is because in my
opinion your definition of circular dependency includes schemes that are
useful and used in the kernel.
Your example in case 1 is kind of borderline (I personally would consider
that bug as well) but there are other more valid schemes with multiple
wakeup sources like:
We have a queue of work to do Q protected by lock L. Consumer process has
code like:
while (1) {
lock L
prepare_to_wait(work_queued);
if (no work) {
unlock L
sleep
} else {
unlock L
do work
wake_up(work_done)
}
}
AFAIU Dept will create dependency here that 'wakeup work_done' is after
'wait for work_queued'. Producer has code like:
while (1) {
lock L
prepare_to_wait(work_done)
if (too much work queued) {
unlock L
sleep
} else {
queue work
unlock L
wake_up(work_queued)
}
}
And Dept will create dependency here that 'wakeup work_queued' is after
'wait for work_done'. And thus we have a trivial cycle in the dependencies
despite the code being perfectly valid and safe.
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR
Powered by blists - more mailing lists