Message-ID: <Y50wg46lO8VuBPAe@Boquns-Mac-mini.local>
Date:   Fri, 16 Dec 2022 18:59:15 -0800
From:   Boqun Feng <boqun.feng@...il.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Waiman Long <longman@...hat.com>,
        Al Viro <viro@...iv.linux.org.uk>,
        Damien Le Moal <damien.lemoal@...nsource.wdc.com>,
        Wei Chen <harperchen1110@...il.com>, linux-ide@...r.kernel.org,
        linux-kernel@...r.kernel.org, syzkaller-bugs@...glegroups.com,
        syzbot <syzkaller@...glegroups.com>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        Chuck Lever <chuck.lever@...cle.com>,
        Jeff Layton <jlayton@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: possible deadlock in __ata_sff_interrupt

On Fri, Dec 16, 2022 at 08:31:54PM -0600, Linus Torvalds wrote:
> Ok, let's bring in Waiman for the rwlock side.
> 
> On Fri, Dec 16, 2022 at 5:54 PM Boqun Feng <boqun.feng@...il.com> wrote:
> >
> > Right, a reader not in_interrupt() may be blocked by a random
> > waiting writer because of the fairness, even if the lock is currently
> > held by a reader:
> >
> >         CPU 1                   CPU 2           CPU 3
> >         read_lock(&tasklist_lock); // get the lock
> >
> >                                                 write_lock_irq(&tasklist_lock); // wait for the lock
> >
> >                                 read_lock(&tasklist_lock); // cannot get the lock because of the fairness
> 
> But this should be ok - because CPU1 can make progress and eventually
> release the lock.
> 

Yes.

> So the tasklist_lock use is fine on its own - the reason interrupts
> are special is because an interrupt on CPU 1 taking the lock for
> reading would deadlock otherwise. As long as it happens on another
> CPU, the original CPU should then be able to make progress.
> 
> But the problem here seems to be that *another* lock is also involved
> (in this case apparently "host->lock"), and now if CPU1 and CPU2 get
> these two locks in a different order, you can get an ABBA deadlock.
> 

Right.
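
To make the ABBA pattern concrete, here is a minimal sketch of a pairwise lock-order check. It is not how lockdep is implemented: the names (LOCK_TASKLIST, LOCK_HOST, record_order) are hypothetical, it only catches direct two-lock inversions rather than longer cycles, and it deliberately ignores the reader/writer and fairness distinctions that this thread is about.

```c
#include <stdbool.h>

/* Hypothetical lock identifiers, for illustration only. */
enum { LOCK_TASKLIST, LOCK_HOST, NR_LOCKS };

/* order_seen[a][b] is true once some context took b while holding a. */
static bool order_seen[NR_LOCKS][NR_LOCKS];

/*
 * Record that `next` was acquired while `held` was already held.
 * Returns true if the inverse order was recorded earlier, i.e. the
 * two orders together could form an ABBA deadlock.
 */
static bool record_order(int held, int next)
{
	order_seen[held][next] = true;
	return order_seen[next][held];
}
```

With this model, the first context recording tasklist_lock -> host->lock reports nothing, and a later context recording host->lock -> tasklist_lock is flagged as an inversion.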

> And apparently our lockdep machinery doesn't catch that issue, so it
> doesn't get flagged.
> 

I'm confused. Isn't the original problem showing that lockdep catches
this?

> I'm not sure what the lockdep rules for rwlocks are, but maybe lockdep
> treats rwlocks as being _always_ unfair, not knowing about that "it's
> only unfair when it's in interrupt context".
> 

The rules nowadays are:

*	If the reader is in_interrupt() or the queued-spinlock
	implementation is not used, it's an unfair reader, i.e. it won't
	queue behind any waiting writer.

*	Otherwise, it's a fair reader.
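
As a rough sketch of the two rules above, the classification can be written as a small predicate. This is a simplified model under my own assumptions, not kernel code: struct rwlock_model and both function names are hypothetical, and the flags are passed in as plain booleans rather than read from the real in_interrupt() or the kernel config.

```c
#include <stdbool.h>

/* Hypothetical model of the lock state; not a kernel type. */
struct rwlock_model {
	int  readers;         /* readers currently holding the lock */
	bool writer_holding;  /* a writer currently holds the lock */
	bool writer_waiting;  /* a writer is queued for the lock */
};

/* An unfair reader does not queue behind a waiting writer. */
static bool reader_is_unfair(bool in_interrupt, bool queued_impl)
{
	return in_interrupt || !queued_impl;
}

/* Would a new reader get the lock right now, in this model? */
static bool reader_can_acquire(const struct rwlock_model *l,
			       bool in_interrupt, bool queued_impl)
{
	if (l->writer_holding)
		return false;	/* every reader waits for a holding writer */
	if (l->writer_waiting && !reader_is_unfair(in_interrupt, queued_impl))
		return false;	/* a fair reader queues behind the writer */
	return true;
}
```

In the three-CPU scenario quoted earlier (one reader holding the lock, one writer waiting), this model says a task-context reader on the queued implementation has to wait, while a reader in interrupt context gets the lock.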

> Maybe we need to always make rwlock unfair? Possibly only for tasklist_lock?
> 

That's possible, but first I need to make sure I understand the issue
for lockdep. Is it that lockdep misses catching something, or that it
reports a false positive?

Regards,
Boqun

> Oh, how I hate tasklist_lock. It's pretty much our one remaining "one
> big lock". It's been a pain for a long long time.
> 
>             Linus
