Message-ID: <CAHk-=wj7FpAXZ0hnPKh-5CG-ZW8BmOhd4tEW+J7ryW26fkcDNA@mail.gmail.com>
Date:   Fri, 16 Dec 2022 20:31:54 -0600
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Boqun Feng <boqun.feng@...il.com>, Waiman Long <longman@...hat.com>
Cc:     Al Viro <viro@...iv.linux.org.uk>,
        Damien Le Moal <damien.lemoal@...nsource.wdc.com>,
        Wei Chen <harperchen1110@...il.com>, linux-ide@...r.kernel.org,
        linux-kernel@...r.kernel.org, syzkaller-bugs@...glegroups.com,
        syzbot <syzkaller@...glegroups.com>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        Chuck Lever <chuck.lever@...cle.com>,
        Jeff Layton <jlayton@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: possible deadlock in __ata_sff_interrupt

Ok, let's bring in Waiman for the rwlock side.

On Fri, Dec 16, 2022 at 5:54 PM Boqun Feng <boqun.feng@...il.com> wrote:
>
> Right, for a reader not in_interrupt(), it may be blocked by a random
> waiting writer because of the fairness, even if the lock is currently held
> by a reader:
>
>         CPU 1                   CPU 2           CPU 3
>         read_lock(&tasklist_lock); // get the lock
>
>                                                 write_lock_irq(&tasklist_lock); // wait for the lock
>
>                                 read_lock(&tasklist_lock); // cannot get the lock because of the fairness

But this should be ok - because CPU1 can make progress and eventually
release the lock.

So the tasklist_lock use is fine on its own - the reason interrupts
are special is because an interrupt on CPU 1 taking the lock for
reading would deadlock otherwise. As long as it happens on another
CPU, the original CPU should then be able to make progress.
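
As a rough illustration of that rule, here is a toy model of the reader
admission decision. It is only a sketch: struct toy_rwlock and
toy_reader_may_enter() are hypothetical names made up for this example,
not the kernel's qrwlock code, and the "in_irq" flag stands in for
whatever context check the real slowpath does.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy snapshot of a fair rwlock (hypothetical, not struct qrwlock). */
struct toy_rwlock {
	int  readers;        /* readers currently holding the lock */
	bool writer_holds;   /* a writer currently holds the lock */
	bool writer_waiting; /* a writer is queued, waiting for readers to drain */
};

/* Returns true if a new reader may take the lock right now. */
bool toy_reader_may_enter(const struct toy_rwlock *l, bool in_irq)
{
	assert(l->readers >= 0);

	/* Nobody can read while a writer actually holds the lock. */
	if (l->writer_holds)
		return false;

	/*
	 * Interrupt context: ignore waiting writers. Otherwise an interrupt
	 * arriving on a CPU that already holds the lock for reading would
	 * wait on a writer that is itself waiting for that same CPU.
	 */
	if (in_irq)
		return true;

	/* Process context: be fair and queue behind a waiting writer. */
	return !l->writer_waiting;
}
```

With one reader holding the lock and a writer waiting, a process-context
reader is refused while an interrupt-context reader is let in, which is
the asymmetry described above.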

But the problem here seems to be that *another* lock is also involved
(in this case apparently "host->lock"), and now if CPU1 and CPU2 get
these two locks in a different order, you can get an ABBA deadlock.
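
To make the ABBA pattern concrete, here is a minimal sketch of a
lock-order check: record "held A while acquiring B" edges and flag a
new acquisition if the reverse order is already reachable. This is a
toy under stated assumptions, not how lockdep is implemented; the
names (order_graph, record_order, would_invert) and the lock ids are
hypothetical, and which context takes which lock first is assumed
purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LOCKS 8

/* Hypothetical ids for the two locks discussed in this thread. */
enum { TASKLIST_LOCK = 0, HOST_LOCK = 1 };

/* edge[a][b] is true if some context acquired lock b while holding a. */
struct order_graph {
	bool edge[MAX_LOCKS][MAX_LOCKS];
};

/* Remember that 'acquired' was taken while 'held' was already held. */
void record_order(struct order_graph *g, int held, int acquired)
{
	assert(held >= 0 && held < MAX_LOCKS);
	assert(acquired >= 0 && acquired < MAX_LOCKS);
	g->edge[held][acquired] = true;
}

/* Depth-first search: is 'target' reachable from 'from' along edges? */
static bool reaches(const struct order_graph *g, int from, int target,
		    bool *seen)
{
	if (from == target)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_LOCKS; next++) {
		if (g->edge[from][next] && !seen[next] &&
		    reaches(g, next, target, seen))
			return true;
	}
	return false;
}

/* True if taking 'acquired' while holding 'held' would close a cycle. */
bool would_invert(const struct order_graph *g, int held, int acquired)
{
	bool seen[MAX_LOCKS] = { false };

	return reaches(g, acquired, held, seen);
}
```

If one context has done record_order(&g, TASKLIST_LOCK, HOST_LOCK),
then would_invert(&g, HOST_LOCK, TASKLIST_LOCK) reports the inversion,
while repeating the original order does not.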

And apparently our lockdep machinery doesn't catch that issue, so it
doesn't get flagged.

I'm not sure what the lockdep rules for rwlocks are, but maybe lockdep
treats rwlocks as being _always_ unfair, not knowing about that "it's
only unfair when it's in interrupt context".

Maybe we need to always make rwlock unfair? Possibly only for tasklist_lock?

Oh, how I hate tasklist_lock. It's pretty much our one remaining "one
big lock". It's been a pain for a long long time.

            Linus
