lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <cd1554bb-bbcc-ddfc-f48e-33642dd46ddf@redhat.com>
Date:   Mon, 15 Jun 2020 20:31:01 -0400
From:   Waiman Long <longman@...hat.com>
To:     Boqun Feng <boqun.feng@...il.com>,
        Matthew Wilcox <willy@...radead.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        syzbot <syzbot+a9fb1457d720a55d6dc5@...kaller.appspotmail.com>,
        Alexey Dobriyan <adobriyan@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>, allison@...utok.net,
        areber@...hat.com, aubrey.li@...ux.intel.com,
        Andrei Vagin <avagin@...il.com>,
        Bruce Fields <bfields@...ldses.org>,
        Christian Brauner <christian@...uner.io>, cyphar@...har.com,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>, guro@...com,
        Jeff Layton <jlayton@...nel.org>,
        Joel Fernandes <joel@...lfernandes.org>,
        Kees Cook <keescook@...omium.org>, linmiaohe@...wei.com,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Michal Hocko <mhocko@...e.com>, Ingo Molnar <mingo@...nel.org>,
        Oleg Nesterov <oleg@...hat.com>, sargun@...gun.me,
        syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Al Viro <viro@...iv.linux.org.uk>
Subject: Re: possible deadlock in send_sigio

On 6/15/20 8:13 PM, Boqun Feng wrote:
> Hi Matthew,
>
> On Mon, Jun 15, 2020 at 01:40:46PM -0700, Matthew Wilcox wrote:
>> On Mon, Jun 15, 2020 at 01:13:51PM -0400, Waiman Long wrote:
>>> On 6/15/20 12:49 PM, Matthew Wilcox wrote:
>>>> On Fri, Jun 12, 2020 at 03:01:01PM +0800, Boqun Feng wrote:
>>>>> On the archs using QUEUED_RWLOCKS, read_lock() is not always a recursive
>>>>> read lock, actually it's only recursive if in_interrupt() is true. So
>>>>> change the annotation accordingly to catch more deadlocks.
>>>> [...]
>>>>
>>>>> +#ifdef CONFIG_LOCKDEP
>>>>> +/*
>>>>> + * read_lock() is recursive if:
>>>>> + * 1. We force lockdep think this way in selftests or
>>>>> + * 2. The implementation is not queued read/write lock or
>>>>> + * 3. The locker is at an in_interrupt() context.
>>>>> + */
>>>>> +static inline bool read_lock_is_recursive(void)
>>>>> +{
>>>>> +	return force_read_lock_recursive ||
>>>>> +	       !IS_ENABLED(CONFIG_QUEUED_RWLOCKS) ||
>>>>> +	       in_interrupt();
>>>>> +}
>>>> I'm a bit uncomfortable with having the _lockdep_ definition of whether
>>>> a read lock is recursive depend on what the _implementation_ is.
>>>> The locking semantics should be the same, no matter which architecture
>>>> you're running on.  If we rely on read locks being recursive in common
>>>> code then we have a locking bug on architectures which don't use queued
>>>> rwlocks.
>>>>
>>>> I don't know whether we should just tell the people who aren't using
>>>> queued rwlocks that they have a new requirement or whether we should
>>>> say that read locks are never recursive, but having this inconsistency
>>>> is not a good idea!
>>> Actually, qrwlock is more restrictive. It is possible that systems with
>>> qrwlock may hit deadlock which doesn't happens in other systems that use
>>> recursive rwlock. However, the current lockdep code doesn't detect those
>>> cases.
>> Oops.  I misread.  Still, my point stands; we should have the same
>> definition of how you're allowed to use locks from the lockdep point of
>> view, even if the underlying implementation won't deadlock on a particular
>> usage model.
>>
> I understand your point, but such a change will require us to notify
> almost every developer using rwlocks and help them to get their code
> right, and that requires time and work, while currently I want to focus
> on the correctness of the detection, and without that being merged, we
> don't have a way to detect those problems. So I think it's better that
> we have the detection reviewed and tested for a while (given that x86
> uses qrwlock, so it will get a lot chances for testing), after that we
> we have the confidence (and the tool) to educate people the "new"
> semantics of rwlock. So I'd like to keep this patch as it is for now.

I agreed. We may have architectures that use recursive rwlocks and 
depend on that behavior in the arch specific code. So there can be false 
positive lockdep warnings when we force rwlock to use qrwlock behavior 
for all archs.

With the current patch, we can check the x86 and generic code to make 
sure that they don't have problem first. When other architectures decide 
to use the generic qrwlock later, they can find out if there are some 
inherent problems in their arch specific code.

Cheers,
Longman

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ