linux-kernel - Re: possible deadlock in lru_add_drain_all


Message-ID: <CACT4Y+Zi_Gqh1V7QHzUdRuYQAtNjyNU2awcPOHSQYw9TsCwEsw@mail.gmail.com>
Date:   Tue, 31 Oct 2017 16:55:32 +0300
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Michal Hocko <mhocko@...nel.org>,
        Byungchul Park <byungchul.park@....com>,
        syzbot 
        <bot+e7353c7141ff7cbb718e4c888a14fa92de41ebaa@...kaller.appspotmail.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Dan Williams <dan.j.williams@...el.com>,
        Johannes Weiner <hannes@...xchg.org>, Jan Kara <jack@...e.cz>,
        jglisse@...hat.com, LKML <linux-kernel@...r.kernel.org>,
        linux-mm@...ck.org, shli@...com, syzkaller-bugs@...glegroups.com,
        Thomas Gleixner <tglx@...utronix.de>,
        Vlastimil Babka <vbabka@...e.cz>, ying.huang@...el.com,
        kernel-team@....com
Subject: Re: possible deadlock in lru_add_drain_all

On Tue, Oct 31, 2017 at 4:51 PM, Peter Zijlstra <peterz@...radead.org> wrote:
> On Tue, Oct 31, 2017 at 02:13:33PM +0100, Michal Hocko wrote:
>> On Mon 30-10-17 16:10:09, Peter Zijlstra wrote:
>
>> > However, that splat translates like:
>> >
>> >     __cpuhp_setup_state()
>> > #0    cpus_read_lock()
>> >       __cpuhp_setup_state_cpuslocked()
>> > #1      mutex_lock(&cpuhp_state_mutex)
>> >
>> >
>> >
>> >     __cpuhp_state_add_instance()
>> > #2    mutex_lock(&cpuhp_state_mutex)
>>
>> this should be #1 right?
>
> Yes
>
>> >       cpuhp_issue_call()
>> >         cpuhp_invoke_ap_callback()
>> > #3        wait_for_completion()
>> >
>> >                                             msr_device_create()
>> >                                               ...
>> > #4                                              filename_create()
>> > #3                                          complete()
>> >
>> >
>> >
>> >     do_splice()
>> > #4    file_start_write()
>> >       do_splice_from()
>> >         iter_file_splice_write()
>> > #5        pipe_lock()
>> >           vfs_iter_write()
>> >             ...
>> > #6            inode_lock()
>> >
>> >
>> >
>> >     sys_fcntl()
>> >       do_fcntl()
>> >         shmem_fcntl()
>> > #5        inode_lock()
>
> And that #6
>
>> >           shmem_wait_for_pins()
>> >             if (!scan)
>> >               lru_add_drain_all()
>> > #0              cpus_read_lock()
>> >
>> >
>> >
>> > Which is an actual real deadlock, there is no mixing of up and down.
>>
>> thanks a lot, this made it more clear to me. It took a while to
>> actually see 0 -> 1 -> 3 -> 4 -> 5 -> 0 cycle. I have only focused
>> on lru_add_drain_all while it was holding the cpus lock.
>
> Yeah, these things are a pain to read, which is why I always construct
> something like the above first.
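
The chain above can also be checked mechanically by writing it down as a
lock-order graph and looking for a cycle. A minimal sketch in Python,
assuming the numbering with both corrections applied (#2 is really #1, and
the inode_lock in shmem_fcntl() is #6). The EDGES table and find_cycle()
are illustrative names only, not anything lockdep provides:

```python
# Lock-order edges as read from the annotated traces, corrections applied.
# Each key is "held (or waited on) first", each value is "acquired after".
EDGES = {
    0: [1],  # cpus_read_lock() -> cpuhp_state_mutex
    1: [3],  # cpuhp_state_mutex -> wait_for_completion()
    3: [4],  # the completion is only signalled after filename_create()
    4: [5],  # file_start_write() -> pipe_lock()
    5: [6],  # pipe_lock() -> inode_lock()
    6: [0],  # inode_lock() -> cpus_read_lock() in lru_add_drain_all()
}


def find_cycle(edges):
    """Return one cycle as a list of nodes (first == last), or None."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in edges.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # Back edge: nxt is on the current path, so we have a cycle.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        color[node] = BLACK
        path.pop()
        return None

    for start in sorted(edges):
        if color.get(start, WHITE) == WHITE:
            found = visit(start)
            if found:
                return found
    return None


if __name__ == "__main__":
    print(" -> ".join(str(n) for n in find_cycle(EDGES)))
```

Dropping any single edge from the table makes find_cycle() return None,
which is a quick way to see which dependency would have to go.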


I noticed that for a simple 2-lock deadlock, lockdep prints only 2
stacks. FWIW, in user-space TSAN we print 4 stacks for such deadlocks:
where A was locked, where B was locked under A, where B was locked,
and where A was locked under B. That makes it easier to figure out
what happened. For this report, however, that would come to 8 stacks,
so it's probably hard to read either way.
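
To make the 4-stack idea concrete, here is a rough sketch in Python of a
tracker that remembers both acquisition sites for every "inner taken under
outer" edge. This is not how TSAN or lockdep is actually implemented; the
class and method names are hypothetical, and short strings stand in for
real stack traces:

```python
from collections import defaultdict


class LockOrderTracker:
    """Toy lock-order recorder; 'where' strings stand in for stack traces."""

    def __init__(self):
        self.held = defaultdict(list)  # thread -> [(lock, where)]
        self.edges = {}                # (outer, inner) -> (outer_where, inner_where)

    def acquire(self, thread, lock, where):
        # Record one edge per lock already held by this thread, keeping the
        # first pair of acquisition sites seen for that edge.
        for outer, outer_where in self.held[thread]:
            self.edges.setdefault((outer, lock), (outer_where, where))
        self.held[thread].append((lock, where))

    def release(self, thread, lock):
        self.held[thread] = [(l, w) for l, w in self.held[thread] if l != lock]

    def report_inversion(self, a, b):
        """Return the 4 sites for an A/B inversion, or None if there is none."""
        if (a, b) in self.edges and (b, a) in self.edges:
            a_where, b_under_a = self.edges[(a, b)]
            b_where, a_under_b = self.edges[(b, a)]
            return [a_where, b_under_a, b_where, a_under_b]
        return None


if __name__ == "__main__":
    t = LockOrderTracker()
    t.acquire("T1", "A", "f1")
    t.acquire("T1", "B", "f2")
    t.release("T1", "B")
    t.release("T1", "A")
    t.acquire("T2", "B", "g1")
    t.acquire("T2", "A", "g2")
    print(t.report_inversion("A", "B"))
```

The report order matches the list above: A locked, B locked under A, B
locked, A locked under B. With two sites stored per edge, a longer cycle
costs two stacks per edge, which is why the output grows quickly.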