Message-Id: <4e6c4e45-bbd6-3fd8-ee96-216892c797b3@linux.vnet.ibm.com>
Date:   Fri, 15 Sep 2017 14:38:51 +0200
From:   Laurent Dufour <ldufour@...ux.vnet.ibm.com>
To:     Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
Cc:     paulmck@...ux.vnet.ibm.com, peterz@...radead.org,
        akpm@...ux-foundation.org, kirill@...temov.name,
        ak@...ux.intel.com, mhocko@...nel.org, dave@...olabs.net,
        jack@...e.cz, Matthew Wilcox <willy@...radead.org>,
        benh@...nel.crashing.org, mpe@...erman.id.au, paulus@...ba.org,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, hpa@...or.com,
        Will Deacon <will.deacon@....com>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        haren@...ux.vnet.ibm.com, khandual@...ux.vnet.ibm.com,
        npiggin@...il.com, bsingharora@...il.com,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        linuxppc-dev@...ts.ozlabs.org, x86@...nel.org
Subject: Re: [PATCH v3 04/20] mm: VMA sequence count

Hi,

On 14/09/2017 11:40, Sergey Senozhatsky wrote:
> On (09/14/17 11:15), Laurent Dufour wrote:
>> On 14/09/2017 11:11, Sergey Senozhatsky wrote:
>>> On (09/14/17 10:58), Laurent Dufour wrote:
>>> [..]
>>>> That's right, but here it is the sequence counter mm->mm_seq, not the
>>>> vm_seq one.
>>>
>>> d'oh... you are right.
>>
>> So I doubt a deadlock is really likely here, but I don't like to see
>> lockdep complaining. Is there an easy way to make it happy?
> 
> 
>  /*
>   * well... answering your question - it seems raw versions of seqcount
>   * functions don't call lockdep's lock_acquire/lock_release...
>   *
>   * but I have never told you that. never.
>   */

Hum... I'm not sure that would be the best way, since in the other cases the
lockdep checks are valid, but if getting rid of lockdep's warning is required
to get this series upstream, I'd use the raw versions... Please advise...
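
Just to illustrate what that would mean (a rough sketch only; the
vma_seq_write_begin/end wrapper names are made up here, and vm_sequence is
the field this series adds to struct vm_area_struct):

	static inline void vma_seq_write_begin(struct vm_area_struct *vma)
	{
		/*
		 * raw_write_seqcount_begin() only bumps the counter and
		 * issues the write barrier; unlike write_seqcount_begin()
		 * it does not go through seqcount_acquire(), so lockdep
		 * never sees the counter and stops complaining.
		 */
		raw_write_seqcount_begin(&vma->vm_sequence);
	}

	static inline void vma_seq_write_end(struct vm_area_struct *vma)
	{
		raw_write_seqcount_end(&vma->vm_sequence);
	}

The downside is that lockdep then no longer checks any ordering involving
the counter, which is why I'm reluctant to do it everywhere.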

> 
> lockdep can perhaps be wrong sometimes, and maybe this is one of those
> cases. Maybe not... I'm not an MM guy myself.

From reading the code, I can't see any valid reason for a circular lock
dependency.

> below is a lockdep splat I got yesterday. that's v3 of SPF patch set.

This is exactly the same splat you got previously, and I still can't see how
the chain "&mapping->i_mmap_rwsem --> &vma->vm_sequence/1 --> fs_reclaim"
could happen.
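
For reference, the "/1" in "&vma->vm_sequence/1" is just the lockdep
subclass: __vma_adjust() may write two VMAs' counters at once, so the second
one is annotated with SINGLE_DEPTH_NESTING (== 1), roughly like this (a
sketch of the pattern, not the exact hunk from the patch):

	write_seqcount_begin(&vma->vm_sequence);
	if (next)
		write_seqcount_begin_nested(&next->vm_sequence,
					    SINGLE_DEPTH_NESTING);
	/* ... adjust the VMAs ... */
	if (next)
		write_seqcount_end(&next->vm_sequence);
	write_seqcount_end(&vma->vm_sequence);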

Cheers,
Laurent.

> 
> [ 2763.365898] ======================================================
> [ 2763.365899] WARNING: possible circular locking dependency detected
> [ 2763.365902] 4.13.0-next-20170913-dbg-00039-ge3c06ea4b028-dirty #1837 Not tainted
> [ 2763.365903] ------------------------------------------------------
> [ 2763.365905] khugepaged/42 is trying to acquire lock:
> [ 2763.365906]  (&mapping->i_mmap_rwsem){++++}, at: [<ffffffff811181cc>] rmap_walk_file+0x5a/0x142
> [ 2763.365913] 
>                but task is already holding lock:
> [ 2763.365915]  (fs_reclaim){+.+.}, at: [<ffffffff810e99dc>] fs_reclaim_acquire+0x12/0x35
> [ 2763.365920] 
>                which lock already depends on the new lock.
> 
> [ 2763.365922] 
>                the existing dependency chain (in reverse order) is:
> [ 2763.365924] 
>                -> #3 (fs_reclaim){+.+.}:
> [ 2763.365930]        lock_acquire+0x176/0x19e
> [ 2763.365932]        fs_reclaim_acquire+0x32/0x35
> [ 2763.365934]        __alloc_pages_nodemask+0x6d/0x1f9
> [ 2763.365937]        pte_alloc_one+0x17/0x62
> [ 2763.365940]        __pte_alloc+0x1f/0x83
> [ 2763.365943]        move_page_tables+0x2c3/0x5a2
> [ 2763.365944]        move_vma.isra.25+0xff/0x29f
> [ 2763.365946]        SyS_mremap+0x41b/0x49e
> [ 2763.365949]        entry_SYSCALL_64_fastpath+0x18/0xad
> [ 2763.365951] 
>                -> #2 (&vma->vm_sequence/1){+.+.}:
> [ 2763.365955]        lock_acquire+0x176/0x19e
> [ 2763.365958]        write_seqcount_begin_nested+0x1b/0x1d
> [ 2763.365959]        __vma_adjust+0x1c4/0x5f1
> [ 2763.365961]        __split_vma+0x12c/0x181
> [ 2763.365963]        do_munmap+0x128/0x2af
> [ 2763.365965]        vm_munmap+0x5a/0x73
> [ 2763.365968]        elf_map+0xb1/0xce
> [ 2763.365970]        load_elf_binary+0x91e/0x137a
> [ 2763.365973]        search_binary_handler+0x70/0x1f3
> [ 2763.365974]        do_execveat_common+0x45e/0x68e
> [ 2763.365978]        call_usermodehelper_exec_async+0xf7/0x11f
> [ 2763.365980]        ret_from_fork+0x27/0x40
> [ 2763.365981] 
>                -> #1 (&vma->vm_sequence){+.+.}:
> [ 2763.365985]        lock_acquire+0x176/0x19e
> [ 2763.365987]        write_seqcount_begin_nested+0x1b/0x1d
> [ 2763.365989]        __vma_adjust+0x1a9/0x5f1
> [ 2763.365991]        __split_vma+0x12c/0x181
> [ 2763.365993]        do_munmap+0x128/0x2af
> [ 2763.365994]        vm_munmap+0x5a/0x73
> [ 2763.365996]        elf_map+0xb1/0xce
> [ 2763.365998]        load_elf_binary+0x91e/0x137a
> [ 2763.365999]        search_binary_handler+0x70/0x1f3
> [ 2763.366001]        do_execveat_common+0x45e/0x68e
> [ 2763.366003]        call_usermodehelper_exec_async+0xf7/0x11f
> [ 2763.366005]        ret_from_fork+0x27/0x40
> [ 2763.366006] 
>                -> #0 (&mapping->i_mmap_rwsem){++++}:
> [ 2763.366010]        __lock_acquire+0xa72/0xca0
> [ 2763.366012]        lock_acquire+0x176/0x19e
> [ 2763.366015]        down_read+0x3b/0x55
> [ 2763.366017]        rmap_walk_file+0x5a/0x142
> [ 2763.366018]        page_referenced+0xfc/0x134
> [ 2763.366022]        shrink_active_list+0x1ac/0x37d
> [ 2763.366024]        shrink_node_memcg.constprop.72+0x3ca/0x567
> [ 2763.366026]        shrink_node+0x3f/0x14c
> [ 2763.366028]        try_to_free_pages+0x288/0x47a
> [ 2763.366030]        __alloc_pages_slowpath+0x3a7/0xa49
> [ 2763.366032]        __alloc_pages_nodemask+0xf1/0x1f9
> [ 2763.366035]        khugepaged+0xc8/0x167c
> [ 2763.366037]        kthread+0x133/0x13b
> [ 2763.366039]        ret_from_fork+0x27/0x40
> [ 2763.366040] 
>                other info that might help us debug this:
> 
> [ 2763.366042] Chain exists of:
>                  &mapping->i_mmap_rwsem --> &vma->vm_sequence/1 --> fs_reclaim
> 
> [ 2763.366048]  Possible unsafe locking scenario:
> 
> [ 2763.366049]        CPU0                    CPU1
> [ 2763.366050]        ----                    ----
> [ 2763.366051]   lock(fs_reclaim);
> [ 2763.366054]                                lock(&vma->vm_sequence/1);
> [ 2763.366056]                                lock(fs_reclaim);
> [ 2763.366058]   lock(&mapping->i_mmap_rwsem);
> [ 2763.366061] 
>                 *** DEADLOCK ***
> 
> [ 2763.366063] 1 lock held by khugepaged/42:
> [ 2763.366064]  #0:  (fs_reclaim){+.+.}, at: [<ffffffff810e99dc>] fs_reclaim_acquire+0x12/0x35
> [ 2763.366068] 
>                stack backtrace:
> [ 2763.366071] CPU: 2 PID: 42 Comm: khugepaged Not tainted 4.13.0-next-20170913-dbg-00039-ge3c06ea4b028-dirty #1837
> [ 2763.366073] Call Trace:
> [ 2763.366077]  dump_stack+0x67/0x8e
> [ 2763.366080]  print_circular_bug+0x2a1/0x2af
> [ 2763.366083]  ? graph_unlock+0x69/0x69
> [ 2763.366085]  check_prev_add+0x76/0x20d
> [ 2763.366087]  ? graph_unlock+0x69/0x69
> [ 2763.366090]  __lock_acquire+0xa72/0xca0
> [ 2763.366093]  ? __save_stack_trace+0xa3/0xbf
> [ 2763.366096]  lock_acquire+0x176/0x19e
> [ 2763.366098]  ? rmap_walk_file+0x5a/0x142
> [ 2763.366100]  down_read+0x3b/0x55
> [ 2763.366102]  ? rmap_walk_file+0x5a/0x142
> [ 2763.366103]  rmap_walk_file+0x5a/0x142
> [ 2763.366106]  page_referenced+0xfc/0x134
> [ 2763.366108]  ? page_vma_mapped_walk_done.isra.17+0xb/0xb
> [ 2763.366109]  ? page_get_anon_vma+0x6d/0x6d
> [ 2763.366112]  shrink_active_list+0x1ac/0x37d
> [ 2763.366115]  shrink_node_memcg.constprop.72+0x3ca/0x567
> [ 2763.366118]  ? ___might_sleep+0xd5/0x234
> [ 2763.366121]  shrink_node+0x3f/0x14c
> [ 2763.366123]  try_to_free_pages+0x288/0x47a
> [ 2763.366126]  __alloc_pages_slowpath+0x3a7/0xa49
> [ 2763.366128]  ? ___might_sleep+0xd5/0x234
> [ 2763.366131]  __alloc_pages_nodemask+0xf1/0x1f9
> [ 2763.366133]  khugepaged+0xc8/0x167c
> [ 2763.366138]  ? remove_wait_queue+0x47/0x47
> [ 2763.366140]  ? collapse_shmem.isra.45+0x828/0x828
> [ 2763.366142]  kthread+0x133/0x13b
> [ 2763.366145]  ? __list_del_entry+0x1d/0x1d
> [ 2763.366147]  ret_from_fork+0x27/0x40
> 
> 	-ss
> 
