Message-ID: <CAOUHufaR3Pbv_PgGpzfmzzCHLLwBkHT8G-RcDfe1bo0zESekPw@mail.gmail.com>
Date: Tue, 19 Apr 2022 17:22:45 -0600
From: Yu Zhao <yuzhao@...gle.com>
To: Liam Howlett <liam.howlett@...cle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
"maple-tree@...ts.infradead.org" <maple-tree@...ts.infradead.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v7 00/70] Introducing the Maple Tree
On Tue, Apr 19, 2022 at 5:18 PM Liam Howlett <liam.howlett@...cle.com> wrote:
>
> * Yu Zhao <yuzhao@...gle.com> [220419 17:59]:
> > On Tue, Apr 19, 2022 at 9:51 AM Liam Howlett <liam.howlett@...cle.com> wrote:
> > >
> > > * Yu Zhao <yuzhao@...gle.com> [220416 15:30]:
> > > > On Sat, Apr 16, 2022 at 9:19 AM Liam Howlett <liam.howlett@...cle.com> wrote:
> > > > >
> > > >
> > > > <snipped>
> > > >
> > > > > How did you hit this issue? Just on boot?
> > > >
> > > > I was hoping this was already known to you, or that you had something I could verify for you.
> > >
> > >
> > > Thanks, yes. I believe both crashes share the same root cause: I was
> > > not cleaning up after a kmem bulk allocation failure on my side.
> > > Please test with this patch.
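As I understand it, this is the usual bulk-allocation pitfall. A
minimal sketch of the cleanup pattern -- hypothetical names, not the
actual maple tree code:

	void *nodes[WANTED];	/* WANTED: hypothetical request count */
	int got;

	got = kmem_cache_alloc_bulk(cache, GFP_KERNEL, WANTED, nodes);
	if (got < WANTED) {
		/*
		 * On failure the bulk API may hand back fewer objects
		 * than requested. Whatever was allocated has to be
		 * returned, and any caller-side bookkeeping (request
		 * counts, saved slot pointers) reset, or a later retry
		 * operates on stale state.
		 */
		kmem_cache_free_bulk(cache, got, nodes);
		return -ENOMEM;
	}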
> >
> > Thanks. I applied this patch and hit a LOCKDEP and then a BUG_ON:
> >
> > lib/maple_tree.c:847 suspicious rcu_dereference_protected() usage!
> > Call Trace:
> > <TASK>
> > dump_stack_lvl+0x6c/0x9a
> > dump_stack+0x10/0x12
> > lockdep_rcu_suspicious+0x12c/0x140
> > __mt_destroy+0x96/0xd0
> > exit_mmap+0x2a0/0x360
> > __mmput+0x34/0x100
> > mmput+0x2f/0x40
> > free_bprm+0x64/0xe0
> > kernel_execve+0x129/0x330
> > call_usermodehelper_exec_async+0xd8/0x130
> > ? proc_cap_handler+0x210/0x210
> > ret_from_fork+0x1f/0x30
> > </TASK>
>
> Thanks - I'm not sure how this got through, but this should fix it.
>
> This should be folded into 4236a642ad185 to avoid the LOCKDEP issue.
>
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -3163,9 +3163,9 @@ void exit_mmap(struct mm_struct *mm)
>
> BUG_ON(count != mm->map_count);
>
> - mmap_write_unlock(mm);
> trace_exit_mmap(mm);
> __mt_destroy(&mm->mm_mt);
> + mmap_write_unlock(mm);
> vm_unacct_memory(nr_accounted);
> }
Will try this.
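For context, my reading of why the ordering matters: teardown
dereferences tree slots with rcu_dereference_protected(), whose lockdep
condition expects the protecting lock to still be held. Illustrative
shape only -- the exact condition in lib/maple_tree.c may differ:

	/*
	 * Legal only while the protecting lock is held; lockdep checks
	 * the condition at runtime, hence the splat when __mt_destroy()
	 * ran after mmap_write_unlock().
	 */
	entry = rcu_dereference_protected(slot,
			lockdep_is_held(&mm->mmap_lock));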
> > BUG: unable to handle page fault for address: ffffa6072aff0060
> > RIP: 0010:mab_calc_split+0x103/0x1a0
> > Code: 29 c1 8b 86 64 02 00 00 0f b6 80 dc 7d a7 96 39 c1 7e 05 83 c3
> > 01 eb 06 81 c3 ff 00 00 00 0f b6 c3 45 84 d2 74 3f 41 0f b6 ca <48> 83
> > bc ce 10 01 00 00 00 75 2d 41 83 c0 ff 41 39 c8 7e 20 0f b6
> > RSP: 0018:ffffa6072afef6d0 EFLAGS: 00010286
> > RAX: 0000000000000054 RBX: 0000000000000154 RCX: 00000000000000aa
> > RDX: ffffa6072afef83f RSI: ffffa6072afefa00 RDI: ffffa6072afefe80
> > RBP: ffffa6072afef6e0 R08: 0000000000000010 R09: 00000000000000ff
> > R10: 00000000000000aa R11: 0000000000000001 R12: 00000000000000ff
> > R13: ffffa6072afefa00 R14: ffffa6072afef9c0 R15: 0000000000000008
> > FS: 0000000001d75340(0000) GS:ffff8a56bf980000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: ffffa6072aff0060 CR3: 00000004986ca002 CR4: 00000000001706e0
> > Call Trace:
> > <TASK>
> > mas_spanning_rebalance+0x416/0x2060
> > mas_wr_store_entry+0xa6d/0xa80
> > mas_store_gfp+0xf6/0x170
> > do_mas_align_munmap+0x32b/0x5c0
> > do_mas_munmap+0xf3/0x110
> > __vm_munmap+0xd4/0x180
> > __x64_sys_munmap+0x1b/0x20
> > do_syscall_64+0x44/0xa0
> >
> > $ ./scripts/faddr2line vmlinux mab_calc_split+0x103
> > mab_calc_split+0x103/0x1a0:
> > mab_no_null_split at lib/maple_tree.c:1787
> > (inlined by) mab_calc_split at lib/maple_tree.c:1866
>
> 1787 is "if (!b_node->slot[split]) {" Does this line up with your code?
Yes.
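FWIW, decoding the faulting instruction looks consistent with that
line (my decode, double-check me):

	48 83 bc ce 10 01 00 00 00	cmpq   $0x0,0x110(%rsi,%rcx,8)

	RSI = ffffa6072afefa00	(b_node, presumably)
	RCX = 00000000000000aa	(split == 0xaa == 170, presumably)
	RSI + 0x110 + 0xaa * 8 = ffffa6072aff0060 == CR2

i.e. the NULL test of b_node->slot[split] read past the end of the
on-stack big node: RSP is ffffa6072afef6d0 and CR2 lands just past
that stack page.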
> How did you trigger this?
stress-ng --class vm -a 20 -t 600s --temp-path /tmpdir/ (same test
environment as in my first report)