Message-ID: <CAMZfGtUG9GSRSp6fQ6AD6MFemX9ZS=XYWFceMPjVH7LATamUKg@mail.gmail.com>
Date: Fri, 25 Mar 2022 17:51:51 +0800
From: Muchun Song <songmuchun@...edance.com>
To: syzbot <syzbot+f8c45ccc7d5d45fc5965@...kaller.appspotmail.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>,
Linux Memory Management List <linux-mm@...ck.org>,
syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [syzbot] general protection fault in list_lru_add
On Thu, Mar 24, 2022 at 6:03 AM syzbot
<syzbot+f8c45ccc7d5d45fc5965@...kaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 6b1f86f8e9c7 Merge tag 'folio-5.18b' of git://git.infradea..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1330b513700000
> kernel config: https://syzkaller.appspot.com/x/.config?x=b99d35252f93aed2
> dashboard link: https://syzkaller.appspot.com/bug?extid=f8c45ccc7d5d45fc5965
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=142a1f25700000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1618e40b700000
>
> The issue was bisected to:
>
> commit 5abc1e37afa0335c52608d640fd30910b2eeda21
> Author: Muchun Song <songmuchun@...edance.com>
> Date: Tue Mar 22 21:41:19 2022 +0000
>
> mm: list_lru: allocate list_lru_one only when needed
>
> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=13ea4c71700000
> final oops: https://syzkaller.appspot.com/x/report.txt?x=101a4c71700000
> console output: https://syzkaller.appspot.com/x/log.txt?x=17ea4c71700000
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+f8c45ccc7d5d45fc5965@...kaller.appspotmail.com
> Fixes: 5abc1e37afa0 ("mm: list_lru: allocate list_lru_one only when needed")
>
> general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN
> KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
> CPU: 0 PID: 2964 Comm: udevd Tainted: G W 5.17.0-syzkaller-02172-g6b1f86f8e9c7 #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> RIP: 0010:list_add_tail include/linux/list.h:102 [inline]
> RIP: 0010:list_lru_add+0x277/0x510 mm/list_lru.c:129
> Code: 04 64 4d 8d 7c c7 10 4c 89 3c 24 e8 c3 f6 ca ff 49 8d 47 08 48 89 c2 48 89 44 24 10 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 4d 02 00 00 4d 8b 77 08 48 89 df 48 8b 14 24 4c
> RSP: 0018:ffffc90002c17db0 EFLAGS: 00010202
> RAX: dffffc0000000000 RBX: ffff88823bc54fc0 RCX: 0000000000000000
> RDX: 0000000000000001 RSI: ffffffff81acf7ad RDI: ffffffff8d93ddd0
> RBP: ffff8880256da7f0 R08: 0000000000000000 R09: ffffffff8d93ddd7
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
> R13: ffff88807fb2a880 R14: 0000000000000080 R15: 0000000000000000
> FS: 00007f711b82e840(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f565fc7a718 CR3: 000000001a735000 CR4: 00000000003506f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <TASK>
> d_lru_add fs/dcache.c:431 [inline]
> retain_dentry fs/dcache.c:685 [inline]
> dput+0x7a7/0xdb0 fs/dcache.c:908
> __fput+0x3ab/0x9f0 fs/file_table.c:330
> task_work_run+0xdd/0x1a0 kernel/task_work.c:164
> tracehook_notify_resume include/linux/tracehook.h:188 [inline]
> exit_to_user_mode_loop kernel/entry/common.c:190 [inline]
> exit_to_user_mode_prepare+0x27e/0x290 kernel/entry/common.c:222
> __syscall_exit_to_user_mode_work kernel/entry/common.c:304 [inline]
> syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:315
> do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
> entry_SYSCALL_64_after_hwframe+0x44/0xae
> RIP: 0033:0x7f711b92a467
> Code: 44 00 00 48 8b 15 11 aa 0c 00 f7 d8 64 89 02 b8 ff ff ff ff eb bc 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 e1 a9 0c 00 f7 d8 64 89 02 b8
> RSP: 002b:00007ffe9fd16aa8 EFLAGS: 00000202 ORIG_RAX: 0000000000000003
> RAX: 0000000000000000 RBX: 000055991078b240 RCX: 00007f711b92a467
> RDX: 00007f711b9f1780 RSI: 0000000000000000 RDI: 000000000000000c
> RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f711b9f5a60
> R10: 0000000000000200 R11: 0000000000000202 R12: 00007f711b9f2380
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> </TASK>
> Modules linked in:
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:list_add_tail include/linux/list.h:102 [inline]
> RIP: 0010:list_lru_add+0x277/0x510 mm/list_lru.c:129
> Code: 04 64 4d 8d 7c c7 10 4c 89 3c 24 e8 c3 f6 ca ff 49 8d 47 08 48 89 c2 48 89 44 24 10 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 4d 02 00 00 4d 8b 77 08 48 89 df 48 8b 14 24 4c
> RSP: 0018:ffffc90002c17db0 EFLAGS: 00010202
> RAX: dffffc0000000000 RBX: ffff88823bc54fc0 RCX: 0000000000000000
> RDX: 0000000000000001 RSI: ffffffff81acf7ad RDI: ffffffff8d93ddd0
> RBP: ffff8880256da7f0 R08: 0000000000000000 R09: ffffffff8d93ddd7
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
> R13: ffff88807fb2a880 R14: 0000000000000080 R15: 0000000000000000
> FS: 00007f711b82e840(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f565fc7a718 CR3: 000000001a735000 CR4: 00000000003506f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> ----------------
> Code disassembly (best guess):
> 0: 04 64 add $0x64,%al
> 2: 4d 8d 7c c7 10 lea 0x10(%r15,%rax,8),%r15
> 7: 4c 89 3c 24 mov %r15,(%rsp)
> b: e8 c3 f6 ca ff callq 0xffcaf6d3
> 10: 49 8d 47 08 lea 0x8(%r15),%rax
> 14: 48 89 c2 mov %rax,%rdx
> 17: 48 89 44 24 10 mov %rax,0x10(%rsp)
> 1c: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
> 23: fc ff df
> 26: 48 c1 ea 03 shr $0x3,%rdx
> * 2a: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) <-- trapping instruction
> 2e: 0f 85 4d 02 00 00 jne 0x281
> 34: 4d 8b 77 08 mov 0x8(%r15),%r14
> 38: 48 89 df mov %rbx,%rdi
> 3b: 48 8b 14 24 mov (%rsp),%rdx
> 3f: 4c rex.WR
>
>
I can reproduce this locally (base commit: 5abc1e37afa0335c52608d640fd30910b2eeda21)
and have some updates. I applied the debugging patch below to print more
information.
We can see that the dentry (ffff88807ebda0f8) is put into the
list_lru (ffff888011bd47f0), but no struct list_lru_one has been
allocated for the memcg (ffff88801c530000), so the kernel panics in
list_lru_add().
I also added a pr_info() to memcg_list_lru_alloc() that prints the
address of each struct list_lru for which a struct list_lru_one is
allocated. However, I cannot find the corresponding message in the
full dmesg (see the attachment).
It seems that this dentry (ffff88807ebda0f8) was not allocated by
kmem_cache_alloc_lru(). I have not found the root cause yet and
will continue to investigate.
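As a side note on reading the oops above, the faulting address can be checked by hand. The sketch below is a userspace illustration only, not kernel code: the struct layouts are stand-ins assumed to match the kernel's (list_head first in list_lru_one, 64-bit pointers), and the helper names are hypothetical. The shadow offset and shift are taken from the register dump and disassembly (RAX = dffffc0000000000, shr $0x3).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins; layouts are assumed to mirror the kernel's. */
struct list_head {
	struct list_head *next;
	struct list_head *prev;
};

struct list_lru_one {
	struct list_head list;
	long nr_items;
};

/* Values read from the oops: movabs $0xdffffc0000000000,%rax; shr $0x3,%rdx */
#define SHADOW_OFFSET 0xdffffc0000000000ULL
#define SHADOW_SHIFT  3

/* Address of the KASAN shadow byte that covers 'addr'. */
uint64_t kasan_shadow_addr(uint64_t addr)
{
	return (addr >> SHADOW_SHIFT) + SHADOW_OFFSET;
}

/*
 * Address touched when list_add_tail() reads l->list.prev and
 * l is NULL: just the offset of 'prev' inside list_lru_one.
 */
uint64_t null_lru_prev_addr(void)
{
	return (uint64_t)(offsetof(struct list_lru_one, list) +
			  offsetof(struct list_head, prev));
}

/* Hypothetical self-check tying the two numbers to the report (LP64 only). */
void check_oops_arithmetic(void)
{
	/* Falls in the reported range [0x8-0xf]. */
	assert(null_lru_prev_addr() == 0x8);
	/* Matches the non-canonical address in the GPF line. */
	assert(kasan_shadow_addr(null_lru_prev_addr()) ==
	       0xdffffc0000000001ULL);
}
```

Under these assumptions, a NULL struct list_lru_one pointer gives a read at 0x8, and the KASAN check for that address lands on 0xdffffc0000000001, which is consistent with the report.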
diff --git a/mm/list_lru.c b/mm/list_lru.c
index fc938d8ff48f..9dd9424cea4f 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -39,6 +39,7 @@ static void list_lru_unregister(struct list_lru *lru)
if (!list_lru_memcg_aware(lru))
return;
+ pr_info("smcdef: list_lru_unregister: %px\n", lru);
mutex_lock(&list_lrus_mutex);
list_del(&lru->list);
mutex_unlock(&list_lrus_mutex);
@@ -76,6 +77,7 @@ list_lru_from_kmem(struct list_lru *lru, int nid, void *ptr,
struct list_lru_node *nlru = &lru->node[nid];
struct list_lru_one *l = &nlru->lru;
struct mem_cgroup *memcg = NULL;
+ int kmemcg_id;
if (!lru->mlrus)
goto out;
@@ -84,7 +86,16 @@ list_lru_from_kmem(struct list_lru *lru, int nid, void *ptr,
if (!memcg)
goto out;
- l = list_lru_from_memcg_idx(lru, nid, memcg_cache_id(memcg));
+ kmemcg_id = memcg_cache_id(memcg);
+ l = list_lru_from_memcg_idx(lru, nid, kmemcg_id);
+ if (!l) {
+ pr_info("the memcg(%px)->objcg(%px), kmemcg_id: %d, ptr: %px, lru: %px\n",
+ memcg, memcg->objcg, kmemcg_id, ptr, lru);
+ memcg = parent_mem_cgroup(memcg);
+ kmemcg_id = memcg_cache_id(memcg);
+ pr_info("the memcg(%px)->objcg(%px), kmemcg_id: %d, lru: %px\n",
+ memcg, memcg->objcg, kmemcg_id, list_lru_from_memcg_idx(lru, nid, kmemcg_id));
+ }
out:
if (memcg_ptr)
*memcg_ptr = memcg;
@@ -503,6 +514,8 @@ void memcg_drain_all_list_lrus(struct mem_cgroup *src, struct mem_cgroup *dst)
struct list_lru *lru;
int src_idx = src->kmemcg_id;
+ pr_info("smcdef offline src: %px(%d), dst: %px(%d)\n", src, src_idx, dst, dst->kmemcg_id);
+
/*
* Change kmemcg_id of this cgroup and all its descendants to the
* parent's id, and then move all entries from this cgroup's list_lrus
@@ -567,12 +580,14 @@ int memcg_list_lru_alloc(struct mem_cgroup *memcg, struct list_lru *lru,
if (!table)
return -ENOMEM;
+ pr_info("smcdef memcg->css.cgroup->level: %d, lru: %px\n", memcg->css.cgroup->level, lru);
/*
* Because the list_lru can be reparented to the parent cgroup's
* list_lru, we should make sure that this cgroup and all its
* ancestors have allocated list_lru_per_memcg.
*/
for (i = 0; memcg; memcg = parent_mem_cgroup(memcg), i++) {
+ pr_info("smcdef memcg: %px, kmemcg_id: %d\n", memcg, memcg->kmemcg_id);
if (memcg_list_lru_allocated(memcg, lru))
break;
@@ -592,10 +607,13 @@ int memcg_list_lru_alloc(struct mem_cgroup *memcg, struct list_lru *lru,
int index = table[i].memcg->kmemcg_id;
struct list_lru_per_memcg *mlru = table[i].mlru;
- if (index < 0 || rcu_dereference_protected(mlrus->mlru[index], true))
+ if (index < 0 || rcu_dereference_protected(mlrus->mlru[index], true)) {
kfree(mlru);
- else
+ pr_info("smcdef allocated i: %d, memcg: %px, kmemcg_id: %d\n", i, memcg, index);
+ } else {
+ pr_info("smcdef new i: %d, memcg: %px, kmemcg_id: %d\n", i, memcg, index);
rcu_assign_pointer(mlrus->mlru[index], mlru);
+ }
}
spin_unlock_irqrestore(&lru->lock, flags);
View attachment "file.txt" of type "text/plain" (256749 bytes)