[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <b3df7a8-c7a1-895-4fad-157ec95b8e0@google.com>
Date: Sat, 22 Jul 2023 22:13:36 -0700 (PDT)
From: Hugh Dickins <hughd@...gle.com>
To: syzbot <syzbot+fe7b1487405295d29268@...kaller.appspotmail.com>
cc: akpm@...ux-foundation.org, hughd@...gle.com,
linux-kernel@...r.kernel.org, linux-mm@...ck.org, luto@...nel.org,
peterz@...radead.org, syzkaller-bugs@...glegroups.com,
tglx@...utronix.de
Subject: Re: [syzbot] [mm?] kernel BUG in collapse_file (3)
On Mon, 17 Jul 2023, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: e32622656258 Add linux-next specific files for 20230713
> git tree: linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=16cd037aa80000
> kernel config: https://syzkaller.appspot.com/x/.config?x=55a2f8abfda98f31
> dashboard link: https://syzkaller.appspot.com/bug?extid=fe7b1487405295d29268
> compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=131922e4a80000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=14277fd8a80000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/d1c2a7ce287f/disk-e3262265.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/2041e3e43285/vmlinux-e3262265.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/44f789cdae5d/bzImage-e3262265.xz
>
> The issue was bisected to:
>
> commit 49a44d59344d1a6a4cc841d6e4a8727f99ed97bf
> Author: Hugh Dickins <hughd@...gle.com>
> Date: Wed Jul 12 04:42:19 2023 +0000
>
> mm/khugepaged: collapse_pte_mapped_thp() with mmap_read_lock()
>
> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=105af56aa80000
> final oops: https://syzkaller.appspot.com/x/report.txt?x=125af56aa80000
> console output: https://syzkaller.appspot.com/x/log.txt?x=145af56aa80000
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+fe7b1487405295d29268@...kaller.appspotmail.com
> Fixes: 49a44d59344d ("mm/khugepaged: collapse_pte_mapped_thp() with mmap_read_lock()")
>
> ------------[ cut here ]------------
> kernel BUG at mm/khugepaged.c:1785!
> invalid opcode: 0000 [#1] PREEMPT SMP KASAN
> CPU: 1 PID: 5058 Comm: syz-executor181 Not tainted 6.5.0-rc1-next-20230713-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/03/2023
> RIP: 0010:collapse_file+0x1150/0x5510 mm/khugepaged.c:1785
> Code: 89 c6 e8 e3 67 a6 ff 84 db 0f 85 66 f1 ff ff e8 a6 6c a6 ff 0f 0b e9 5a f1 ff ff c6 44 24 48 00 e9 65 f0 ff ff e8 90 6c a6 ff <0f> 0b e8 89 6c a6 ff 4d 85 ed 74 1c e8 7f 6c a6 ff 44 89 eb 31 ff
> RSP: 0018:ffffc90003bff810 EFLAGS: 00010293
> RAX: 0000000000000000 RBX: 00000000000000ff RCX: 0000000000000000
> RDX: ffff88807e618000 RSI: ffffffff81df5fb0 RDI: 0000000000000007
> RBP: 0000000777fa80ff R08: 0000000000000007 R09: 0000000000000000
> R10: 00000000000000ff R11: 0000000000000000 R12: 0000000000000000
> R13: 0000000000000000 R14: ffff8880227b3680 R15: 0000000777fa7eff
> FS: 00007fdc40a816c0(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fdc40b169f8 CR3: 00000000278a9000 CR4: 00000000003506e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <TASK>
> hpage_collapse_scan_file+0xc8d/0x1650 mm/khugepaged.c:2285
> madvise_collapse+0x52c/0xb50 mm/khugepaged.c:2729
> madvise_vma_behavior+0x200/0x1e60 mm/madvise.c:1094
> madvise_walk_vmas+0x1c6/0x2b0 mm/madvise.c:1268
> do_madvise.part.0+0x29c/0x5d0 mm/madvise.c:1448
> do_madvise mm/madvise.c:1461 [inline]
> __do_sys_madvise mm/madvise.c:1461 [inline]
> __se_sys_madvise mm/madvise.c:1459 [inline]
> __x64_sys_madvise+0x115/0x150 mm/madvise.c:1459
> do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x7fdc40ac0399
> Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 51 18 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007fdc40a81238 EFLAGS: 00000246 ORIG_RAX: 000000000000001c
> RAX: ffffffffffffffda RBX: 00007fdc40b4a308 RCX: 00007fdc40ac0399
> RDX: 0000000000000019 RSI: 000000000040c101 RDI: 0000000020000000
> RBP: 00007fdc40b4a300 R08: 00007fdc40a816c0 R09: 00007fdc40a816c0
> R10: 00007fdc40a816c0 R11: 0000000000000246 R12: 00007fdc40b4a30c
> R13: 0000000000000000 R14: 00007fffbeb44cf0 R15: 00007fffbeb44dd8
> </TASK>
> Modules linked in:
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:collapse_file+0x1150/0x5510 mm/khugepaged.c:1785
> Code: 89 c6 e8 e3 67 a6 ff 84 db 0f 85 66 f1 ff ff e8 a6 6c a6 ff 0f 0b e9 5a f1 ff ff c6 44 24 48 00 e9 65 f0 ff ff e8 90 6c a6 ff <0f> 0b e8 89 6c a6 ff 4d 85 ed 74 1c e8 7f 6c a6 ff 44 89 eb 31 ff
> RSP: 0018:ffffc90003bff810 EFLAGS: 00010293
> RAX: 0000000000000000 RBX: 00000000000000ff RCX: 0000000000000000
> RDX: ffff88807e618000 RSI: ffffffff81df5fb0 RDI: 0000000000000007
> RBP: 0000000777fa80ff R08: 0000000000000007 R09: 0000000000000000
> R10: 00000000000000ff R11: 0000000000000000 R12: 0000000000000000
> R13: 0000000000000000 R14: ffff8880227b3680 R15: 0000000777fa7eff
> FS: 00007fdc40a816c0(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fdc40a60d58 CR3: 00000000278a9000 CR4: 00000000003506f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
This was a very helpful report from syzbot (not all of them are, I know ;)
kernel BUG at mm/khugepaged.c:1785! in that tree was the
VM_BUG_ON(start & (HPAGE_PMD_NR - 1));
on coming in to collapse_file(). Which seems an unlikely thing to get
wrong, and I couldn't see why, and the repro did not repro for me.
I wouldn't usually bother to look at the linked bisection log
https://syzkaller.appspot.com/x/bisect.txt?x=105af56aa80000
but in this case it was very instructive. My first reaction to
the kinds of crash it was showing (__fput, task_work_run, hardly
any in collapse_file) made me think the bisection had gone off course.
But no: they all point to fput(), hence vma->vm_file, and my guilty
commit was blithely setting "mmap_locked = true", without realizing
that that setting is supposed to give guarantees that "vma" has been
revalidated since the mmap_lock was taken - not so.
Patch for mm-unstable follows with some others tomorrow.
Hugh
Powered by blists - more mailing lists