Message-ID: <fa9d2da7-f547-4731-af84-b4871e588bae@redhat.com>
Date: Wed, 20 Nov 2024 19:38:39 +0100
From: David Hildenbrand <david@...hat.com>
To: syzbot <syzbot+3511625422f7aa637f0d@...kaller.appspotmail.com>,
akpm@...ux-foundation.org, linux-kernel@...r.kernel.org, linux-mm@...ck.org,
syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [mm?] general protection fault in do_migrate_pages
On 20.11.24 19:11, David Hildenbrand wrote:
> On 20.11.24 17:39, David Hildenbrand wrote:
>> On 20.11.24 16:38, David Hildenbrand wrote:
>>> On 20.11.24 01:00, syzbot wrote:
>>>> Hello,
>>>>
>>>> syzbot found the following issue on:
>>>>
>>>> HEAD commit: f868cd251776 Merge tag 'drm-fixes-2024-11-16' of https://g..
>>>> git tree: upstream
>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=15473cc0580000
>>>> kernel config: https://syzkaller.appspot.com/x/.config?x=ff8e8187a30080b5
>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=3511625422f7aa637f0d
>>>> compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
>>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=17e8d130580000
>>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=159c71a7980000
>>>>
>>>> Downloadable assets:
>>>> disk image: https://storage.googleapis.com/syzbot-assets/a0d46da55993/disk-f868cd25.raw.xz
>>>> vmlinux: https://storage.googleapis.com/syzbot-assets/da57ef4813fd/vmlinux-f868cd25.xz
>>>> kernel image: https://storage.googleapis.com/syzbot-assets/3cdde892ea08/bzImage-f868cd25.xz
>>>>
>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>> Reported-by: syzbot+3511625422f7aa637f0d@...kaller.appspotmail.com
>>>>
>>>> Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN PTI
>>>> KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
>>>> CPU: 1 UID: 0 PID: 6021 Comm: syz-executor284 Not tainted 6.12.0-rc7-syzkaller-00187-gf868cd251776 #0
>>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/30/2024
>>>> RIP: 0010:migrate_to_node mm/mempolicy.c:1090 [inline]
>>>> RIP: 0010:do_migrate_pages+0x403/0x6f0 mm/mempolicy.c:1194
>>>> Code: 8b 54 24 30 41 83 c8 10 80 3a 00 4d 63 c0 0f 85 d1 02 00 00 48 89 c1 48 8b 54 24 18 48 be 00 00 00 00 00 fc ff df 48 c1 e9 03 <80> 3c 31 00 48 8b 92 b0 00 00 00 0f 85 74 02 00 00 48 8b 30 49 89
>>>> RSP: 0018:ffffc9000375fd08 EFLAGS: 00010246
>>>> RAX: 0000000000000000 RBX: ffffc9000375fd78 RCX: 0000000000000000
>>>> RDX: ffff88807e171300 RSI: dffffc0000000000 RDI: ffff88803390c044
>>>> RBP: ffff88807e171428 R08: 0000000000000014 R09: fffffbfff2039ef1
>>>> R10: ffffffff901cf78f R11: 0000000000000000 R12: 0000000000000003
>>>> R13: ffffc9000375fe90 R14: ffffc9000375fe98 R15: ffffc9000375fdf8
>>>> FS: 00005555919e1380(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> CR2: 00005555919e1ca8 CR3: 000000007f12a000 CR4: 00000000003526f0
>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>>> Call Trace:
>>>> <TASK>
>>>> kernel_migrate_pages+0x5b2/0x750 mm/mempolicy.c:1709
>>>> __do_sys_migrate_pages mm/mempolicy.c:1727 [inline]
>>>> __se_sys_migrate_pages mm/mempolicy.c:1723 [inline]
>>>> __x64_sys_migrate_pages+0x96/0x100 mm/mempolicy.c:1723
>>>> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>>>> do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
>>>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>>> RIP: 0033:0x7fedcca74af9
>>>> Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 c1 17 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
>>>> RSP: 002b:00007ffe4d85c278 EFLAGS: 00000206 ORIG_RAX: 0000000000000100
>>>> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fedcca74af9
>>>> RDX: 0000000020000000 RSI: 000000000000005a RDI: 0000000000001786
>>>> RBP: 0000000000010bf2 R08: 0000000000006080 R09: 0000000000000006
>>>> R10: 0000000020000040 R11: 0000000000000206 R12: 00007ffe4d85c28c
>>>> R13: 431bde82d7b634db R14: 0000000000000001 R15: 0000000000000001
>>>> </TASK>
>>>> Modules linked in:
>>>> ---[ end trace 0000000000000000 ]---
>>>> RIP: 0010:migrate_to_node mm/mempolicy.c:1090 [inline]
>>>> RIP: 0010:do_migrate_pages+0x403/0x6f0 mm/mempolicy.c:1194
>>>> Code: 8b 54 24 30 41 83 c8 10 80 3a 00 4d 63 c0 0f 85 d1 02 00 00 48 89 c1 48 8b 54 24 18 48 be 00 00 00 00 00 fc ff df 48 c1 e9 03 <80> 3c 31 00 48 8b 92 b0 00 00 00 0f 85 74 02 00 00 48 8b 30 49 89
>>>> RSP: 0018:ffffc9000375fd08 EFLAGS: 00010246
>>>> RAX: 0000000000000000 RBX: ffffc9000375fd78 RCX: 0000000000000000
>>>> RDX: ffff88807e171300 RSI: dffffc0000000000 RDI: ffff88803390c044
>>>> RBP: ffff88807e171428 R08: 0000000000000014 R09: fffffbfff2039ef1
>>>> R10: ffffffff901cf78f R11: 0000000000000000 R12: 0000000000000003
>>>> R13: ffffc9000375fe90 R14: ffffc9000375fe98 R15: ffffc9000375fdf8
>>>> FS: 00005555919e1380(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> CR2: 00005555919e1ca8 CR3: 000000007f12a000 CR4: 00000000003526f0
>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>>> ----------------
>>>> Code disassembly (best guess):
>>>> 0: 8b 54 24 30 mov 0x30(%rsp),%edx
>>>> 4: 41 83 c8 10 or $0x10,%r8d
>>>> 8: 80 3a 00 cmpb $0x0,(%rdx)
>>>> b: 4d 63 c0 movslq %r8d,%r8
>>>> e: 0f 85 d1 02 00 00 jne 0x2e5
>>>> 14: 48 89 c1 mov %rax,%rcx
>>>> 17: 48 8b 54 24 18 mov 0x18(%rsp),%rdx
>>>> 1c: 48 be 00 00 00 00 00 movabs $0xdffffc0000000000,%rsi
>>>> 23: fc ff df
>>>> 26: 48 c1 e9 03 shr $0x3,%rcx
>>>> * 2a: 80 3c 31 00 cmpb $0x0,(%rcx,%rsi,1) <-- trapping instruction
>>>> 2e: 48 8b 92 b0 00 00 00 mov 0xb0(%rdx),%rdx
>>>> 35: 0f 85 74 02 00 00 jne 0x2af
>>>> 3b: 48 8b 30 mov (%rax),%rsi
>>>> 3e: 49 rex.WB
>>>> 3f: 89 .byte 0x89
>>>>
>>>
>>> Hmmm, there is not much meat in this report :)
>>>
>>> The reproducer seems to execute migrate_pages() in a fork'ed child
>>> process, and kills that process after a while. Not 100% sure if the
>>> concurrent killing of the process is relevant.
>>>
>>> Before the child process calls migrate_pages(), it executes
>>> MADV_DONTFORK on the complete address space (funny, I wonder what that
>>> does ...) and then calls clone3() without CLONE_VM.
>>>
>>
>> After running it for a while in a VM with the given config:
>>
>> [ 827.514143][T37171] Oops: general protection fault, probably for
>> non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN NOPTI
>> [ 827.516614][T37171] KASAN: null-ptr-deref in range
>> [0x0000000000000000-0x0000000000000007]
>> [ 827.518162][T37171] CPU: 4 UID: 0 PID: 37171 Comm: repro4 Not tainted
>> 6.12.0-rc7-00187-gf868cd251776 #99
>> [ 827.519935][T37171] Hardware name: QEMU Standard PC (Q35 + ICH9,
>> 2009), BIOS 1.16.3-2.fc40 04/01/2014
>> [ 827.521648][T37171] RIP: 0010:do_migrate_pages+0x404/0x6e0
>> [ 827.522774][T37171] Code: 10 80 39 00 4d 63 c0 0f 85 9b 02 00 00 48
>> be 00 00 00 00 00 fc ff df 48 8b 4c 24 28 48 8b 91 b0 00 00 00 48 89 c1
>> 48 c1 e9 03 <80> 3c 31 00 0f 85 95 02 00 00 48 8b 30 49 89 d9 48 8b 4c
>> 24 08 48
>> [ 827.526342][T37171] RSP: 0018:ffffc90028157ce8 EFLAGS: 00010256
>> [ 827.527480][T37171] RAX: 0000000000000000 RBX: ffffc90028157d68 RCX:
>> 0000000000000000
>> [ 827.528942][T37171] RDX: 00007ffffffff000 RSI: dffffc0000000000 RDI:
>> ffff88811dcd8444
>> [ 827.530406][T37171] RBP: 0000000000000003 R08: 0000000000000014 R09:
>> ffff88811dcd8ad8
>> [ 827.531865][T37171] R10: ffffffff903e668f R11: 0000000000000000 R12:
>> ffffc90028157e80
>> [ 827.533341][T37171] R13: ffff8881f3a2b0a8 R14: ffffc90028157e28 R15:
>> ffffc90028157e88
>> [ 827.534806][T37171] FS: 00007f096d49f740(0000)
>> GS:ffff8881f4a00000(0000) knlGS:0000000000000000
>> [ 827.536452][T37171] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 827.537672][T37171] CR2: 00007ff2dcb96810 CR3: 00000001eed18000 CR4:
>> 0000000000750ef0
>> [ 827.539135][T37171] PKRU: 55555554
>> [ 827.539799][T37171] Call Trace:
>> [ 827.540407][T37171] <TASK>
>> [ 827.540965][T37171] ? die_addr.cold+0x8/0xd
>> [ 827.541823][T37171] ? exc_general_protection+0x147/0x240
>> [ 827.542888][T37171] ? asm_exc_general_protection+0x26/0x30
>> [ 827.543960][T37171] ? do_migrate_pages+0x404/0x6e0
>> [ 827.544915][T37171] ? do_migrate_pages+0x3cd/0x6e0
>> [ 827.545873][T37171] ? __pfx_do_migrate_pages+0x10/0x10
>> [ 827.546895][T37171] ? do_raw_spin_lock+0x12a/0x2b0
>> [ 827.547854][T37171] ? apparmor_capable+0x11c/0x3b0
>> [ 827.548818][T37171] ? srso_alias_return_thunk+0x5/0xfbef5
>> [ 827.549878][T37171] ? srso_alias_return_thunk+0x5/0xfbef5
>> [ 827.550937][T37171] ? security_capable+0x80/0x260
>> [ 827.551893][T37171] kernel_migrate_pages+0x5b7/0x750
>> [ 827.552891][T37171] ? __pfx_kernel_migrate_pages+0x10/0x10
>> [ 827.553975][T37171] ? srso_alias_return_thunk+0x5/0xfbef5
>> [ 827.555028][T37171] ? rcu_is_watching+0x12/0xc0
>> [ 827.555938][T37171] ? srso_alias_return_thunk+0x5/0xfbef5
>> [ 827.557000][T37171] __x64_sys_migrate_pages+0x96/0x100
>> [ 827.558022][T37171] ? srso_alias_return_thunk+0x5/0xfbef5
>> [ 827.559077][T37171] ? lockdep_hardirqs_on+0x7b/0x110
>> [ 827.560052][T37171] do_syscall_64+0xc7/0x250
>> [ 827.560909][T37171] entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> .. digging further, we call migrate_pages() with the pid of a process
> we created using clone3(!CLONE_VM).
>
> The crashing code is likely:
>
> vma = find_vma(mm, 0);
> 722c: e8 00 00 00 00 call 7231 <do_migrate_pages+0x3c1>
> 7231: 48 8b 7c 24 28 mov 0x28(%rsp),%rdi
> 7236: 31 f6 xor %esi,%esi
> 7238: e8 00 00 00 00 call 723d <do_migrate_pages+0x3cd>
> flags | MPOL_MF_DISCONTIG_OK, &pagelist);
> 723d: 44 8b 44 24 3c mov 0x3c(%rsp),%r8d
> nr_failed = queue_pages_range(mm, vma->vm_start, mm->task_size, &nmask,
> 7242: 48 8b 4c 24 40 mov 0x40(%rsp),%rcx
> flags | MPOL_MF_DISCONTIG_OK, &pagelist);
> 7247: 41 83 c8 10 or $0x10,%r8d
> nr_failed = queue_pages_range(mm, vma->vm_start, mm->task_size, &nmask,
> 724b: 80 39 00 cmpb $0x0,(%rcx)
> 724e: 4d 63 c0 movslq %r8d,%r8
> 7251: 0f 85 9b 02 00 00 jne 74f2 <do_migrate_pages+0x682>
> 7257: 48 be 00 00 00 00 00 movabs $0xdffffc0000000000,%rsi
> 725e: fc ff df
> 7261: 48 8b 4c 24 28 mov 0x28(%rsp),%rcx
> 7266: 48 8b 91 b0 00 00 00 mov 0xb0(%rcx),%rdx
> 726d: 48 89 c1 mov %rax,%rcx
> 7270: 48 c1 e9 03 shr $0x3,%rcx
> 7274: 80 3c 31 00 cmpb $0x0,(%rcx,%rsi,1)
>
> <--- we seem to crash here
>
> 7278: 0f 85 95 02 00 00 jne 7513 <do_migrate_pages+0x6a3>
> 727e: 48 8b 30 mov (%rax),%rsi
> 7281: 49 89 d9 mov %rbx,%r9
> 7284: 48 8b 4c 24 08 mov 0x8(%rsp),%rcx
> 7289: 48 8b 7c 24 28 mov 0x28(%rsp),%rdi
> 728e: e8 8d 9a ff ff call d20 <queue_pages_range>
> 7293: 48 89 44 24 30 mov %rax,0x30(%rsp)
> 7298: e9 c4 00 00 00 jmp 7361 <do_migrate_pages+0x4f1>
> up_read(&mm->mmap_lock);
> 729d: e8 00 00 00 00 call 72a2 <do_migrate_pages+0x432>
> 72a2: 4c 89 ef mov %r13,%rdi
> 72a5: e8 00 00 00 00 call 72aa <do_migrate_pages+0x43a>
>
>
> Which would be do_migrate_pages()->migrate_to_node():
>
> mmap_read_lock(mm);
> vma = find_vma(mm, 0);
> nr_failed = queue_pages_range(mm, vma->vm_start, mm->task_size, &nmask,
> flags | MPOL_MF_DISCONTIG_OK, &pagelist);
> mmap_read_unlock(mm);
>
> ... and it seems to crash before calling queue_pages_range() :/
>
> Did we, for some reason, get vma=NULL because someone is concurrently tearing down the MM?
I think that's exactly what's happening. Will send a fix after testing it.
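
(In the register dump, RAX -- the vma pointer -- is 0, and the trapping
instruction is KASAN's shadow check on it, which fits find_vma() returning
NULL.)

Untested, but likely something along these lines in migrate_to_node():

	mmap_read_lock(mm);
	vma = find_vma(mm, 0);
	/*
	 * The MM can have no VMAs left (or be going away entirely), in
	 * which case find_vma() returns NULL; bail out instead of
	 * dereferencing it.
	 */
	if (unlikely(!vma)) {
		mmap_read_unlock(mm);
		return 0;
	}

	nr_failed = queue_pages_range(mm, vma->vm_start, mm->task_size, &nmask,
				      flags | MPOL_MF_DISCONTIG_OK, &pagelist);
	mmap_read_unlock(mm);
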
--
Cheers,
David / dhildenb