linux-kernel - Re: [syzbot] [mm?] KCSAN: data-race in mprotect_fixup / try_to_migrate


Message-ID: <CAG48ez2sSCkrv2L3ZHjHdPf0hLeyD=4K7ab5icpOF5MXp4MrCw@mail.gmail.com>
Date: Wed, 5 Feb 2025 16:14:52 +0100
From: Jann Horn <jannh@...gle.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: syzbot <syzbot+c2e5712cbb14c95d4847@...kaller.appspotmail.com>, 
	Liam.Howlett@...cle.com, akpm@...ux-foundation.org, 
	linux-kernel@...r.kernel.org, linux-mm@...ck.org, 
	syzkaller-bugs@...glegroups.com, vbabka@...e.cz
Subject: Re: [syzbot] [mm?] KCSAN: data-race in mprotect_fixup / try_to_migrate_one

On Wed, Feb 5, 2025 at 4:11 PM Lorenzo Stoakes
<lorenzo.stoakes@...cle.com> wrote:
> On Wed, Feb 05, 2025 at 04:00:06PM +0100, Jann Horn wrote:
> > On Wed, Feb 5, 2025 at 12:41 PM syzbot
> > <syzbot+c2e5712cbb14c95d4847@...kaller.appspotmail.com> wrote:
> > > syzbot found the following issue on:
> > >
> > > HEAD commit:    d009de7d5428 Merge tag 'livepatching-for-6.14-rc2' of git:..
> > > git tree:       upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=12b678a4580000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=9e757e3762bd630b
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=c2e5712cbb14c95d4847
> > > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > >
> > > Unfortunately, I don't have any reproducer for this issue yet.
> > >
> > > Downloadable assets:
> > > disk image: https://storage.googleapis.com/syzbot-assets/9235000a1b88/disk-d009de7d.raw.xz
> > > vmlinux: https://storage.googleapis.com/syzbot-assets/098ef82f8ab3/vmlinux-d009de7d.xz
> > > kernel image: https://storage.googleapis.com/syzbot-assets/4f51f5eb5782/bzImage-d009de7d.xz
> > >
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: syzbot+c2e5712cbb14c95d4847@...kaller.appspotmail.com
> > >
> > > ==================================================================
> > > BUG: KCSAN: data-race in mprotect_fixup / try_to_migrate_one
> > >
> > > write to 0xffff888114b41700 of 8 bytes by task 6432 on cpu 1:
> > >  vm_flags_init include/linux/mm.h:875 [inline]
> > >  vm_flags_reset include/linux/mm.h:887 [inline]
> > >  mprotect_fixup+0x419/0x5e0 mm/mprotect.c:679
> > >  do_mprotect_pkey+0x6cc/0x9a0 mm/mprotect.c:840
> >
> > This is one side changing the VMA flags under the mmap lock in write mode...
> >
> > >  __do_sys_mprotect mm/mprotect.c:861 [inline]
> > >  __se_sys_mprotect mm/mprotect.c:858 [inline]
> > >  __x64_sys_mprotect+0x48/0x60 mm/mprotect.c:858
> > >  x64_sys_call+0x2770/0x2dc0 arch/x86/include/generated/asm/syscalls_64.h:11
> > >  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> > >  do_syscall_64+0xc9/0x1c0 arch/x86/entry/common.c:83
> > >  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > >
> > > read to 0xffff888114b41700 of 8 bytes by task 6418 on cpu 0:
> > >  try_to_migrate_one+0xb5a/0x12e0 mm/rmap.c:2321
> > >  rmap_walk_anon+0x28f/0x440 mm/rmap.c:2646
> >
> > ... while the other side comes through the rmap, which does not
> > involve the mmap lock. Yes, that does not have any mutual locking by
> > design, I think.
> >
> > The comments in the VMA flags code incorrectly assume that no
> > concurrency is possible here; and I think the comment in
> > mprotect_fixup() about protection by the mmap_lock has also been kinda
> > wrong since the beginning of git history.
> >
> > The VM_LOCKED check in the migration code was added by Hugh in commit
> > b74355078b655, but that's just one example syzbot stumbled over; we
> > have similar racy vm_flags reads through the rmap on other paths like:
> >
> > unmap_mapping_range_tree -> unmap_mapping_range_vma ->
> > zap_page_range_single -> unmap_single_vma -> unmap_page_range -> ...
> > -> zap_pte_range -> zap_present_ptes -> vm_normal_page
> >
> > I think the right fix might just be to make sure that we use
> > WRITE_ONCE() for these vm_flags updates, and READ_ONCE() around
> > ->vm_flags reads that can happen in rmap walk paths, though we should
> > think about the consequences of concurrently changing flags in every
> > place that gets a READ_ONCE()...
>
> Yup cool similar to my thread on this.
>
> I hate that we have these landmines waiting for us. Be good to find a way
> to explicitly annotate this, or at least comment somehow.
>
> But agreed, probably adding a READ_ONCE()/WRITE_ONCE() is appropriate at
> least for the proximate thing.
>
> It's a wonder these things don't trigger more, except you need probably
> very precise timing to do it...
>
> I can do a quick cheeky patch.

Thanks!
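
For illustration, the READ_ONCE()/WRITE_ONCE() idea discussed above can be
sketched in userspace roughly as follows. This is only a sketch under stated
assumptions, not the actual patch: the macros are simplified stand-ins for
the kernel macros of the same name, and struct vma_sketch, the SKETCH_VM_*
values, sketch_flags_reset() and sketch_is_locked() are hypothetical names
invented for this example. It needs GCC or Clang for __typeof__.

```c
/*
 * Simplified userspace stand-ins for the kernel's READ_ONCE()/WRITE_ONCE().
 * The real kernel definitions are more elaborate; these only force a single
 * volatile access of the given object.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical flag values, for illustration only. */
#define SKETCH_VM_LOCKED 0x1UL
#define SKETCH_VM_WRITE  0x2UL

/* Hypothetical stand-in for the one VMA field that matters here. */
struct vma_sketch {
	unsigned long vm_flags;
};

/*
 * Writer side (the mprotect-like path): publish the new flags with one
 * marked store, so lockless readers do not race with a plain write.
 */
void sketch_flags_reset(struct vma_sketch *vma, unsigned long flags)
{
	WRITE_ONCE(vma->vm_flags, flags);
}

/*
 * Reader side (the rmap-walk-like path, which does not hold the mmap lock):
 * take one snapshot of the flags and decide on that snapshot only, instead
 * of re-reading a field that may change underneath us.
 */
int sketch_is_locked(struct vma_sketch *vma)
{
	unsigned long flags = READ_ONCE(vma->vm_flags);

	return (flags & SKETCH_VM_LOCKED) != 0;
}
```

The marked accesses only make the race explicit and keep the compiler from
tearing or re-reading the value; each reader still has to tolerate seeing
either the old or the new flags, which is the per-call-site question raised
above.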