Message-ID: <c283ef8d-0816-4e49-849c-296bc32195cf@redhat.com>
Date: Sun, 13 Apr 2025 22:45:09 +0200
From: David Hildenbrand <david@...hat.com>
To: syzbot <syzbot+5e8feb543ca8e12e0ede@...kaller.appspotmail.com>,
akpm@...ux-foundation.org, linux-kernel@...r.kernel.org, linux-mm@...ck.org,
syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [mm?] WARNING in do_wp_page
On 13.04.25 22:20, David Hildenbrand wrote:
> On 12.04.25 20:46, syzbot wrote:
>> Hello,
>>
>> syzbot found the following issue on:
>
> Related to my recent changes
>
>>
>> HEAD commit: 0af2f6be1b42 Linux 6.15-rc1
>> git tree: upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=1766323f980000
>> kernel config: https://syzkaller.appspot.com/x/.config?x=f175b153b655dbb3
>
> CONFIG_ARCH_WANTS_THP_SWAP=y
> CONFIG_MM_ID=y
> CONFIG_TRANSPARENT_HUGEPAGE=y
> # CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS is not set
> CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
> # CONFIG_TRANSPARENT_HUGEPAGE_NEVER is not set
> CONFIG_THP_SWAP=y
> CONFIG_READ_ONLY_THP_FOR_FS=y
> # CONFIG_NO_PAGE_MAPCOUNT is not set
> CONFIG_PAGE_MAPCOUNT=y
>
>> dashboard link: https://syzkaller.appspot.com/bug?extid=5e8feb543ca8e12e0ede
>> compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
>>
>> Unfortunately, I don't have any reproducer for this issue yet.
>>
>> Downloadable assets:
>> disk image: https://storage.googleapis.com/syzbot-assets/f1d71d1bf77d/disk-0af2f6be.raw.xz
>> vmlinux: https://storage.googleapis.com/syzbot-assets/7f1638f065da/vmlinux-0af2f6be.xz
>> kernel image: https://storage.googleapis.com/syzbot-assets/9b3e49834705/bzImage-0af2f6be.xz
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: syzbot+5e8feb543ca8e12e0ede@...kaller.appspotmail.com
>>
>> ------------[ cut here ]------------
>> WARNING: CPU: 0 PID: 7165 at mm/memory.c:3738 __wp_can_reuse_large_anon_folio mm/memory.c:3738 [inline]
>
> VM_WARN_ON_ONCE(folio_entire_mapcount(folio));
>
> Which is rather unexpected. I know we had a scenario (remapping a THP?)
> where we would have a PMD mapping and a PTE mapping of an exclusive anon
> folio for a very short time. But, IIRC locking should make sure that
> that cannot be observed by some other page table walker.
>
Ah, it likely is a (harmless) race, when process A and process B
COW-share a PMD THP, and process A write-faults on a PTE mapping of the
THP while process B concurrently unmaps the PMD mapping of the THP.
In __folio_remove_rmap(), for RMAP_LEVEL_PMD in case of
CONFIG_PAGE_MAPCOUNT=y, we'll do

	folio_dec_large_mapcount(folio, vma);
	last = atomic_add_negative(-1, &folio->_entire_mapcount);

So after decrementing the large mapcount, the folio will be indicated as
"exclusive" to process A.
Process B still has to decrement the entire mapcount, but process A
might already run into the entire_mapcount sanity check.
In do_wp_page(), we'd later fail the "folio_large_mapcount(folio) !=
folio_ref_count(folio)" test until process B is completely done with
unmapping the folio.
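
To make the window concrete, here is a minimal userspace sketch of the
sequence described above. It is a toy model, not kernel code: struct
toy_folio, every toy_* helper and the one-reference-per-mapping accounting
are assumptions made for illustration, and the real checks in mm/memory.c
look at more state than this.

```c
#include <assert.h>
#include <stdbool.h>

/* 512 PTEs cover one PMD-sized THP on x86-64 with 4K pages. */
#define TOY_NR_PTES 512

/* Hypothetical stand-in for a large anon folio; not the kernel layout. */
struct toy_folio {
	int mapcount_a;      /* PTE mappings held by process A */
	int mapcount_b;      /* PMD mappings held by process B */
	int entire_mapcount; /* PMD mappings of the whole folio */
	int refcount;        /* assumed: one reference per mapping */
};

/* A maps the THP through PTEs, B maps it through a single PMD. */
struct toy_folio toy_folio_init(void)
{
	struct toy_folio f = {
		.mapcount_a = TOY_NR_PTES,
		.mapcount_b = 1,
		.entire_mapcount = 1,
		.refcount = TOY_NR_PTES + 1,
	};
	return f;
}

int toy_large_mapcount(const struct toy_folio *f)
{
	return f->mapcount_a + f->mapcount_b;
}

/* Stand-in for "the folio is indicated as exclusive to process A". */
bool toy_exclusive_to_a(const struct toy_folio *f)
{
	return f->mapcount_b == 0;
}

/* B, step 1: models folio_dec_large_mapcount(folio, vma). */
void toy_b_dec_large_mapcount(struct toy_folio *f)
{
	assert(f->mapcount_b > 0);
	f->mapcount_b--;
}

/* B, step 2: models atomic_add_negative(-1, &folio->_entire_mapcount). */
void toy_b_dec_entire_mapcount(struct toy_folio *f)
{
	assert(f->entire_mapcount > 0);
	f->entire_mapcount--;
}

/* B, step 3: B drops the reference that its PMD mapping held. */
void toy_b_put_ref(struct toy_folio *f)
{
	assert(f->refcount > 0);
	f->refcount--;
}

/* Models VM_WARN_ON_ONCE(folio_entire_mapcount(folio)) as reached by A. */
bool toy_a_sanity_check_warns(const struct toy_folio *f)
{
	return toy_exclusive_to_a(f) && f->entire_mapcount != 0;
}

/* Models the later "large mapcount != refcount" test in A's reuse path. */
bool toy_a_may_reuse(const struct toy_folio *f)
{
	return toy_exclusive_to_a(f) && toy_large_mapcount(f) == f->refcount;
}
```

Stepping through B's unmap in order: after toy_b_dec_large_mapcount()
alone, toy_a_sanity_check_warns() returns true while toy_a_may_reuse()
still returns false. Reuse only becomes possible in this model once
toy_b_put_ref() has run.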
Maybe we should just move these sanity checks after the refcount check,
or reverse the mapcount decrement order. I'll think about that.
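
A rough way to compare the two options is to replay process B's unmap one
step at a time and ask whether process A could trip the check in any
intermediate state. The sketch below does that in userspace. All names
(toy_counts, toy_order, toy_warns_*, toy_any_window) are hypothetical, the
counters are a simplification of the real folio state, and it says nothing
about other side effects that reordering might have in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_order {
	TOY_LARGE_THEN_ENTIRE, /* order quoted from __folio_remove_rmap() */
	TOY_ENTIRE_THEN_LARGE, /* reversed decrement order */
};

/* Hypothetical counters; assumes one folio reference per mapping. */
struct toy_counts {
	int a_mappings;      /* PTE mappings held by process A */
	int other_mappings;  /* mappings held by anyone other than A */
	int entire_mapcount; /* PMD mappings of the whole folio */
	int refcount;
};

/* Sanity check placed before the refcount test. */
bool toy_warns_check_first(const struct toy_counts *c)
{
	return c->other_mappings == 0 && c->entire_mapcount != 0;
}

/* Sanity check placed after the refcount test. */
bool toy_warns_check_after_refcount(const struct toy_counts *c)
{
	if (c->other_mappings != 0)
		return false; /* not exclusive to A */
	if (c->a_mappings + c->other_mappings != c->refcount)
		return false; /* reuse refused, check never reached */
	return c->entire_mapcount != 0;
}

/*
 * Replay B's unmap step by step and report whether A's check could
 * fire in the initial, any intermediate, or the final state.
 */
bool toy_any_window(enum toy_order order,
		    bool (*warns)(const struct toy_counts *))
{
	struct toy_counts c = {
		.a_mappings = 512,
		.other_mappings = 1,
		.entire_mapcount = 1,
		.refcount = 513,
	};
	bool seen;

	assert(warns);
	seen = warns(&c);

	if (order == TOY_LARGE_THEN_ENTIRE) {
		c.other_mappings--;
		seen |= warns(&c);
		c.entire_mapcount--;
		seen |= warns(&c);
	} else {
		c.entire_mapcount--;
		seen |= warns(&c);
		c.other_mappings--;
		seen |= warns(&c);
	}

	c.refcount--; /* B drops its reference last */
	seen |= warns(&c);
	return seen;
}
```

In this model only the combination of the current decrement order with the
check placed first leaves a window. Either change on its own closes it.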
--
Cheers,
David / dhildenb