Message-ID: <b9a43f6d-1865-4074-b91c-a5bd7e10f2a9@redhat.com>
Date: Thu, 5 Jun 2025 08:37:19 +0200
From: David Hildenbrand <david@...hat.com>
To: syzbot <syzbot+3b220254df55d8ca8a61@...kaller.appspotmail.com>,
Liam.Howlett@...cle.com, akpm@...ux-foundation.org, harry.yoo@...cle.com,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
lorenzo.stoakes@...cle.com, riel@...riel.com,
syzkaller-bugs@...glegroups.com, vbabka@...e.cz, Jens Axboe
<axboe@...nel.dk>, Catalin Marinas <catalin.marinas@....com>,
Jinjiang Tu <tujinjiang@...wei.com>
Subject: Re: [syzbot] [mm?] kernel BUG in try_to_unmap_one (2)
On 05.06.25 08:27, David Hildenbrand wrote:
> On 05.06.25 08:11, David Hildenbrand wrote:
>> On 05.06.25 07:38, syzbot wrote:
>>> Hello,
>>>
>>> syzbot found the following issue on:
>>>
>>> HEAD commit: d7fa1af5b33e Merge branch 'for-next/core' into for-kernelci
>>
>> Hmmm, another very odd page-table mapping related problem on that tree
>> found on arm64 only:
>
> In this particular reproducer we seem to be having MADV_HUGEPAGE and
> io_uring_setup() be racing with MADV_HWPOISON, MADV_PAGEOUT and
> io_uring_register(IORING_REGISTER_BUFFERS).
>
> I assume the issue is related to MADV_HWPOISON, MADV_PAGEOUT and
> io_uring_register racing, only. I suspect MADV_HWPOISON is trying to
> split a THP, while MADV_PAGEOUT tries paging it out.
>
> IORING_REGISTER_BUFFERS ends up in
> io_sqe_buffers_register->io_sqe_buffer_register where we GUP-fast and
> try coalescing buffers.
>
> And something about THPs is not particularly happy :)
>
Not sure if this is related to io_uring.
unmap_poisoned_folio() calls try_to_unmap() without TTU_SPLIT_HUGE_PMD.
When called from memory_failure(), we make sure to never call it on a large folio: WARN_ON(folio_test_large(folio));
However, from shrink_folio_list() we might call unmap_poisoned_folio() on a large folio, which doesn't work if it is still PMD-mapped. Maybe passing TTU_SPLIT_HUGE_PMD would fix it.
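
To illustrate the suspected failure mode, here is a small userspace model, not kernel code. All `model_*` names, the struct layout and the flag values are hypothetical stand-ins; it only assumes, as described above, that unmapping a still PMD-mapped large folio cannot succeed unless the caller asks for the PMD to be split first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's ttu_flags; values are illustrative. */
enum model_ttu_flags {
	MODEL_TTU_SPLIT_HUGE_PMD = 0x4,
	MODEL_TTU_HWPOISON       = 0x20,
};

/* Toy folio: only the two properties that matter for this argument. */
struct model_folio {
	bool large;       /* folio spans more than one page */
	bool pmd_mapped;  /* still mapped by a single PMD entry */
};

/*
 * Returns true if the unmap can proceed, false where the real code
 * would (per the analysis above) run into trouble.
 */
static bool model_try_to_unmap(struct model_folio *folio, unsigned int flags)
{
	assert(folio != NULL);

	if (folio->large && folio->pmd_mapped) {
		if (!(flags & MODEL_TTU_SPLIT_HUGE_PMD))
			return false;          /* PMD-mapped, nobody asked to split */
		folio->pmd_mapped = false;     /* model: PMD split into PTEs */
	}
	return true;
}

/* Models the poisoned-folio unmap path, with and without the proposed flag. */
static bool model_unmap_poisoned_folio(struct model_folio *folio, bool split_pmd)
{
	unsigned int flags = MODEL_TTU_HWPOISON;

	if (split_pmd)
		flags |= MODEL_TTU_SPLIT_HUGE_PMD;
	return model_try_to_unmap(folio, flags);
}
```

In this model, calling `model_unmap_poisoned_folio()` on a large, PMD-mapped folio fails without the split flag and succeeds with it, which is the behaviour the proposed fix relies on. Whether the real try_to_unmap() path behaves the same way would still need to be confirmed against the reproducer.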

Likely the relevant commit is:

commit 1b0449544c6482179ac84530b61fc192a6527bfd
Author: Jinjiang Tu <tujinjiang@...wei.com>
Date:   Tue Mar 18 16:39:39 2025 +0800

    mm/vmscan: don't try to reclaim hwpoison folio

    Syzkaller reports a bug as follows:

    Injecting memory failure for pfn 0x18b00e at process virtual address 0x20ffd000
    Memory failure: 0x18b00e: dirty swapcache page still referenced by 2 users
    Memory failure: 0x18b00e: recovery action for dirty swapcache page: Failed
    page: refcount:2 mapcount:0 mapping:0000000000000000 index:0x20ffd pfn:0x18b00e
    memcg:ffff0000dd6d9000
    anon flags: 0x5ffffe00482011(locked|dirty|arch_1|swapbacked|hwpoison|node=0|zone=2|lastcpupid=0xfffff)
    raw: 005ffffe00482011 dead000000000100 dead000000000122 ffff0000e232a7c9
    raw: 0000000000020ffd 0000000000000000 00000002ffffffff ffff0000dd6d9000
    page dumped because: VM_BUG_ON_FOLIO(!folio_test_uptodate(folio))

CCing Jinjiang Tu
--
Cheers,
David / dhildenb