Message-ID: <37e77dc4-bc73-2710-088f-f7ec0c787caf@huawei.com>
Date: Thu, 5 Jun 2025 15:37:53 +0800
From: Jinjiang Tu <tujinjiang@...wei.com>
To: David Hildenbrand <david@...hat.com>, syzbot
<syzbot+3b220254df55d8ca8a61@...kaller.appspotmail.com>,
<Liam.Howlett@...cle.com>, <akpm@...ux-foundation.org>,
<harry.yoo@...cle.com>, <linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>,
<lorenzo.stoakes@...cle.com>, <riel@...riel.com>,
<syzkaller-bugs@...glegroups.com>, <vbabka@...e.cz>, Jens Axboe
<axboe@...nel.dk>, Catalin Marinas <catalin.marinas@....com>
Subject: Re: [syzbot] [mm?] kernel BUG in try_to_unmap_one (2)
On 2025/6/5 14:37, David Hildenbrand wrote:
> On 05.06.25 08:27, David Hildenbrand wrote:
>> On 05.06.25 08:11, David Hildenbrand wrote:
>>> On 05.06.25 07:38, syzbot wrote:
>>>> Hello,
>>>>
>>>> syzbot found the following issue on:
>>>>
>>>> HEAD commit: d7fa1af5b33e Merge branch 'for-next/core' into for-kernelci
>>>
>>> Hmmm, another very odd page-table mapping related problem on that tree
>>> found on arm64 only:
>>
>> In this particular reproducer we seem to be having MADV_HUGEPAGE and
>> io_uring_setup() be racing with MADV_HWPOISON, MADV_PAGEOUT and
>> io_uring_register(IORING_REGISTER_BUFFERS).
>>
>> I assume the issue is related to MADV_HWPOISON, MADV_PAGEOUT and
>> io_uring_register racing, only. I suspect MADV_HWPOISON is trying to
>> split a THP, while MADV_PAGEOUT tries paging it out.
>>
>> IORING_REGISTER_BUFFERS ends up in
>> io_sqe_buffers_register->io_sqe_buffer_register where we GUP-fast and
>> try coalescing buffers.
>>
>> And something about THPs is not particularly happy :)
>>
>
> Not sure if related to io_uring.
>
> unmap_poisoned_folio() calls try_to_unmap() without TTU_SPLIT_HUGE_PMD.
>
> When called from memory_failure(), we make sure to never call it on a
> large folio: WARN_ON(folio_test_large(folio));
>
> However, from shrink_folio_list() we might call unmap_poisoned_folio()
> on a large folio, which doesn't work if it is still PMD-mapped. Maybe
> passing TTU_SPLIT_HUGE_PMD would fix it.
>
>
> Likely the relevant commit is:
>
> commit 1b0449544c6482179ac84530b61fc192a6527bfd
> Author: Jinjiang Tu <tujinjiang@...wei.com>
> Date: Tue Mar 18 16:39:39 2025 +0800
>
> mm/vmscan: don't try to reclaim hwpoison folio
>
> Syzkaller reports a bug as follows:
>
> Injecting memory failure for pfn 0x18b00e at process virtual address 0x20ffd000
> Memory failure: 0x18b00e: dirty swapcache page still referenced by 2 users
> Memory failure: 0x18b00e: recovery action for dirty swapcache page: Failed
> page: refcount:2 mapcount:0 mapping:0000000000000000 index:0x20ffd pfn:0x18b00e
> memcg:ffff0000dd6d9000
> anon flags: 0x5ffffe00482011(locked|dirty|arch_1|swapbacked|hwpoison|node=0|zone=2|lastcpupid=0xfffff)
> raw: 005ffffe00482011 dead000000000100 dead000000000122 ffff0000e232a7c9
> raw: 0000000000020ffd 0000000000000000 00000002ffffffff ffff0000dd6d9000
> page dumped because: VM_BUG_ON_FOLIO(!folio_test_uptodate(folio))
>
> CCing Jinjiang Tu
By the way, unmap_poisoned_folio() is also called from do_migrate_range(). There, too, the folio may be on the LRU and may be a large folio.
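
To make the suspected failure mode concrete, here is a small user-space model of David's suggestion quoted above. It is only a sketch and not kernel code: struct model_folio, the MODEL_TTU_* flags, model_try_to_unmap() and model_poison_unmap_flags() are hypothetical names that stand in for the folio state, TTU_SPLIT_HUGE_PMD, try_to_unmap() and the flag choice inside unmap_poisoned_folio(). The only assumption it encodes is the one from the thread: unmapping a folio that is still PMD-mapped does not work unless a PMD split is requested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model flags; MODEL_TTU_SPLIT_HUGE_PMD stands in for
 * the kernel's TTU_SPLIT_HUGE_PMD. */
enum model_ttu_flags {
	MODEL_TTU_NONE           = 0,
	MODEL_TTU_SPLIT_HUGE_PMD = 1 << 0,
};

/* Hypothetical, heavily simplified folio state. */
struct model_folio {
	bool large;       /* stands in for folio_test_large() */
	bool pmd_mapped;  /* still mapped through a single PMD entry */
	bool mapped;      /* mapped into at least one page table */
};

/*
 * Model of the unmap step. Returns true when the folio ends up unmapped,
 * false for the case the thread describes as "doesn't work": a folio that
 * is still PMD-mapped and no PMD split was requested.
 */
static bool model_try_to_unmap(struct model_folio *folio, unsigned int flags)
{
	assert(folio != NULL);

	if (folio->pmd_mapped) {
		if (!(flags & MODEL_TTU_SPLIT_HUGE_PMD))
			return false;          /* left mapped, nothing changed */
		folio->pmd_mapped = false;     /* model splitting the PMD into PTEs */
	}
	folio->mapped = false;
	return true;
}

/*
 * Model of the suggested change: request a PMD split whenever the
 * poisoned folio is large, whichever caller it came from.
 */
static unsigned int model_poison_unmap_flags(const struct model_folio *folio)
{
	assert(folio != NULL);

	return folio->large ? MODEL_TTU_SPLIT_HUGE_PMD : MODEL_TTU_NONE;
}
```

In this model, a large PMD-mapped folio unmapped with MODEL_TTU_NONE stays mapped, while the same folio unmapped with the flags from model_poison_unmap_flags() is split and unmapped. Whether the real fix should pass the flag or handle large folios differently in the shrink_folio_list() and do_migrate_range() paths is still open in this thread.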