Message-ID: <7eeb2a80-a73f-42b9-9f69-e2fe9fb824e5@redhat.com>
Date: Thu, 2 Oct 2025 09:25:34 +0200
From: David Hildenbrand <david@...hat.com>
To: jane.chu@...cle.com,
 syzbot <syzbot+e6367ea2fdab6ed46056@...kaller.appspotmail.com>,
 akpm@...ux-foundation.org, kernel@...kajraghav.com, linmiaohe@...wei.com,
 linux-kernel@...r.kernel.org, linux-mm@...ck.org, mcgrof@...nel.org,
 nao.horiguchi@...il.com, syzkaller-bugs@...glegroups.com, ziy@...dia.com
Subject: Re: [syzbot] [mm?] WARNING in memory_failure

On 02.10.25 01:58, jane.chu@...cle.com wrote:
> Hi, Zi Yan,
> 
> On 9/30/2025 9:51 PM, syzbot wrote:
>> Hello,
>>
>> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
>> lost connection to test machine
>>
>>
>>
>> Tested on:
>>
>> commit:         d8795075 mm/huge_memory: do not change split_huge_page..
>> git tree:       https://github.com/x-y-z/linux-dev.git fix_split_page_min_order-for-kernelci
>> console output: https://syzkaller.appspot.com/x/log.txt?x=17ce96e2580000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=714d45b6135c308e
>> dashboard link: https://syzkaller.appspot.com/bug?extid=e6367ea2fdab6ed46056
>> compiler:       Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
>> userspace arch: arm64
>>
>> Note: no patches were applied.
>>
> 
> My hunch is that
>      https://github.com/x-y-z/linux-dev.git fix_split_page_min_order-for-kernelci
> alone is not enough.  On ARM64, the page cache pages of /dev/nullb0
> in the test case probably have min_order > 0, so the THP split
> fails, as the console message shows:
> [  200.378989][T18221] Memory failure: 0x124d30: recovery action for unsplit thp: Failed
> 
> With lots of poisoned THP pages stuck in the page cache, OOM could
> trigger too soon.
> 
> I think it's worth trying the additional changes I suggested earlier:
> https://lore.kernel.org/lkml/7577871f-06be-492d-b6d7-8404d7a045e0@oracle.com/

I think that makes sense in this case. I said earlier that I didn't think
splitting made sense here at all, but as you say, it at least allows the
remainder of the folio to be reclaimed, even though we cannot proceed
with handling the remaining large folio later on.
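
To make the arithmetic concrete, here is a toy model, not kernel code.
It assumes the additional change splits the poisoned folio down to the
mapping's min_order rather than to order 0. That is an assumption about
the suggestion, not something confirmed above. The struct and function
names are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Toy model only: orders are log2 of the number of base pages. */
struct split_result {
	/* pages in the min_order folio that still contains the poisoned page */
	unsigned long stuck_pages;
	/* pages in the other folios produced by the split */
	unsigned long reclaimable_pages;
};

/*
 * Hypothetical helper: model splitting one folio of 'order' into
 * folios of 'min_order'. Exactly one resulting folio holds the
 * poisoned page; the rest are assumed to be reclaimable.
 */
static struct split_result split_to_min_order(unsigned int order,
					      unsigned int min_order)
{
	struct split_result r;

	assert(min_order <= order && order < 8 * sizeof(unsigned long));
	r.stuck_pages = 1UL << min_order;
	r.reclaimable_pages = (1UL << order) - r.stuck_pages;
	return r;
}
```

Under these assumptions, an order-9 folio (512 pages) with min_order 2
would leave 4 pages stuck and 508 reclaimable, instead of all 512
staying in the page cache when the split fails outright.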

-- 
Cheers

David / dhildenb

