lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <682c1745.a00a0220.7a43a.0084.GAE@google.com>
Date: Mon, 19 May 2025 22:46:45 -0700
From: syzbot <syzbot+2b99589e33edbe9475ca@...kaller.appspotmail.com>
To: shivankg@....com
Cc: akpm@...ux-foundation.org, baolin.wang@...ux.alibaba.com, david@...hat.com, 
	dev.jain@....com, liam.howlett@...cle.com, linux-kernel@...r.kernel.org, 
	linux-mm@...ck.org, lorenzo.stoakes@...cle.com, npache@...hat.com, 
	ryan.roberts@....com, shivankg@....com, syzkaller-bugs@...glegroups.com, 
	willy@...radead.org, ziy@...dia.com
Subject: Re: [syzbot] [mm?] WARNING in folio_large_mapcount

> On 5/19/2025 6:56 PM, David Hildenbrand wrote:
>> On 17.05.25 10:21, syzbot wrote:
>>> Hello,
>>>
>>> syzbot found the following issue on:
>>>
>>> HEAD commit:    627277ba7c23 Merge tag 'arm64_cbpf_mitigation_2025_05_08' ..
>>> git tree:       upstream
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1150f670580000
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=5929ac65be9baf3c
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=2b99589e33edbe9475ca
>>> compiler:       Debian clang version 20.1.2 (++20250402124445+58df0ef89dd6-1~exp1~20250402004600.97), Debian LLD 20.1.2
>>>
>>> Unfortunately, I don't have any reproducer for this issue yet.
>>>
>>> Downloadable assets:
>>> disk image: https://storage.googleapis.com/syzbot-assets/0a42ae72fe0e/disk-627277ba.raw.xz
>>> vmlinux: https://storage.googleapis.com/syzbot-assets/0be88297bb66/vmlinux-627277ba.xz
>>> kernel image: https://storage.googleapis.com/syzbot-assets/31808a4b1210/bzImage-627277ba.xz
>>>
>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>> Reported-by: syzbot+2b99589e33edbe9475ca@...kaller.appspotmail.com
>>>
>>> ------------[ cut here ]------------
>>> WARNING: CPU: 1 PID: 38 at ./include/linux/mm.h:1335 folio_large_mapcount+0xd0/0x110 include/linux/mm.h:1335
>> 
>> This should be
>> 
>> VM_WARN_ON_FOLIO(!folio_test_large(folio), folio);
>> 
>>> Modules linked in:
>>> CPU: 1 UID: 0 PID: 38 Comm: khugepaged Not tainted 6.15.0-rc6-syzkaller-00025-g627277ba7c23 #0 PREEMPT(full)
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
>>> RIP: 0010:folio_large_mapcount+0xd0/0x110 include/linux/mm.h:1335
>>> Code: 04 38 84 c0 75 29 8b 03 ff c0 5b 41 5e 41 5f e9 96 d2 2b 09 cc e8 d0 cb 99 ff 48 89 df 48 c7 c6 20 de 77 8b e8 a1 dc de ff 90 <0f> 0b 90 eb b6 89 d9 80 e1 07 80 c1 03 38 c1 7c cb 48 89 df e8 87
>>> RSP: 0018:ffffc90000af77e0 EFLAGS: 00010246
>>> RAX: e1fcb38c0ff8ce00 RBX: ffffea00014c8000 RCX: e1fcb38c0ff8ce00
>>> RDX: 0000000000000001 RSI: ffffffff8d9226df RDI: ffff88801e2fbc00
>>> RBP: ffffc90000af7b50 R08: ffff8880b8923e93 R09: 1ffff110171247d2
>>> R10: dffffc0000000000 R11: ffffed10171247d3 R12: 1ffffd4000299000
>>> R13: dffffc0000000000 R14: 0000000000000000 R15: dffffc0000000000
>>> FS:  0000000000000000(0000) GS:ffff8881261fb000(0000) knlGS:0000000000000000
>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: 00007ffe58f12dc0 CR3: 0000000030e04000 CR4: 00000000003526f0
>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>> Call Trace:
>>>   <TASK>
>>>   folio_mapcount include/linux/mm.h:1369 [inline]
>> 
>> And here we come through
>> 
>> if (likely(!folio_test_large(folio))) {
>>     ...
>> }
>> return folio_large_mapcount(folio);
>> 
>> 
>> So the folio is split concurrently. And I think there is nothing stopping it from getting freed.
>> 
>> We do a xas_for_each() under RCU. So yes, this is racy.
>> 
>> In  collapse_file(), we re-validate everything.
>> 
>> We could
>> 
>> (A) Take proper pagecache locks
>> 
>> (B) Try grabbing a temporary folio reference
>> 
>> (C) Try snapshotting the folio
>> 
>> Probably, in this code, (B) might be cleanest for now? Handling it just like other code in mm/filemap.c.
>> 
>
> Hi,
>
> I've implemented your suggestion (B) using folio_try_get().
> Could you please review if my patch looks correct?
>
> Tested it using existing selftests: sudo make -C tools/testing/selftests/mm run_tests
>
> Other two instances of is_refcount_suitable() uses folio locking. Should we maintain
> consistency with those?
>
> Thanks,
> Shivank
>
> #syz test

This crash does not have a reproducer. I cannot test it.


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ