linux-kernel - Re: [syzbot] [sound?] kernel BUG in filemap

Message-ID: <7e338491-0c6b-4b65-93b7-df0af8b2fd87@redhat.com>
Date: Wed, 17 Sep 2025 09:57:19 +0200
From: David Hildenbrand <david@...hat.com>
To: Jan Kara <jack@...e.cz>, Ryan Roberts <ryan.roberts@....com>
Cc: syzbot <syzbot+263f159eb37a1c4c67a4@...kaller.appspotmail.com>,
 akpm@...ux-foundation.org, chaitanyas.prakash@....com, davem@...emloft.net,
 edumazet@...gle.com, hdanton@...a.com, horms@...nel.org, kuba@...nel.org,
 kuniyu@...gle.com, linux-kernel@...r.kernel.org,
 linux-sound@...r.kernel.org, netdev@...r.kernel.org, pabeni@...hat.com,
 perex@...ex.cz, syzkaller-bugs@...glegroups.com, tiwai@...e.com,
 willemb@...gle.com
Subject: Re: [syzbot] [sound?] kernel BUG in filemap_fault (2)

On 16.09.25 15:05, Jan Kara wrote:
> On Tue 16-09-25 13:50:08, Ryan Roberts wrote:
>> On 14/09/2025 11:51, syzbot wrote:
>>> syzbot suspects this issue was fixed by commit:
>>>
>>> commit bdb86f6b87633cc020f8225ae09d336da7826724
>>> Author: Ryan Roberts <ryan.roberts@....com>
>>> Date:   Mon Jun 9 09:27:23 2025 +0000
>>>
>>>      mm/readahead: honour new_order in page_cache_ra_order()
>>
>> I'm not sure what original bug you are claiming this is fixing? Perhaps this?
>>
>> https://lore.kernel.org/linux-mm/6852b77e.a70a0220.79d0a.0214.GAE@google.com/
> 
> I think it was:
> 
> https://lore.kernel.org/all/684ffc59.a00a0220.279073.0037.GAE@google.com/
> 
> at least that's what the syzbot email replies to... And it doesn't make a
> lot of sense but it isn't totally off either. So I'd just let the syzbot
> bug autoclose after some timeout.

Hm, the issue we ran into was:

	VM_BUG_ON_FOLIO(!folio_contains(folio, index), folio);

in filemap_fault().

Now, that sounds rather bad, especially given that it was reported upstream.

So we should probably figure out what happened, check whether that commit
really fixed it and, if so, why it fixed it (stable backports etc.).

It could be that Ryan's patch is just making the problem harder to
reproduce, of course (that is what I assume right now).


Essentially we do a

	folio = filemap_get_folio(mapping, index);

followed by

	if (!lock_folio_maybe_drop_mmap(vmf, folio, &fpin))
		goto out_retry;

	/* Did it get truncated? */
	if (unlikely(folio->mapping != mapping)) {
		folio_unlock(folio);
		folio_put(folio);
		goto retry_find;
	}
	VM_BUG_ON_FOLIO(!folio_contains(folio, index), folio);


I would assume that if !folio_contains(folio, index), either the folio 
got split in the meantime (filemap_get_folio() returned with a raised 
reference, though) or that file pagecache contained something wrong.
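
To make the "split in the meantime" case concrete, here is a minimal userspace
sketch of the range check. It is not kernel code: struct model_folio and every
model_* function are hypothetical stand-ins, and the split model assumes the
object that was looked up keeps only its first page afterwards. It also
ignores the raised reference mentioned above, which in the real kernel would
normally be expected to keep a split from succeeding.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct folio: only the fields needed here. */
struct model_folio {
	unsigned long index;	/* page-cache index of the first page */
	unsigned long nr_pages;	/* number of pages covered (1 << order) */
};

/*
 * Models the range check. The subtraction is unsigned, so an index below
 * folio->index wraps to a huge value and the comparison fails as well.
 */
static bool model_folio_contains(const struct model_folio *folio,
				 unsigned long index)
{
	return index - folio->index < folio->nr_pages;
}

/* Assumed split model: the original object shrinks to its first page. */
static void model_split_to_order0(struct model_folio *folio)
{
	folio->nr_pages = 1;
}

/* Lookup, hypothetical concurrent split, then the post-lock check. */
static bool model_lookup_then_split(void)
{
	struct model_folio folio = { .index = 4, .nr_pages = 4 }; /* 4..7 */
	unsigned long index = 6;

	assert(model_folio_contains(&folio, index));	/* at lookup time */
	model_split_to_order0(&folio);			/* before the lock */
	return model_folio_contains(&folio, index);	/* check after lock */
}
```

In this model model_lookup_then_split() returns false, which corresponds to
the condition that VM_BUG_ON_FOLIO() trips on. The other possibility, a wrong
entry in the pagecache, would correspond to the first assert() failing
already at lookup time.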


In __filemap_get_folio() we perform the same checks after locking the
folio (with FGP_LOCK), and weirdly enough it hasn't triggered there yet.

-- 
Cheers

David / dhildenb