lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <cf22aeef-0160-46f8-b2e3-d308ccee0504@arm.com>
Date: Tue, 27 May 2025 13:36:24 +0530
From: Dev Jain <dev.jain@....com>
To: David Hildenbrand <david@...hat.com>, Shivank Garg <shivankg@....com>,
 akpm@...ux-foundation.org, linux-mm@...ck.org, linux-kernel@...r.kernel.org
Cc: ziy@...dia.com, baolin.wang@...ux.alibaba.com,
 lorenzo.stoakes@...cle.com, Liam.Howlett@...cle.com, npache@...hat.com,
 ryan.roberts@....com, fengwei.yin@...el.com, bharata@....com,
 syzbot+2b99589e33edbe9475ca@...kaller.appspotmail.com
Subject: Re: [PATCH V3 1/2] mm/khugepaged: fix race with folio split/free
 using temporary reference


On 27/05/25 1:18 pm, David Hildenbrand wrote:
> On 27.05.25 05:20, Dev Jain wrote:
>>
>> On 26/05/25 11:58 pm, Shivank Garg wrote:
>>> hpage_collapse_scan_file() calls is_refcount_suitable(), which in turn
>>> calls folio_mapcount(). folio_mapcount() checks folio_test_large() 
>>> before
>>> proceeding to folio_large_mapcount(), but there is a race window 
>>> where the
>>> folio may get split/freed between these checks, triggering:
>>>
>>>     VM_WARN_ON_FOLIO(!folio_test_large(folio), folio)
>>>
>>> Take a temporary reference to the folio in hpage_collapse_scan_file().
>>> This stabilizes the folio during refcount check and prevents incorrect
>>> large folio detection due to concurrent split/free. Use helper
>>> folio_expected_ref_count() + 1 to compare with folio_ref_count()
>>> instead of using is_refcount_suitable().
>>>
>>> Fixes: 05c5323b2a34 ("mm: track mapcount of large folios in single 
>>> value")
>>> Reported-by: syzbot+2b99589e33edbe9475ca@...kaller.appspotmail.com
>>> Closes: 
>>> https://lore.kernel.org/all/6828470d.a70a0220.38f255.000c.GAE@google.com 
>>>
>>> Suggested-by: David Hildenbrand <david@...hat.com>
>>> Acked-by: David Hildenbrand <david@...hat.com>
>>> Signed-off-by: Shivank Garg <shivankg@....com>
>>> ---
>>
>> The patch looks fine.
>>
>> I was just wondering about the implications of this on migration. 
>> Earlier
>> we had a refcount race between migration and shmem page fault via 
>> filemap_get_entry()
>> taking a reference and not releasing it till we take the folio lock, 
>> which was held
>> by the migration path. I would like to *think* that real workloads, 
>> when migrating
>> pages, will *not* be faulting on those pages simultaneously, just 
>> guessing. But now
>> we have a kernel thread (khugepaged) racing against migration. I may 
>> just be over-speculating.
>
> I'm not quite sure I understand the concern you have. Any temporary 
> reference can temporarily block migration, however, the retry logic 
> should be able to handle that just fine -- and this code is not really 
> special (see filemap_get_entry()).


You are correct that any temp ref can block migration, however, that 
reference has to come after the folios have been isolated in the 
migration path.

So the probability of someone taking a reference on the folio is quite 
low since it has been isolated. The problem with filemap_get_entry() is 
that it finds

the folio in the pagecache, so isolation is useless, then takes a 
reference and then shmem_get_folio_gfp() does a folio_lock() instead of 
folio_try_lock().

This was the race which I talked about an year back at [1]. My concern 
is that we are adding another candidate to that race; just wondering if 
there is

a better solution to fix the race mentioned in Shivank's patchset.


[1] https://lore.kernel.org/all/20240801081657.1386743-1-dev.jain@arm.com/


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ