lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <32a7c9a6-0844-4dd1-b092-badca305546a@meta.com>
Date: Wed, 28 Jan 2026 14:20:14 -0500
From: Chris Mason <clm@...a.com>
To: Kairui Song <ryncsn@...il.com>, linux-mm@...ck.org,
        Baolin Wang <baolin.wang@...ux.alibaba.com>
Cc: Hugh Dickins <hughd@...gle.com>,
        Andrew Morton
 <akpm@...ux-foundation.org>,
        Kemeng Shi <shikemeng@...weicloud.com>, Nhat Pham <nphamcs@...il.com>,
        Chris Li <chrisl@...nel.org>, Baoquan He <bhe@...hat.com>,
        Barry Song <baohua@...nel.org>, linux-kernel@...r.kernel.org,
        Kairui Song <kasong@...cent.com>, stable@...r.kernel.org
Subject: Re: [PATCH v3] mm/shmem, swap: fix race of truncate and swap entry
 split

On 1/28/26 11:52 AM, Kairui Song wrote:
> On Wed, Jan 28, 2026 at 05:02:34AM +0800, Chris Mason wrote:
>>
>>> diff --git a/mm/shmem.c b/mm/shmem.c
>>> --- a/mm/shmem.c
>>> +++ b/mm/shmem.c
>>
>> [ ... ]
>>
>>> @@ -1191,12 +1203,23 @@
>>>  		folio = fbatch.folios[i];
>>>
>>>  		if (xa_is_value(folio)) {
>>> +			int order;
>>>  			long swaps_freed;
>>>
>>>  			if (unfalloc)
>>>  				continue;
>>> -			swaps_freed = shmem_free_swap(mapping, indices[i], folio);
>>> +			swaps_freed = shmem_free_swap(mapping, indices[i],
>>> +						      end - 1, folio);
>>>  			if (!swaps_freed) {
>>> +				/*
>>> +				 * If found a large swap entry cross the end border,
>>> +				 * skip it as the truncate_inode_partial_folio above
>>> +				 * should have at least zerod its content once.
>>> +				 */
>>> +				order = shmem_confirm_swap(mapping, indices[i],
>>> +							   radix_to_swp_entry(folio));
>>> +				if (order > 0 && indices[i] + (1 << order) > end)
>>> +					continue;
>>>  				/* Swap was replaced by page: retry */
>>>  				index = indices[i];
>>>  				break;
>>
>> Could this code infinite loop if a large swap entry crosses the START
>> boundary instead of the END boundary?
> 
> Thanks for the review! It's really helpful :)
> 
>> When shmem_free_swap() returns 0, it could be either because:
>> (a) The entry was replaced by a page, or
>> (b) The entry failed the boundary check (base < index ||
>>     base + nr_pages - 1 > end)
>>
>> The boundary check here only handles case (b) for entries crossing the
>> END boundary:
>>
>>     if (order > 0 && indices[i] + (1 << order) > end)
>>         continue;
>>
>> But what happens if the entry crosses the START boundary? If
>> find_get_entries() returns a large swap entry at indices[i] where
>> the entry's base (calculated as indices[i] & ~((1 << order) - 1)) is
>> less than the truncation start point, then shmem_free_swap() will
>> return 0 due to the "base < index" check. The code will then call
>> shmem_confirm_swap(), get the order, check if it crosses the END
>> boundary (which it doesn't), and retry with the same index:
>>
>>     index = indices[i];
>>     break;
>>
>> The next iteration will find the same entry again at the same index,
>> leading to an infinite loop. For example:
>>
>> - Truncating range [18, 30]
>> - Large swap entry at [16, 23] (order 3, 8 pages)
>> - indices[i] = 18
>> - shmem_free_swap() sees base=16 < index=18, returns 0
>> - Check: 18 + 8 > 30 is false (26 <= 30)
>> - Retries with index=18
>> - Loop repeats indefinitely
> 
> I think this is a valid issue. And it's worse than that, during the `while (index < end)` loop a new large entry can land anywhere in the range, if one interaction's starting `index` points to the middle of any large entry, an infinite loop will occur: indices[0] are always equal to the `index` iteration value of that moments, shmem_free_swap will fail because the swap entry's index doesn't match indices[0], and so the `index = indices[i]; break;` keep it loop forever.
> 
> The chance seems very low though.
> 
>> Should the boundary check also handle the START case, perhaps:
>>
>>     if (order > 0) {
>>         pgoff_t base = indices[i] & ~((1UL << order) - 1);
>>         if (base + (1 << order) - 1 > end || base < start)
>>             continue;
>>     }
> 
> This still doesn't cover the case when a new large entry somehow lands in the range during the loop.

FWIW, I'd have been really surprised if claude actually fixed the
bug...I haven't stopped the prompt from making the suggestions just
because I think it helps explain the potential issue.

> 
>> where 'start' is preserved from before the loop?
> 
> How about following patch:
> 
> From 863f38c757ee0898b6b7f0f8c695f551a1380ce8 Mon Sep 17 00:00:00 2001
> From: Kairui Song <kasong@...cent.com>
> Date: Thu, 29 Jan 2026 00:19:23 +0800
> Subject: [PATCH] mm, shmem: prevent infinite loop on truncate race
> 

I ran this incremental through and it didn't flag any issues, thanks!

-chris


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ