Message-ID: <492b58a8-ff4a-4afe-b317-6fd1bafc874e@igalia.com>
Date: Thu, 17 Apr 2025 20:38:30 +0800
From: Gavin Guo <gavinguo@...lia.com>
To: Zi Yan <ziy@...dia.com>
Cc: David Hildenbrand <david@...hat.com>, Hugh Dickins <hughd@...gle.com>,
 linux-mm@...ck.org, akpm@...ux-foundation.org, willy@...radead.org,
 linmiaohe@...wei.com, revest@...gle.com, kernel-dev@...lia.com,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/huge_memory: fix dereferencing invalid pmd migration
 entry

On 4/17/25 20:10, Zi Yan wrote:
> On 17 Apr 2025, at 8:02, Gavin Guo wrote:
> 
>> On 4/17/25 19:32, Zi Yan wrote:
>>> On 17 Apr 2025, at 7:21, Gavin Guo wrote:
>>>
>>>> On 4/17/25 17:04, David Hildenbrand wrote:
>>>>> On 17.04.25 10:55, Hugh Dickins wrote:
>>>>>> On Thu, 17 Apr 2025, David Hildenbrand wrote:
>>>>>>> On 17.04.25 09:18, David Hildenbrand wrote:
>>>>>>>> On 17.04.25 07:36, Hugh Dickins wrote:
>>>>>>>>> On Wed, 16 Apr 2025, David Hildenbrand wrote:
>>>>>>>>>>
>>>>>>>>>> Why not something like
>>>>>>>>>>
>>>>>>>>>> struct folio *entry_folio;
>>>>>>>>>>
>>>>>>>>>> if (folio) {
>>>>>>>>>>     if (is_pmd_migration_entry(*pmd))
>>>>>>>>>>         entry_folio = pfn_swap_entry_folio(pmd_to_swp_entry(*pmd));
>>>>>>>>>>     else
>>>>>>>>>>         entry_folio = pmd_folio(*pmd);
>>>>>>>>>>
>>>>>>>>>>     if (folio != entry_folio)
>>>>>>>>>>           return;
>>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> My own preference is to not add unnecessary code:
>>>>>>>>> if folio and pmd_migration entry, we're not interested in entry_folio.
>>>>>>>>> But yes it could be written in lots of other ways.
>>>>>>>>
>>>>>>>> While I don't disagree about "not adding unnecessary code" in general,
>>>>>>>> in this particular case just looking the folio up properly might be the
>>>>>>>> better alternative to reasoning about locking rules with conditional
>>>>>>>> input parameters :)
>>>>>>>>
>>>>>>>
>>>>>>> FWIW, I was wondering if we can rework that code, letting the caller do the
>>>>>>> checking and getting rid of the folio parameter. Something like this
>>>>>>> (incomplete, just to discuss if we could move the TTU_SPLIT_HUGE_PMD
>>>>>>> handling).
>>>>>>
>>>>>> Yes, I too dislike the folio parameter used for a single case, and agree
>>>>>> it's better for the caller who chose pmd to check that *pmd fits the folio.
>>>>>>
>>>>>> I haven't checked your code below, but it looks like a much better way
>>>>>> to proceed, using the page_vma_mapped_walk() to get pmd lock and check;
>>>>>> and cutting out two or more layers of split_huge_pmd obscurity.
>>>>>>
>>>>>> Way to go.  However... what we want right now is a fix that can easily
>>>>>> go to stable: the rearrangements here in 6.15-rc mean, I think, that
>>>>>> whatever goes into the current tree will have to be placed differently
>>>>>> for stable, no seamless backports; but Gavin's patch (reworked if you
>>>>>> insist) can be adapted to stable (differently for different releases)
>>>>>> more easily than the future direction you're proposing here.
>>>>>
>>>>> I'm fine with going with the current patch and looking into cleaning it up properly (if possible).
>>>>>
>>>>> So for this patch
>>>>>
>>>>> Acked-by: David Hildenbrand <david@...hat.com>
>>>>>
>>>>> @Gavin, can you look into cleaning that up?
>>>>
>>>> Thank you for your review. Before I begin the cleanup, could you please
>>>> confirm the following action items:
>>>>
>>>> Zi Yan's suggestions for the patch are:
>>>> 1. Replace "page fault" with "invalid address access" in the commit
>>>>      description.
>>>>
>>>> 2. Simplify the nested if-statements into a single if-statement to
>>>>      reduce indentation.
>>>
>>> 3. Can you please add Hugh’s explanation below in the commit log?
>>> That clarifies the issue. Thank you for the fix.
>>
>> Sure, will send out another patch for your review. Thank you for the review.
>>
> Thanks. Do you mind sharing the syzkaller reproducer if that is
> possible and easy? I am trying to understand more about the issue.

Sure, this is the reproducer:
https://drive.google.com/file/d/1eDBV6VfIzyqD9SeYGQBah-BJXO32Piy8/view

Steps to reproduce:
1). gcc -static -o repro ./repro.c -lpthread

2). ./repro

3). Find the group number in the output of the following command and use
it in place of 2539 in step 4:
sudo cat /sys/kernel/debug/shrinker/thp-deferred_split-12/count

4). Run the following command in multiple sessions
for i in $(seq 10000); do echo "2539 0 100" | sudo tee /sys/kernel/debug/shrinker/thp-deferred_split-12/scan ; done

Generally, the bug will be triggered within 5 minutes.
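
Steps 3 and 4 could be scripted roughly as below. This is a minimal
sketch, not part of the original reproducer. It assumes that each line of
the count file starts with the group number followed by the object count
for node 0, and that the group of interest is the first one with a
non-zero count. The function names, the "run" argument and the choice of
four parallel loops (standing in for "multiple sessions") are all
hypothetical. The shrinker directory name, including the -12 suffix, is
copied from the steps above and may differ on another machine or boot.

```shell
#!/bin/sh
# Assumption: directory name taken from the steps above; adjust as needed.
SHRINKER_DIR=${SHRINKER_DIR:-/sys/kernel/debug/shrinker/thp-deferred_split-12}

# Print the first group number whose node-0 object count is non-zero.
# Assumption: each line of the count file is "<group> <node0 count> ...".
pick_group() {
    awk '$2 > 0 { print $1; exit }' "$1"
}

# Write "<group> 0 100" to the scan file the given number of times,
# mirroring the echo in step 4.
scan_loop() {
    group=$1
    scan_file=$2
    iterations=$3
    i=0
    while [ "$i" -lt "$iterations" ]; do
        echo "$group 0 100" > "$scan_file"
        i=$((i + 1))
    done
}

# Only touch debugfs when explicitly asked to: "./scan.sh run" as root.
if [ "${1:-}" = "run" ]; then
    group=$(pick_group "$SHRINKER_DIR/count")
    if [ -z "$group" ]; then
        echo "no group with deferred-split objects found" >&2
        exit 1
    fi
    # Four background loops stand in for "multiple sessions".
    for session in 1 2 3 4; do
        scan_loop "$group" "$SHRINKER_DIR/scan" 10000 &
    done
    wait
fi
```

The script has to run as root, since it writes to the scan file directly
instead of going through sudo tee, and ./repro from step 2 must already
be running.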

> 
>>>
>>> “
>>> an anon_vma lookup points to a
>>> location which may contain the folio of interest, but might instead
>>> contain another folio: and weeding out those other folios is precisely
>>> what the "folio != pmd_folio(*pmd)" check (and the "risk of replacing
>>> the wrong folio" comment a few lines above it) is for.
>>> ”
>>>
>>> With that, Acked-by: Zi Yan <ziy@...dia.com>
>>>
>>>>
>>>> David, based on your comment, I understand that you are recommending the
>>>> entry_folio implementation. Also, from your discussion with Hugh, it
>>>> appears you agreed with my original approach of returning early when
>>>> encountering a PMD migration entry, thereby avoiding unnecessary checks.
>>>> Is that correct? If so, I will keep the current logic. Do you have any
>>>> additional cleanup suggestions?
>>>>
>>>> I will start the cleanup work after confirmation.
>>>>
>>>>>
>>>>>>
>>>>>> (Hmm, that may be another reason for preferring the reasoning by
>>>>>> folio lock: forgive me if I'm misremembering, but didn't those
>>>>>> page migration swapops get renamed, some time around 5.11?)
>>>>>
>>>>> I remember that we did something to PTE handling stuff in the context of PTE markers. But things keep changing all of the time .. :)
>>>>>
>>>
>>>
>>> Best Regards,
>>> Yan, Zi
> 
> 
> Best Regards,
> Yan, Zi

