lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f9817bdc-2554-db50-9c72-e629ad3ea12b@huawei.com>
Date:   Wed, 24 Mar 2021 10:14:59 +0800
From:   Miaohe Lin <linmiaohe@...wei.com>
To:     Yang Shi <shy828301@...il.com>
CC:     Andrew Morton <akpm@...ux-foundation.org>,
        Jerome Glisse <jglisse@...hat.com>,
        Rafael Aquini <aquini@...hat.com>,
        David Hildenbrand <david@...hat.com>,
        Alistair Popple <apopple@...dia.com>,
        "Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>,
        Linux MM <linux-mm@...ck.org>
Subject: Re: [PATCH v2 5/5] mm/migrate.c: fix potential deadlock in NUMA
 balancing shared exec THP case

On 2021/3/24 9:16, Yang Shi wrote:
> On Tue, Mar 23, 2021 at 10:17 AM Yang Shi <shy828301@...il.com> wrote:
>>
>> On Tue, Mar 23, 2021 at 6:55 AM Miaohe Lin <linmiaohe@...wei.com> wrote:
>>>
>>> Since commit c77c5cbafe54 ("mm: migrate: skip shared exec THP for NUMA
>>> balancing"), the NUMA balancing would skip shared exec transhuge page.
>>> But this enhancement is not suitable for transhuge page. Because it's
>>> required that page_mapcount() must be 1 due to no migration pte dance
>>> is done here. On the other hand, the shared exec transhuge page will
>>> leave the migrate_misplaced_page() with pte entry untouched and page
>>> locked. Thus pagefault for NUMA will be triggered again and deadlock
>>> occurs when we start waiting for the page lock held by ourselves.
>>
>> Thanks for catching this. By relooking the code I think the other
>> important reason for removing this is
>> migrate_misplaced_transhuge_page() actually can't see shared exec file
>> THP at all since page_lock_anon_vma_read() is called before and if
>> page is not anonymous page it will just restore the PMD without
>> migrating anything.
>>
>> The pages for private mapped file vma may be anonymous pages due to
>> COW but they can't be THP so it won't trigger THP numa fault at all. I
>> think this is why no bug was reported. I overlooked this in the first
>> place.
>>
>> Your fix is correct, and please add the above justification to your commit log.
> 
> BTW, I think you can just undo or revert commit c77c5cbafe54 ("mm:
> migrate: skip shared exec THP for NUMA balancing").
> 

Yep, we can revert this commit. I thought it handle the shared exec base page too.
Will do it and with the above justification to the commit log.
Many Thanks!

> Thanks,
> Yang
> 
>>
>> Reviewed-by: Yang Shi <shy828301@...il.com>
>>
>>>
>>> Fixes: c77c5cbafe54 ("mm: migrate: skip shared exec THP for NUMA balancing")
>>> Signed-off-by: Miaohe Lin <linmiaohe@...wei.com>
>>> ---
>>>  mm/migrate.c | 4 ----
>>>  1 file changed, 4 deletions(-)
>>>
>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>> index 5357a8527ca2..68bfa1625898 100644
>>> --- a/mm/migrate.c
>>> +++ b/mm/migrate.c
>>> @@ -2192,9 +2192,6 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
>>>         int page_lru = page_is_file_lru(page);
>>>         unsigned long start = address & HPAGE_PMD_MASK;
>>>
>>> -       if (is_shared_exec_page(vma, page))
>>> -               goto out;
>>> -
>>>         new_page = alloc_pages_node(node,
>>>                 (GFP_TRANSHUGE_LIGHT | __GFP_THISNODE),
>>>                 HPAGE_PMD_ORDER);
>>> @@ -2306,7 +2303,6 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
>>>
>>>  out_unlock:
>>>         unlock_page(page);
>>> -out:
>>>         put_page(page);
>>>         return 0;
>>>  }
>>> --
>>> 2.19.1
>>>
> .
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ