lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Sat, 3 Apr 2021 17:00:42 -0700
From:   Mike Kravetz <mike.kravetz@...cle.com>
To:     Muchun Song <songmuchun@...edance.com>
Cc:     Oscar Salvador <osalvador@...e.de>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linux Memory Management List <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Roman Gushchin <guro@...com>, Michal Hocko <mhocko@...e.com>,
        Shakeel Butt <shakeelb@...gle.com>,
        David Hildenbrand <david@...hat.com>,
        David Rientjes <rientjes@...gle.com>,
        Miaohe Lin <linmiaohe@...wei.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Matthew Wilcox <willy@...radead.org>,
        HORIGUCHI NAOYA <naoya.horiguchi@....com>,
        "Aneesh Kumar K . V" <aneesh.kumar@...ux.ibm.com>,
        Waiman Long <longman@...hat.com>, Peter Xu <peterx@...hat.com>,
        Mina Almasry <almasrymina@...gle.com>,
        Hillf Danton <hdanton@...a.com>,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        Barry Song <song.bao.hua@...ilicon.com>,
        Will Deacon <will@...nel.org>
Subject: Re: [External] [PATCH v3 7/8] hugetlb: make free_huge_page irq safe

On 4/2/21 10:59 PM, Muchun Song wrote:
> On Sat, Apr 3, 2021 at 4:56 AM Mike Kravetz <mike.kravetz@...cle.com> wrote:
>>
>> On 4/2/21 5:47 AM, Muchun Song wrote:
>>> On Wed, Mar 31, 2021 at 11:42 AM Mike Kravetz <mike.kravetz@...cle.com> wrote:
>>>>
>>>> Commit c77c0a8ac4c5 ("mm/hugetlb: defer freeing of huge pages if in
>>>> non-task context") was added to address the issue of free_huge_page
>>>> being called from irq context.  That commit hands off free_huge_page
>>>> processing to a workqueue if !in_task.  However, this doesn't cover
>>>> all the cases as pointed out by 0day bot lockdep report [1].
>>>>
>>>> :  Possible interrupt unsafe locking scenario:
>>>> :
>>>> :        CPU0                    CPU1
>>>> :        ----                    ----
>>>> :   lock(hugetlb_lock);
>>>> :                                local_irq_disable();
>>>> :                                lock(slock-AF_INET);
>>>> :                                lock(hugetlb_lock);
>>>> :   <Interrupt>
>>>> :     lock(slock-AF_INET);
>>>>
>>>> Shakeel has later explained that this is very likely TCP TX zerocopy
>>>> from hugetlb pages scenario when the networking code drops a last
>>>> reference to hugetlb page while having IRQ disabled. Hugetlb freeing
>>>> path doesn't disable IRQ while holding hugetlb_lock so a lock dependency
>>>> chain can lead to a deadlock.
>>>>
>>>> This commit addresses the issue by doing the following:
>>>> - Make hugetlb_lock irq safe.  This is mostly a simple process of
>>>>   changing spin_*lock calls to spin_*lock_irq* calls.
>>>> - Make subpool lock irq safe in a similar manner.
>>>> - Revert the !in_task check and workqueue handoff.
>>>>
>>>> [1] https://lore.kernel.org/linux-mm/000000000000f1c03b05bc43aadc@google.com/
>>>>
>>>> Signed-off-by: Mike Kravetz <mike.kravetz@...cle.com>
>>>> Acked-by: Michal Hocko <mhocko@...e.com>
>>>> Reviewed-by: Muchun Song <songmuchun@...edance.com>
>>>
>>> Hi Mike,
>>>
>>> Today I pulled the newest code (next-20210401). I found that
>>> alloc_and_dissolve_huge_page is not updated. In this function,
>>> hugetlb_lock is still non-irq safe. Maybe you or Oscar need
>>> to fix.
>>>
>>> Thanks.
>>
>> Thank you Muchun,
>>
>> Oscar's changes were not in Andrew's tree when I started on this series
>> and I failed to notice their inclusion.  In addition,
>> isolate_or_dissolve_huge_page also needs updating as well as a change in
>> set_max_huge_pages that was omitted while rebasing.
>>
>> Andrew, the following patch addresses those missing changes.  Ideally,
>> the changes should be combined/included in this patch.  If you want me
>> to sent another version of this patch or another series, let me know.
>>
>> From 450593eb3cea895f499ddc343c22424c552ea502 Mon Sep 17 00:00:00 2001
>> From: Mike Kravetz <mike.kravetz@...cle.com>
>> Date: Fri, 2 Apr 2021 13:18:13 -0700
>> Subject: [PATCH] hugetlb: fix irq locking omissions
>>
>> The pach "hugetlb: make free_huge_page irq safe" changed spin_*lock
>> calls to spin_*lock_irq* calls.  However, it missed several places
>> in the file hugetlb.c.  Add the overlooked changes.
>>
>> Signed-off-by: Mike Kravetz <mike.kravetz@...cle.com>
> 
> Thanks MIke. But there are still some places that need
> improvement. See below.
> 

Correct.  My apologies again for not fully taking into account the new
code from Oscar's series when working on this.

>> ---
>>  mm/hugetlb.c | 16 ++++++++--------
>>  1 file changed, 8 insertions(+), 8 deletions(-)
>>
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index c22111f3da20..a6bfc6bcbc81 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -2284,7 +2284,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
>>          */
>>         page_ref_dec(new_page);
>>  retry:
>> -       spin_lock(&hugetlb_lock);
>> +       spin_lock_irq(&hugetlb_lock);
>>         if (!PageHuge(old_page)) {
>>                 /*
>>                  * Freed from under us. Drop new_page too.
>> @@ -2297,7 +2297,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
>>                  * Fail with -EBUSY if not possible.
>>                  */
>>                 update_and_free_page(h, new_page);
> 
> Now update_and_free_page can be called without holding
> hugetlb_lock. We can move it out of hugetlb_lock. In this
> function, there are 3 places which call update_and_free_page().
> We can move all of them out of hugetlb_lock. Right?

We will need to do more than that.
The call to update_and_free_page in alloc_and_dissolve_huge_page
assumes old functionality.  i.e. It assumes h->nr_huge_pages will be
decremented in update_and_free_page.  This is no longer the case.

This will need to be fixed in patch 4 of my series which changes the
functionality of update_and_free_page.  I'm afraid a change there will
end up requiring changes in subsequent patches due to context.

I will have an update on Monday.
-- 
Mike Kravetz

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ