linux-kernel - Re: [RFC PATCH 5/8] hugetlb: change free_pool_huge_page to remove_pool_huge

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <fd723ea8-da7c-bd59-8d8a-e506be1b3af5@oracle.com>
Date:   Mon, 22 Mar 2021 16:28:07 -0700
From:   Mike Kravetz <mike.kravetz@...cle.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Shakeel Butt <shakeelb@...gle.com>,
        Oscar Salvador <osalvador@...e.de>,
        David Hildenbrand <david@...hat.com>,
        Muchun Song <songmuchun@...edance.com>,
        David Rientjes <rientjes@...gle.com>,
        Miaohe Lin <linmiaohe@...wei.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Matthew Wilcox <willy@...radead.org>,
        HORIGUCHI NAOYA <naoya.horiguchi@....com>,
        "Aneesh Kumar K . V" <aneesh.kumar@...ux.ibm.com>,
        Waiman Long <longman@...hat.com>, Peter Xu <peterx@...hat.com>,
        Mina Almasry <almasrymina@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [RFC PATCH 5/8] hugetlb: change free_pool_huge_page to
 remove_pool_huge_page

On 3/22/21 7:31 AM, Michal Hocko wrote:
> On Fri 19-03-21 15:42:06, Mike Kravetz wrote:
> [...]
>> @@ -2090,9 +2084,15 @@ static void return_unused_surplus_pages(struct hstate *h,
>>  	while (nr_pages--) {
>>  		h->resv_huge_pages--;
>>  		unused_resv_pages--;
>> -		if (!free_pool_huge_page(h, &node_states[N_MEMORY], 1))
>> +		page = remove_pool_huge_page(h, &node_states[N_MEMORY], 1);
>> +		if (!page)
>>  			goto out;
>> -		cond_resched_lock(&hugetlb_lock);
>> +
>> +		/* Drop lock and free page to buddy as it could sleep */
>> +		spin_unlock(&hugetlb_lock);
>> +		update_and_free_page(h, page);
>> +		cond_resched();
>> +		spin_lock(&hugetlb_lock);
>>  	}
>>  
>>  out:
> 
> This is likely a matter of taste but the repeated pattern of unlock,
> update_and_free_page, cond_resched and lock seems rather clumsy.
> Would it be slightly better/nicer to remove_pool_huge_page into a
> list_head under a single lock invocation and then free up the whole lot
> after the lock is dropped?

Yes, we can certainly do that.
One downside I see is that the list can contain a bunch of pages not
accounted for in hugetlb and not free in buddy (or cma).  Ideally, we
would want to keep those in sync if possible.  Also, the commit that
added the cond_resched talked about freeing up 12 TB worth of huge pages
and it holding the lock for 150 seconds.  The new code is not holding
the lock while calling free to buddy, but I wonder how long it would
take to remove 12 TB worth of huge pages and add them to a separate list?

I do not know how realistic the 12 TB number is.  But, I certainly am
aware of pools that are a few TB in size.
-- 
Mike Kravetz