linux-kernel - Re: [RFC V2] mm: add the zero case to page[1].compound_nr in set_compound

Message-ID: <fb2fb91c-1e54-a4ee-bf69-299e9114ae1e@oracle.com>
Date:   Tue, 13 Dec 2022 22:38:15 -0800
From:   Sidhartha Kumar <sidhartha.kumar@...cle.com>
To:     Mike Kravetz <mike.kravetz@...cle.com>,
        Nico Pache <npache@...hat.com>
Cc:     linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        muchun.song@...ux.dev, akpm@...ux-foundation.org,
        willy@...radead.org, gerald.schaefer@...ux.ibm.com
Subject: Re: [RFC V2] mm: add the zero case to page[1].compound_nr in
 set_compound_order

On 12/13/22 5:02 PM, Mike Kravetz wrote:
> On 12/13/22 17:27, Nico Pache wrote:
>> According to the document linked the following approach is even faster
>> than the one I used due to CPU parallelization:
> 
> I do not think we are very concerned with speed here.  This routine is being
> called in the creation of compound pages, and in the case of hugetlb the
> tear down of gigantic pages.  In general, creation and tear down of gigantic
> pages happens infrequently.  Usually only at system/application startup and
> system/application shutdown.
> 
Hi Nico,

I wrote a bpftrace script to track the time spent in
__prep_compound_gigantic_folio both with and without the branch in
folio_set_order(), and the resulting histogram was the same for both
versions. This is probably because the for loop over every base page
has a much higher overhead than the single call to folio_set_order().
I am not sure what the performance difference for THP would be.

@prep_nsecs:
[1M, 2M) 
50|@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|


Below is the script.

Thanks,
Sidhartha Kumar

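// Record the entry timestamp, keyed by thread ID.
kprobe:__prep_compound_gigantic_folio
{
         @prep_start[tid] = nsecs;
}

// On return, add the elapsed nanoseconds to the histogram.
// The filter skips returns whose entry was never recorded.
kretprobe:__prep_compound_gigantic_folio
/@prep_start[tid]/
{
         @prep_nsecs = hist(nsecs - @prep_start[tid]);
         delete(@prep_start[tid]);
}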
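
For reference, the branch being measured can be sketched in user-space C.
This is a simplified model under stated assumptions, not the kernel source:
struct fake_folio, its field names, and fake_folio_set_order() are
hypothetical stand-ins for the folio fields discussed in this thread.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the two per-folio fields discussed here. */
struct fake_folio {
    unsigned char order;      /* log2 of the number of base pages */
    unsigned long nr_pages;   /* number of base pages, 0 when torn down */
};

/*
 * Model of the "zero case": for order 0 the page count is cleared
 * rather than set to 1UL << 0 == 1. The ternary is the single branch
 * whose cost the histogram above could not distinguish.
 */
static void fake_folio_set_order(struct fake_folio *f, unsigned int order)
{
    /* Guard against an undefined shift; the bound is an assumption. */
    assert(order < sizeof(unsigned long) * CHAR_BIT);

    f->order = (unsigned char)order;
    f->nr_pages = order ? 1UL << order : 0;
}
```

Calling fake_folio_set_order(&f, 18) models a 1 GiB page built from 4 KiB
base pages (an assumed configuration) and yields nr_pages == 262144. Since
bpftrace histogram buckets are powers of two, [1M, 2M) nanoseconds spread
over that many base pages is about 4 to 8 ns per page, so one extra branch
per folio would be expected to disappear in the noise.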

> I think the only case where we 'might' be concerned with speed is in the
> creation of compound pages for THP.  Do note that this code path is
> still using set_compound_order as it has not been converted to folios.