Message-ID: <d945497d-0edb-f540-33e1-8b1ba1e20f62@linux.intel.com>
Date: Fri, 21 Aug 2020 16:39:11 +0800
From: Xing Zhengjun <zhengjun.xing@...ux.intel.com>
To: Mike Kravetz <mike.kravetz@...cle.com>,
kernel test robot <rong.a.chen@...el.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Michal Hocko <mhocko@...nel.org>,
Hugh Dickins <hughd@...gle.com>,
Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
Andrea Arcangeli <aarcange@...hat.com>,
"Kirill A.Shutemov" <kirill.shutemov@...ux.intel.com>,
Davidlohr Bueso <dave@...olabs.net>,
Prakash Sangappa <prakash.sangappa@...cle.com>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org
Subject: Re: [LKP] Re: [hugetlbfs] c0d0381ade: vm-scalability.throughput -33.4% regression
On 6/26/2020 5:33 AM, Mike Kravetz wrote:
> On 6/22/20 3:01 PM, Mike Kravetz wrote:
>> On 6/21/20 5:55 PM, kernel test robot wrote:
>>> Greeting,
>>>
>>> FYI, we noticed a -33.4% regression of vm-scalability.throughput due to commit:
>>>
>>>
>>> commit: c0d0381ade79885c04a04c303284b040616b116e ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization")
>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>>>
>>> in testcase: vm-scalability
>>> on test machine: 288 threads Intel(R) Xeon Phi(TM) CPU 7295 @ 1.50GHz with 80G memory
>>> with following parameters:
>>>
>>> runtime: 300s
>>> size: 8T
>>> test: anon-cow-seq-hugetlb
>>> cpufreq_governor: performance
>>> ucode: 0x11
>>>
>>
>> Some performance regression is not surprising as the change includes acquiring
>> and holding the i_mmap_rwsem (in read mode) during hugetlb page faults. 33.4%
>> seems a bit high. But, the test is primarily exercising the hugetlb page
>> fault path and little else.
>>
>> The reason for taking the i_mmap_rwsem is to prevent PMD unsharing from
>> invalidating the pmd we are operating on. This specific test case is operating
>> on anonymous private mappings. So, PMD sharing is not possible and we can
>> eliminate acquiring the mutex in this case. In fact, we should check all
>> mappings (even sharable) for the possibility of PMD sharing and only take the
>> mutex if necessary. It will make the code a bit uglier, but will take care
>> of some of these regressions. We still need to take the mutex in the case
>> of PMD sharing. I'm afraid a regression is unavoidable in that case.
>>
>> I'll put together a patch.
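
(For readers following along: below is a rough userspace model of the idea
above, skipping the rwsem when the mapping cannot take part in PMD sharing.
The flag value, the struct layout, and this vma_shareable() check are
illustrative stand-ins, not the kernel implementation.)

/*
 * Userspace model of the conditional locking described above: only take
 * the i_mmap_rwsem (modeled with a pthread rwlock) when the mapping
 * could actually take part in PMD sharing.  An anonymous private
 * mapping never can, so the fault path skips the lock entirely.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

#define VM_MAYSHARE 0x1UL          /* stand-in for the kernel's flag */
#define PUD_SIZE    (1UL << 30)    /* PMD-sharing granularity on x86_64 */

struct vma {
	unsigned long start, end;       /* mapping range */
	unsigned long flags;
	pthread_rwlock_t *i_mmap_rwsem; /* per-mapping lock, shared with the unshare path */
};

/* PMD sharing requires a shared mapping large enough to span a PUD. */
static bool vma_shareable(const struct vma *vma)
{
	if (!(vma->flags & VM_MAYSHARE))
		return false;
	return (vma->end - vma->start) >= PUD_SIZE;
}

static void hugetlb_fault(struct vma *vma, unsigned long addr)
{
	bool need_lock = vma_shareable(vma);

	/* Hold the lock in read mode so the pmd cannot be unshared under us. */
	if (need_lock)
		pthread_rwlock_rdlock(vma->i_mmap_rwsem);

	printf("fault at %#lx: i_mmap_rwsem %s\n", addr,
	       need_lock ? "held (sharing possible)" : "skipped (no PMD sharing)");

	if (need_lock)
		pthread_rwlock_unlock(vma->i_mmap_rwsem);
}

int main(void)
{
	pthread_rwlock_t sem = PTHREAD_RWLOCK_INITIALIZER;
	struct vma anon_private = { 0, PUD_SIZE, 0, &sem };
	struct vma shared_file  = { 0, PUD_SIZE, VM_MAYSHARE, &sem };

	hugetlb_fault(&anon_private, 0x1000);  /* anon-cow-seq-hugetlb case: no lock */
	hugetlb_fault(&shared_file, 0x1000);   /* shared mapping: lock still taken */
	return 0;
}
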
>
> Not acquiring the mutex on faults when sharing is not possible is quite
> straightforward. We can even use the existing routine vma_shareable()
> to easily check. However, the next patch in the series 87bf91d39bb5
> "hugetlbfs: Use i_mmap_rwsem to address page fault/truncate race" depends
> on always acquiring the mutex. If we break this assumption, then the
> code to back out hugetlb reservations needs to be written. A high level
> view of what needs to be done is in the commit message for 87bf91d39bb5.
>
> I'm working on the code to back out reservations.
>
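
(A minimal, purely illustrative model of the back-out step mentioned above;
the names and logic are assumptions, not the kernel code. The point is that
a fault which ran without i_mmap_rwsem may lose the race with truncation and
then has to return the reservation it consumed.)

#include <stdio.h>

static long i_size = 8;      /* file size, in huge pages */
static long resv_count = 8;  /* reservations still available */

static void hugetlb_fault_unlocked(long pgoff)
{
	resv_count--;            /* reservation is consumed before the race is visible */

	if (pgoff >= i_size) {
		/* Lost the race with truncate: undo the reservation. */
		resv_count++;
		printf("fault at page %ld backed out, resv_count=%ld\n",
		       pgoff, resv_count);
		return;
	}
	printf("fault at page %ld completed, resv_count=%ld\n",
	       pgoff, resv_count);
}

int main(void)
{
	i_size = 4;                  /* concurrent truncate shrank the file */
	hugetlb_fault_unlocked(6);   /* beyond the new EOF: must back out */
	hugetlb_fault_unlocked(2);   /* still within the file: proceeds */
	return 0;
}
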
I find that commit 34ae204f18519f0920bd50a644abd6fefc8dbfcf ("hugetlbfs:
remove call to huge_pte_alloc without i_mmap_rwsem") fixes this regression.
I tested with that patch applied and the regression is reduced to 10.1%.
Do you have plans to improve it further? Thanks.
=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/runtime/size/test/cpufreq_governor/ucode:
lkp-knm01/vm-scalability/debian-x86_64-20191114.cgz/x86_64-rhel-7.6/gcc-7/300s/8T/anon-cow-seq-hugetlb/performance/0x11
commit:
49aef7175cc6eb703a9280a7b830e675fe8f2704
c0d0381ade79885c04a04c303284b040616b116e
v5.8
34ae204f18519f0920bd50a644abd6fefc8dbfcf
v5.9-rc1
49aef7175cc6eb70 c0d0381ade79885c04a04c30328                        v5.8 34ae204f18519f0920bd50a644a                    v5.9-rc1
---------------- --------------------------- --------------------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \          |                \          |                \
     38084           -31.1%      26231 ±  2%     -26.6%      27944 ±  5%      -7.0%      35405            -7.5%      35244        vm-scalability.median
      9.92 ±  9%     +12.0       21.95 ±  4%      +3.9       13.87 ± 30%      -5.3        4.66 ±  9%      -6.6        3.36 ±  7%  vm-scalability.median_stddev%
  12827311           -35.0%    8340256 ±  2%     -30.9%    8865669 ±  5%     -10.1%   11532087           -10.2%   11513595 ±  2%  vm-scalability.throughput
 2.507e+09           -22.7%   1.938e+09          -15.3%   2.122e+09 ±  6%      +8.0%  2.707e+09           +8.0%  2.707e+09 ±  2%  vm-scalability.workload
--
Zhengjun Xing