linux-kernel - Re: [linus:master] [migrate_pages] 7e12beb8ca: vm-scalability.throughput -3.4% regression

Date:   Thu, 23 Mar 2023 09:53:23 +0800
From:   "Huang, Ying" <ying.huang@...el.com>
To:     "Liu, Yujie" <yujie.liu@...el.com>
Cc:     lkp <lkp@...el.com>, "bharata@....com" <bharata@....com>,
        "Yin, Fengwei" <fengwei.yin@...el.com>,
        "willy@...radead.org" <willy@...radead.org>,
        "mike.kravetz@...cle.com" <mike.kravetz@...cle.com>,
        "shy828301@...il.com" <shy828301@...il.com>,
        "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "xhao@...ux.alibaba.com" <xhao@...ux.alibaba.com>,
        "Tang, Feng" <feng.tang@...el.com>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "oe-lkp@...ts.linux.dev" <oe-lkp@...ts.linux.dev>,
        "ziy@...dia.com" <ziy@...dia.com>,
        "zhengjun.xing@...ux.intel.com" <zhengjun.xing@...ux.intel.com>,
        "osalvador@...e.de" <osalvador@...e.de>,
        "baolin.wang@...ux.alibaba.com" <baolin.wang@...ux.alibaba.com>,
        "minchan@...nel.org" <minchan@...nel.org>,
        "42.hyeyoo@...il.com" <42.hyeyoo@...il.com>,
        "apopple@...dia.com" <apopple@...dia.com>
Subject: Re: [linus:master] [migrate_pages] 7e12beb8ca:
 vm-scalability.throughput -3.4% regression

"Liu, Yujie" <yujie.liu@...el.com> writes:

> On Tue, 2023-03-21 at 13:43 +0800, Huang, Ying wrote:
>> "Liu, Yujie" <yujie.liu@...el.com> writes:
>>
>> > Hi Ying,
>> >
>> > On Mon, 2023-03-20 at 15:58 +0800, Huang, Ying wrote:
>> > > Hi, Yujie,
>> > >
>> > > kernel test robot <yujie.liu@...el.com> writes:
>> > >
>> > > > Hello,
>> > > >
>> > > > FYI, we noticed a -3.4% regression of vm-scalability.throughput due to commit:
>> > > >
>> > > > commit: 7e12beb8ca2ac98b2ec42e0ea4b76cdc93b58654 ("migrate_pages: batch flushing TLB")
>> > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>> > > >
>> > > > in testcase: vm-scalability
>> > > > on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz (Cascade Lake) with 128G memory
>> > > > with following parameters:
>> > > >
>> > > >         runtime: 300s
>> > > >         size: 512G
>> > > >         test: anon-cow-rand-mt
>> > > >         cpufreq_governor: performance
>> > > >
>> > > > test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
>> > > > test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
>> > > >
>> > > >
>> > > > If you fix the issue, kindly add following tag
>> > > > > Reported-by: kernel test robot <yujie.liu@...el.com>
>> > > > > Link: https://lore.kernel.org/oe-lkp/202303192325.ecbaf968-yujie.liu@intel.com
>> > > >
>> > >
>> > > Thanks a lot for the report!  Can you check whether the debug patch
>> > > below resolves the regression?
>> >
>> > We've tested the patch and found the throughput score was partially
>> > restored from -3.6% to -1.4%, still with a slight performance drop.
>> > Please check the detailed data as follows:
>>
>> Good!  Thanks for your detailed data!
>>
>> >       0.09 ± 17%      +1.2        1.32 ±  7%      +0.4        0.45 ± 21%  perf-profile.children.cycles-pp.flush_tlb_func
>>
>> It appears that we can reduce the unnecessary TLB flushing effectively
>> with the previous debug patch.  But the batched flush (full flush) is
>> still slower than the non-batched flush (flush one page).
>>
>> Can you try the debug patch as below to check whether it can restore the
>> regression completely?  The new debug patch can be applied on top of the
>> previous debug patch.
>
> The second debug patch got a -0.7% performance change. The data have
> some fluctuations from test to test, and the standard deviation is even
> a bit larger than 0.7%, which makes the performance score not very
> convincing. Please check other metrics to see if the regression is
> fully restored. Thanks.

Thanks for testing!

>       0.09 ± 17%      +0.4        0.45 ± 21%      +0.0        0.09 ± 12%  perf-profile.children.cycles-pp.flush_tlb_func

From the profiling data, the TLB flushing overhead (flush_tlb_func) is
back to the baseline level.  So I think the remaining 0.7% regression is
at noise level.  I will prepare the fix patch based on the test results.
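
For reference, the "at noise level" reasoning can be sketched roughly as
follows: compare the relative change in mean throughput with the
run-to-run relative standard deviation.  This is only a minimal sketch,
not how the LKP tooling computes its report; the function name
within_noise() and the sample numbers are hypothetical and are not taken
from the test results.

```python
import statistics


def within_noise(base_runs, patched_runs):
    """Compare the mean change against the run-to-run variation.

    Returns (change_pct, noise_pct, is_noise), where change_pct is the
    relative change of the mean in percent, noise_pct is the larger of
    the two relative standard deviations in percent, and is_noise is
    True when |change_pct| does not exceed noise_pct.
    """
    base_mean = statistics.mean(base_runs)
    patched_mean = statistics.mean(patched_runs)
    change_pct = (patched_mean - base_mean) / base_mean * 100.0

    # Relative sample standard deviation of each kernel's runs, in percent.
    base_rsd = statistics.stdev(base_runs) / base_mean * 100.0
    patched_rsd = statistics.stdev(patched_runs) / patched_mean * 100.0
    noise_pct = max(base_rsd, patched_rsd)

    return change_pct, noise_pct, abs(change_pct) <= noise_pct


if __name__ == "__main__":
    # Hypothetical per-run throughput scores (arbitrary units).
    base = [100.0, 102.0, 98.0]
    patched = [99.0, 101.0, 98.0]
    change, noise, is_noise = within_noise(base, patched)
    print(f"change {change:+.2f}%, noise {noise:.2f}%, within noise: {is_noise}")
```

A simple threshold like this is only a first check; with few runs, the
per-function profile data (such as flush_tlb_func above) is the more
convincing signal.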

Best Regards,
Huang, Ying