Date:   Tue, 9 May 2023 18:34:35 -0400
From:   "Liam R. Howlett" <Liam.Howlett@...cle.com>
To:     Yin Fengwei <fengwei.yin@...el.com>
Cc:     kernel test robot <yujie.liu@...el.com>, oe-lkp@...ts.linux.dev,
        lkp@...el.com, linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Yu Zhao <yuzhao@...gle.com>, linux-mm@...ck.org,
        ying.huang@...el.com, feng.tang@...el.com
Subject: Re: [linus:master] [mm/mmap] 28c5609fb2: aim9.page_test.ops_per_sec
 -10.8% regression

* Yin Fengwei <fengwei.yin@...el.com> [230509 02:56]:
> Hi Liam,
> 
> On 5/6/23 14:20, kernel test robot wrote:
> > Hello,
> > 
> > kernel test robot noticed a -10.8% regression of aim9.page_test.ops_per_sec on:
> > 
> > commit: 28c5609fb236807910ca347ad3e26c4567998526 ("mm/mmap: preallocate maple nodes for brk vma expansion")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > 
> > testcase: aim9
> > test machine: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 112G memory
> > parameters:
> > 
> > 	testtime: 5s
> > 	test: all
> > 	cpufreq_governor: performance
> > 
> > If you fix the issue, kindly add following tag
> > | Reported-by: kernel test robot <yujie.liu@...el.com>
> > | Link: https://lore.kernel.org/oe-lkp/202305061457.ac15990c-yujie.liu@intel.com
> > 
> 
> Some related findings:
>    eBPF funclatency tool says the latency of function do_brk_flags() doubles
>    with the patch 28c5609fb2.
> 
>    With the patch 28c5609fb2, the mas_alloc_nodes() is called much more than
>    without the patch.

Thank you for the insight into this test.

Right, so this patch adds the call to preallocate nodes for the worst
case possible.  That certainly explains why you see so many more calls
to allocate nodes - it was meant to do just that.

> 
>    In my local debugging env, I can see around 17009999 calls to
>    mas_alloc_nodes(). The number is zero without the patch 28c5609fb2.
> So we are fairly sure the regression is connected to the patch.
> 
> 
> The page_test of AIM9 does the following work in a single thread:
>         newbrk = sbrk(1024 * 1024);     /* move up 1 megabyte */
>         while (true) {                  /* while not done */
>                 newbrk = sbrk(-4096 * 16);      /* deallocate some space */
>                 for (i = 0; i < 16; i++) {      /* now get it back in pieces */
>                         newbrk = sbrk(4096);    /* get pointer to new space */
>                 }
>         }
> 
> Is it possible that the sbrk pattern triggers the corner case? Thanks.

I appreciate the analysis and the pointer to the allocation code.  This
has shown up somewhere else and I'm working on reducing the
preallocations.  This regression seems to be hidden, sometimes at least,
by the kmem_cache.

Regards,
Liam
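
For anyone who wants to exercise the same pattern outside AIM9, the quoted
page_test loop can be sketched as a small standalone helper. This is only an
approximation of the benchmark: the function name brk_pattern(), the bounded
round count, the error handling, and the final release of the megabyte are
assumptions added for illustration and are not taken from the AIM9 source.
Each round makes one shrinking sbrk() call followed by 16 one-page expansions,
and the expansions are the calls that would reach the brk expansion path
discussed above.

```c
#define _DEFAULT_SOURCE            /* expose sbrk() on glibc */
#include <unistd.h>

#define PAGE_BYTES      4096       /* page size used by the quoted loop */
#define PAGES_PER_ROUND 16         /* pages released and regrown per round */

/*
 * Hypothetical helper: repeat the shrink/regrow cycle `rounds` times.
 * Returns the number of sbrk() calls made, or -1 if any sbrk() fails.
 */
long brk_pattern(long rounds)
{
    void *const failed = (void *)-1;
    long calls = 0;

    if (sbrk(1024 * 1024) == failed)            /* move up 1 megabyte */
        return -1;
    calls++;

    for (long r = 0; r < rounds; r++) {
        /* deallocate 16 pages in one call */
        if (sbrk(-PAGE_BYTES * PAGES_PER_ROUND) == failed)
            return -1;
        calls++;

        /* get them back one page at a time: 16 brk expansions */
        for (int i = 0; i < PAGES_PER_ROUND; i++) {
            if (sbrk(PAGE_BYTES) == failed)
                return -1;
            calls++;
        }
    }

    if (sbrk(-1024 * 1024) == failed)           /* release the megabyte */
        return -1;
    calls++;

    return calls;                               /* 2 + 17 * rounds */
}
```

Calling brk_pattern() from a small main() with a large round count and timing
it on kernels with and without 28c5609fb2 should give a rough view of the
per-call cost, though it will not reproduce the aim9 ops_per_sec figure itself.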
