Message-ID: <CAHk-=wiSHHSuSQsCCLOxQA+cbcvjmEeMsTCMWPT1sFVngd9-ig@mail.gmail.com>
Date: Tue, 10 Aug 2021 19:59:53 -1000
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: kernel test robot <oliver.sang@...el.com>
Cc: Johannes Weiner <hannes@...xchg.org>, Roman Gushchin <guro@...com>,
Michal Hocko <mhocko@...e.com>,
Shakeel Butt <shakeelb@...gle.com>,
Michal Koutný <mkoutny@...e.com>,
Balbir Singh <bsingharora@...il.com>,
Tejun Heo <tj@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
kernel test robot <lkp@...el.com>,
"Huang, Ying" <ying.huang@...el.com>,
Feng Tang <feng.tang@...el.com>,
Zhengjun Xing <zhengjun.xing@...ux.intel.com>
Subject: Re: [mm] 2d146aa3aa: vm-scalability.throughput -36.4% regression
On Tue, Aug 10, 2021 at 4:59 PM kernel test robot <oliver.sang@...el.com> wrote:
>
> FYI, we noticed a -36.4% regression of vm-scalability.throughput due to commit:
> 2d146aa3aa84 ("mm: memcontrol: switch to rstat")
Hmm. I guess some cost is to be expected, but that's a big regression.
I'm not sure what the code ends up doing, and how relevant this test
is, but Johannes - could you please take a look?
I can't make heads or tails of the profile; it kind of points at this:
> 2.77 ± 12% +27.4 30.19 ± 8% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
> 2.86 ± 12% +27.4 30.29 ± 8% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
> 2.77 ± 12% +27.4 30.21 ± 8% perf-profile.children.cycles-pp.lock_page_lruvec_irqsave
> 4.26 ± 10% +28.1 32.32 ± 7% perf-profile.children.cycles-pp.lru_cache_add
> 4.15 ± 10% +28.2 32.33 ± 7% perf-profile.children.cycles-pp.__pagevec_lru_add
and that seems to be from the chain __do_fault -> shmem_fault ->
shmem_getpage_gfp -> lru_cache_add -> __pagevec_lru_add ->
lock_page_lruvec_irqsave -> _raw_spin_lock_irqsave ->
native_queued_spin_lock_slowpath.
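To illustrate where that contention comes from, here is a stripped-down
sketch of the drain path (abridged from the real __pagevec_lru_add of
that era, not verbatim kernel code; details vary by kernel version):

	/*
	 * Every CPU draining a pagevec of freshly faulted shmem
	 * pages funnels through the per-lruvec spinlock, taken
	 * with interrupts disabled:
	 */
	void __pagevec_lru_add(struct pagevec *pvec)
	{
		struct lruvec *lruvec = NULL;
		unsigned long flags;
		int i;

		for (i = 0; i < pagevec_count(pvec); i++) {
			struct page *page = pvec->pages[i];

			/* re-lock only when this page belongs to a
			 * different lruvec than the previous one */
			lruvec = relock_page_lruvec_irqsave(page, lruvec,
							    &flags);
			add_page_to_lru_list(page, lruvec);
		}
		if (lruvec)
			unlock_page_lruvec_irqrestore(lruvec, flags);
	}

With all CPUs faulting pages into the same memcg/node lruvec, that one
lock is exactly what native_queued_spin_lock_slowpath shows contending on.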
That shmem_fault codepath being hot may make sense for some VM
scalability test. But it seems to make little sense when I look at the
commit that it bisected to.
We had another report of this commit causing a much more reasonable
small slowdown (-2.8%) back in May.
I'm not sure what's up with this new report. Johannes, does this make
sense to you?
Is it perhaps another "unlucky cache line placement" thing? Or have the
statistics changes simply changed the behavior of the test?
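If it is cache line placement, the usual suspect would be false sharing:
the commit grows and reorders structures, so a hot lock could now share a
cache line with a frequently written counter. A toy illustration (the
struct and field names here are made up, not the actual memcg layout):

	struct hot_fields {
		spinlock_t lock;	/* taken on every LRU add */
		atomic_long_t stat;	/* bumped on every stat update */
	};
	/* Both fields on one cache line: every stat update bounces
	 * the line away from whoever is spinning on the lock. */

	struct hot_fields_padded {
		spinlock_t lock ____cacheline_aligned_in_smp;
		atomic_long_t stat ____cacheline_aligned_in_smp;
	};
	/* Padding each hot field to its own line avoids the
	 * bouncing, at the cost of some wasted space. */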
Anybody?
Linus